
Answer-first summary for fast verification
Answer: Replace the model ID parameter with the ARN of the provisioned model that the CreateProvisionedModelThroughput API returns.
## Explanation

The correct answer is **B** because:

1. **Current Issue**: The application is using the standard model ID (`anthropic.claude-v2`), which routes requests to on-demand capacity, not provisioned throughput. This explains why CloudWatch shows provisioned capacity is unused while on-demand requests are being throttled.

2. **Provisioned Throughput Usage**: When you create provisioned throughput using `CreateProvisionedModelThroughput`, it returns a provisioned model ARN. To use the provisioned capacity, you must:
   - Use the provisioned model ARN instead of the base model ID
   - Call `invoke_model` with the provisioned model ARN as the `modelId` parameter

3. **Why Other Options Are Incorrect**:
   - **A**: Increasing MUs won't help because the application isn't using the provisioned throughput at all. The provisioned capacity is already unused.
   - **C**: Exponential backoff retry logic would help with throttling but doesn't address the root cause: the application isn't using the provisioned throughput it paid for.

4. **Solution Implementation**:

   ```python
   # Get provisioned model ARN from CreateProvisionedModelThroughput response
   provisioned_model_arn = "arn:aws:bedrock:region:account:provisioned-model/provisioned-model-id"

   # Use the provisioned model ARN instead of base model ID
   response = bedrock_runtime.invoke_model(modelId=provisioned_model_arn, body=json.dumps(payload))
   ```

This change ensures the application uses the provisioned throughput it purchased, resolving both the throttling issues and the inefficient use of resources.
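If the ARN from the original `CreateProvisionedModelThroughput` response was not stored, one possible approach is to look it up before invoking the model. The sketch below is a minimal illustration, not the application's actual code: the helper names `resolve_provisioned_model_arn` and `invoke_on_provisioned_capacity` and the resource name `my-provisioned-throughput` are hypothetical, and it assumes the boto3 `bedrock` client's `get_provisioned_model_throughput` call, which returns `provisionedModelArn` and `status` fields.

```python
import json


def resolve_provisioned_model_arn(bedrock, provisioned_model_name):
    """Look up the provisioned model ARN and confirm it is ready for traffic.

    `bedrock` is assumed to be a boto3 "bedrock" (control plane) client.
    """
    info = bedrock.get_provisioned_model_throughput(
        provisionedModelId=provisioned_model_name  # accepts the name or the ARN
    )
    if info["status"] != "InService":
        raise RuntimeError(
            f"Provisioned throughput is {info['status']}, not InService"
        )
    return info["provisionedModelArn"]


def invoke_on_provisioned_capacity(bedrock_runtime, provisioned_model_arn, payload):
    """Route the request to provisioned capacity by passing the ARN as modelId.

    `bedrock_runtime` is assumed to be a boto3 "bedrock-runtime" client.
    """
    response = bedrock_runtime.invoke_model(
        modelId=provisioned_model_arn,
        body=json.dumps(payload),
    )
    # The response body is a stream of JSON bytes.
    return json.loads(response["body"].read())


# Usage (requires boto3 and AWS credentials; the name below is hypothetical):
#   import boto3
#   bedrock = boto3.client("bedrock")
#   bedrock_runtime = boto3.client("bedrock-runtime")
#   arn = resolve_provisioned_model_arn(bedrock, "my-provisioned-throughput")
#   result = invoke_on_provisioned_capacity(bedrock_runtime, arn, payload)
```

Passing the clients in as arguments keeps the helpers easy to exercise with stubs, and checking the status first avoids sending requests to a provisioned model that is still being created.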
Author: Ducse Chen
A financial services company uses an AI application to process financial documents by using Amazon Bedrock. During business hours, the application handles approximately 10,000 requests each hour, which requires consistent throughput.
The company uses the CreateProvisionedModelThroughput API to purchase provisioned throughput. Amazon CloudWatch metrics show that the provisioned capacity is unused while on-demand requests are being throttled. The company finds the following code in the application:
response = bedrock_runtime.invoke_model(modelId="anthropic.claude-v2", body=json.dumps(payload))
The company needs the application to use the provisioned throughput and to resolve the throttling issues.
Which solution will meet these requirements?
A. Increase the number of model units (MUs) in the provisioned throughput configuration.
B. Replace the model ID parameter with the ARN of the provisioned model that the CreateProvisionedModelThroughput API returns.
C. Add exponential backoff retry logic to handle throttling exceptions during peak hours.