
A company has purchased provisioned throughput for the Anthropic Claude v2 model in Amazon Bedrock to ensure consistent performance during business hours. The application uses the InvokeModel API with the following code snippet:
response = bedrock_client.invoke_model(
    modelId="anthropic.claude-v2",
    body=json.dumps({
        "prompt": prompt,
        "max_tokens_to_sample": 300
    })
)
During peak hours, the application experiences throttling. CloudWatch metrics show that the provisioned throughput capacity is not being utilized. Which change should be made to ensure the application uses the purchased provisioned throughput?
A. Increase the provisioned throughput capacity to match peak demand.
B. Replace the modelId value with the provisioned throughput ARN.
C. Implement exponential backoff and retry logic in the application.
D. Modify the application to use the InvokeModelWithResponseStream API instead of the InvokeModel API.
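
For reference, Amazon Bedrock serves a request from Provisioned Throughput only when modelId is set to the provisioned model's ARN; a request that passes the base model ID is handled as on-demand traffic, which is consistent with the unused capacity described in the scenario. The following is a minimal sketch of such a call, not the company's actual code. The ARN is a made-up placeholder, and build_invoke_kwargs and invoke are hypothetical helper names introduced here for illustration; bedrock_client is assumed to be a boto3 "bedrock-runtime" client created elsewhere.

```python
import json

# Placeholder ARN for illustration only (assumption). The real value is the
# ARN of the Provisioned Throughput resource purchased in the account.
PROVISIONED_MODEL_ARN = (
    "arn:aws:bedrock:us-east-1:111122223333:provisioned-model/EXAMPLE"
)


def build_invoke_kwargs(prompt, model_id=PROVISIONED_MODEL_ARN):
    """Build InvokeModel arguments; only modelId differs from the snippet above."""
    return {
        "modelId": model_id,
        "body": json.dumps({
            "prompt": prompt,
            "max_tokens_to_sample": 300,
        }),
    }


def invoke(bedrock_client, prompt, model_id=PROVISIONED_MODEL_ARN):
    """Call InvokeModel on the given client using the provisioned model ARN."""
    return bedrock_client.invoke_model(**build_invoke_kwargs(prompt, model_id))
```

The request body is unchanged from the original snippet; only the modelId value is swapped, so the rest of the application would not need to change under this assumption.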