
A company is building a global generative AI application using Amazon Bedrock. The application experiences high traffic during specific hours in different time zones, leading to throttling during peak usage. The company wants to ensure continuous operation without throttling while avoiding unnecessary spend during low-traffic periods. Which solution should the company implement?
A
Use Provisioned Throughput for the Amazon Bedrock model. Monitor the ProvisionedThroughputUtilization metric and adjust capacity based on usage patterns.
B
Implement a regional failover mechanism. Route traffic to a secondary AWS Region when throttling occurs in the primary Region.
C
Configure cross-Region inference in Amazon Bedrock. Monitor InvocationThrottles, InputTokenCount, and OutputTokenCount metrics.
D
Enable invocation logging in Amazon Bedrock. Monitor InvocationLatency, InvocationClientErrors, and InvocationServerErrors metrics. Distribute traffic across multiple versions of the same model.