
Explanation:
Option A is the correct solution because it provides proactive, model-aware token management with fine-grained visibility and alerting, which is required for regulated financial workloads. Amazon Bedrock currently exposes token usage metrics after invocation, but it does not natively enforce proactive, model-specific token limits across multiple applications or business units. By implementing model-specific tokenizers in AWS Lambda, the company can estimate input and output token usage before sending requests to Amazon Bedrock. This enables early detection of requests that are approaching or exceeding model limits, so the application can block, truncate, or reroute requests proactively rather than reacting to failures. Publishing token usage metrics to Amazon CloudWatch enables real-time monitoring and alerting at scale, easily supporting more than 5,000 requests per minute. Storing detailed token usage data in Amazon DynamoDB allows the company to attribute usage and costs to specific applications, teams, or business units, which is essential for regulatory reporting and internal chargeback.

Option B is incorrect because Amazon Bedrock Guardrails do not currently provide token quota enforcement or proactive token alerts. Option C is reactive and only analyzes failures after they occur. Option D throttles requests but cannot enforce token-based limits or provide per-model cost attribution.

Therefore, Option A best satisfies the proactive alerting, scalability, compliance reporting, and cost allocation requirements with acceptable operational effort.
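The pre-invocation check at the core of Option A can be sketched as follows. This is a minimal illustration, not a production implementation: the per-model limits, the 80% alert threshold, and the 4-characters-per-token heuristic are all assumptions chosen for the example (a real deployment would use each model's actual tokenizer and documented context window, and the Lambda function would additionally publish a CloudWatch metric and write a DynamoDB item for each request).

```python
# Sketch of the proactive token check described in Option A.
# ASSUMPTIONS: the model limits, the 0.8 alert threshold, and the
# chars-per-token heuristic below are illustrative, not official values.

MODEL_TOKEN_LIMITS = {
    "example.model-large": 200_000,   # hypothetical context-window limits
    "example.model-small": 8_000,
}

ALERT_THRESHOLD = 0.8  # alert when a request uses >80% of the model's limit


def estimate_tokens(text: str) -> int:
    """Rough estimate (~4 characters per token); a real solution would
    call the model-specific tokenizer instead of this heuristic."""
    return max(1, len(text) // 4)


def check_request(model_id: str, prompt: str) -> str:
    """Classify a prospective Bedrock request before invocation:
    'allow', 'alert' (approaching the limit), or 'block' (would exceed it)."""
    limit = MODEL_TOKEN_LIMITS[model_id]
    estimated = estimate_tokens(prompt)
    if estimated >= limit:
        return "block"   # reject, truncate, or reroute before invoking Bedrock
    if estimated >= limit * ALERT_THRESHOLD:
        return "alert"   # publish a CloudWatch metric to trigger an alarm
    return "allow"
```

In the Lambda handler, an `"alert"` or `"block"` result would drive a `PutMetricData` call to CloudWatch (feeding the threshold alarms) and a `PutItem` write to DynamoDB keyed by application or business unit for cost attribution.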
A financial services company uses multiple foundation models (FMs) through Amazon Bedrock for its generative AI (GenAI) applications. To comply with a new regulation for GenAI use with sensitive financial data, the company needs a token management solution.

The token management solution must proactively alert when applications approach model-specific token limits. The solution must also process more than 5,000 requests each minute and maintain token usage metrics to allocate costs across business units.

Which solution will meet these requirements?
A
Develop model-specific tokenizers in an AWS Lambda function. Configure the Lambda function to estimate token usage before sending requests to Amazon Bedrock. Configure the Lambda function to publish metrics to Amazon CloudWatch and trigger alarms when requests approach thresholds. Store detailed token usage in Amazon DynamoDB to report costs.
B
Implement Amazon Bedrock Guardrails with token quota policies. Capture metrics on rejected requests. Configure Amazon EventBridge rules to trigger notifications based on Amazon Bedrock Guardrails metrics. Use Amazon CloudWatch dashboards to visualize token usage trends across models.
C
Deploy an Amazon SQS dead-letter queue for failed requests. Configure an AWS Lambda function to analyze token-related failures. Use Amazon CloudWatch Logs Insights to generate reports on token usage patterns based on error logs from Amazon Bedrock API responses.
D
Use Amazon API Gateway to create a proxy for all Amazon Bedrock API calls. Configure request throttling based on custom usage plans with predefined token quotas. Configure API Gateway to reject requests that will exceed token limits.
E
None of the above
F
All of the above