
A company needs to ingest and handle large amounts of streaming data that its application generates. The application runs on Amazon EC2 instances and sends data to Amazon Kinesis Data Streams, which is configured with default settings. Every other day, the application consumes the data and writes the data to an Amazon S3 bucket for business intelligence (BI) processing. The company observes that Amazon S3 is not receiving all the data that the application sends to Kinesis Data Streams.
What should a solutions architect do to resolve this issue?
A. Update the Kinesis Data Streams default settings by modifying the data retention period.
B. Update the application to use the Kinesis Producer Library (KPL) to send the data to Kinesis Data Streams.
C. Update the number of Kinesis shards to handle the throughput of the data that is sent to Kinesis Data Streams.
D. Turn on S3 Versioning within the S3 bucket to preserve every version of every object that is ingested in the S3 bucket.
Explanation:
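The correct answer is A because:
Kinesis Data Streams Capacity: In provisioned mode, throughput depends on the number of shards. Each shard accepts writes of up to 1 MB per second or 1,000 records per second and serves reads of up to 2 MB per second. Exceeding these limits produces throttling errors (ProvisionedThroughputExceededException) that the producer can see and retry. Nothing in the scenario points to throttling.
Default Settings Issue: With default settings, a stream retains records for 24 hours. The application consumes the data only every other day, roughly every 48 hours, so at a steady ingest rate about half of the records have already expired when the consumer reads the stream.
Root Cause Analysis: The problem states that S3 is not receiving all the data sent to Kinesis. The data is lost in Kinesis, not in S3: records are ingested successfully but age out of the stream before the application consumes them. S3 Versioning (Option D) preserves multiple versions of objects that are already in the bucket, so it cannot recover records that never reached S3.
KPL vs Shard Scaling: KPL (Option B) helps with batching and retries on the producer side, and more shards (Option C) raise the throughput ceiling, but neither changes how long records stay in the stream. With a 24-hour retention period and a 48-hour consumption interval, records still expire before they are read.
Solution: Increasing the data retention period so that it exceeds the interval between consumer runs (48 hours or more in this scenario) keeps every record available until the application reads it and writes it to S3 for BI processing. Retention can be extended up to 8,760 hours (365 days).
Best Practice: Monitor GetRecords.IteratorAgeMilliseconds to see how far the consumer lags behind the stream, and keep the retention period comfortably above that lag. Separately, monitor WriteProvisionedThroughputExceeded and ReadProvisionedThroughputExceeded to determine when to scale shards. Consider using Kinesis Data Streams On-Demand mode for automatic scaling if the workload is unpredictable.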
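To make the options concrete, the sketch below works through the arithmetic behind Options A and C and shows one way the retention change could be applied. It is a minimal illustration, not a definitive implementation. The function names, the 24-hour safety margin, and the reading of "every other day" as a 48-hour interval are assumptions for illustration. The client argument is assumed to be a boto3 Kinesis client; the two methods called on it, describe_stream_summary and increase_stream_retention_period, are existing boto3 operations.

```python
import math

DEFAULT_RETENTION_HOURS = 24        # Kinesis Data Streams default retention
MAX_RETENTION_HOURS = 8760          # 365 days, the maximum retention
SHARD_WRITE_MB_PER_SEC = 1.0        # per-shard write limit in provisioned mode
SHARD_WRITE_RECORDS_PER_SEC = 1000  # per-shard record limit in provisioned mode


def hours_of_data_lost(retention_hours, consume_interval_hours):
    """Hours of records that expire between two consumer runs.

    Simplified model (assumption): the consumer reads once per interval
    and can only see records younger than the retention period.
    """
    return max(0.0, float(consume_interval_hours - retention_hours))


def min_shards_for_writes(mb_per_sec, records_per_sec):
    """Smallest shard count whose combined write limits cover the load."""
    by_bytes = math.ceil(mb_per_sec / SHARD_WRITE_MB_PER_SEC)
    by_records = math.ceil(records_per_sec / SHARD_WRITE_RECORDS_PER_SEC)
    return max(1, by_bytes, by_records)


def extend_retention(client, stream_name, consume_interval_hours, margin_hours=24):
    """Raise retention to cover the consume interval plus a margin.

    Returns the retention period (in hours) in effect after the call.
    margin_hours is a hypothetical safety buffer, not an AWS setting.
    """
    target = min(MAX_RETENTION_HOURS, consume_interval_hours + margin_hours)
    summary = client.describe_stream_summary(StreamName=stream_name)
    current = summary["StreamDescriptionSummary"]["RetentionPeriodHours"]
    if current < target:
        # The API only accepts a value larger than the current retention.
        client.increase_stream_retention_period(
            StreamName=stream_name, RetentionPeriodHours=target
        )
        return target
    return current
```

Under this model, hours_of_data_lost(DEFAULT_RETENTION_HOURS, 48) reports 24 hours of expired records per cycle no matter how many shards the stream has, while min_shards_for_writes only answers the separate question of whether writes would be throttled. With a real client, for example boto3.client("kinesis"), extend_retention(client, "my-stream", 48) would request a 72-hour retention period; the stream name here is a placeholder.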