
A payment processing company records all voice communication with its customers and stores the audio files in an Amazon S3 bucket. The company needs to capture the text from the audio files. The company must remove from the text any personally identifiable information (PII) that belongs to customers.
What should a solutions architect do to meet these requirements?
A
Process the audio files by using Amazon Kinesis Video Streams. Use an AWS Lambda function to scan for known PII patterns.
B
When an audio file is uploaded to the S3 bucket, invoke an AWS Lambda function to start an Amazon Transcribe task to analyze the call recordings.
C
Configure an Amazon Transcribe transcription job with PII redaction turned on. When an audio file is uploaded to the S3 bucket, invoke an AWS Lambda function to start the transcription job. Store the output in a separate S3 bucket.
D
Create an Amazon Connect contact flow that ingests the audio files with transcription turned on. Embed an AWS Lambda function to scan for known PII patterns. Use Amazon EventBridge to start the contact flow when an audio file is uploaded to the S3 bucket.
Explanation:
Correct Answer: C
Amazon Transcribe has built-in redaction that automatically identifies and removes personally identifiable information (PII), such as names, addresses, and credit card numbers, from the transcript. Option C is the most appropriate and efficient solution because:
PII Redaction in Amazon Transcribe: Redaction is an option that is enabled when the transcription job is created. With it turned on, Amazon Transcribe detects PII entities and replaces them with a [PII] tag in the transcribed text, so no custom detection logic is needed.
Event-Driven Architecture: Using an AWS Lambda function triggered by S3 upload events ensures automatic processing when new audio files are added to the bucket.
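Secure Storage: Storing the output in a separate S3 bucket follows security best practices by isolating the redacted transcripts from the raw audio files, which still contain PII. It also prevents the transcript files from triggering the upload event on the source bucket.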
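The flow described in option C can be sketched as a Lambda handler in Python. This is a minimal sketch rather than a reference implementation: the output bucket name, the en-US language code, and the helper names build_job_request and make_job_name are assumptions made for illustration and do not come from the question. The ContentRedaction setting is the part that turns on PII redaction.

```python
import re
import urllib.parse
import uuid

# Hypothetical name for the separate output bucket (an assumption).
OUTPUT_BUCKET = "example-redacted-transcripts"


def build_job_request(bucket: str, key: str, job_name: str) -> dict:
    """Build keyword arguments for the Transcribe StartTranscriptionJob call."""
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": f"s3://{bucket}/{key}"},
        "LanguageCode": "en-US",  # assumption: calls are in US English
        "OutputBucketName": OUTPUT_BUCKET,
        "ContentRedaction": {
            "RedactionType": "PII",
            "RedactionOutput": "redacted",  # store only the redacted transcript
        },
    }


def make_job_name(key: str) -> str:
    """Derive a unique job name using only letters, digits, '.', '_' and '-'."""
    safe = re.sub(r"[^0-9a-zA-Z._-]", "-", key)[:150]
    return f"{safe}-{uuid.uuid4().hex[:8]}"


def lambda_handler(event, context):
    # Imported here so the helpers above can be exercised without AWS access.
    import boto3

    transcribe = boto3.client("transcribe")
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # Object keys in S3 event notifications are URL-encoded.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        transcribe.start_transcription_job(
            **build_job_request(bucket, key, make_job_name(key))
        )
```

The function's execution role would need permission to start transcription jobs, read the source bucket, and write to the output bucket.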
Why other options are incorrect:
A: Amazon Kinesis Video Streams is designed for ingesting live video and audio streams, not for batch processing of stored audio files, and it does not transcribe speech to text. Additionally, using Lambda to scan for PII patterns would require custom pattern matching, which is less reliable than Amazon Transcribe's built-in PII detection.
B: While this option uses Amazon Transcribe, it doesn't mention enabling PII redaction, which is a critical requirement. The transcription would capture all text including PII, violating the requirement to remove PII.
D: Amazon Connect is a contact center service designed for real-time customer interactions, not for batch processing of stored audio files. This solution is overly complex and not appropriate for processing existing audio files stored in S3.
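To illustrate why the custom pattern scanning in options A and D is less reliable, here is a small Python sketch. The regular expression and the sample sentence are invented for this example. A pattern can catch a well-formatted card number, but free-form PII such as a customer's name has no fixed pattern to match.

```python
import re

# Illustrative pattern: 13 to 16 digits with optional spaces or dashes between them.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")


def redact_cards(text: str) -> str:
    """Replace anything that looks like a card number with a [PII] tag."""
    return CARD_RE.sub("[PII]", text)


sample = "My name is Jane Doe and my card is 4111 1111 1111 1111."
print(redact_cards(sample))
# prints: My name is Jane Doe and my card is [PII].
```

The card number is removed, but the name stays in the text, so the requirement to remove customer PII would not be met by pattern matching alone.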
Key AWS Services Used: