
Answer-first summary for fast verification
Answer: Use Amazon Textract to extract the text from the reports. Use Amazon Comprehend Medical to identify the PHI from the extracted text.
## Explanation

**Correct Answer: C**

**Why Option C is correct:**

1. **Amazon Textract** is specifically designed for extracting text from documents (PDF, JPEG, etc.) with high accuracy, handling both structured and unstructured documents.
2. **Amazon Comprehend Medical** is a specialized AWS service for identifying and extracting protected health information (PHI) and other medical information from text. It's pre-trained to recognize medical terminology, drug names, conditions, and PHI elements.
3. This combination provides a fully managed, serverless solution with minimal operational overhead - no need to train models, manage infrastructure, or maintain custom code.

**Why other options are incorrect:**

- **Option A:** Using existing Python libraries would require significant development effort, maintenance, and may not handle the complexity of PHI detection accurately. This has the highest operational overhead.
- **Option B:** Amazon SageMaker requires building, training, and deploying custom machine learning models for PHI detection, which involves significant operational overhead for model development, training, and maintenance.
- **Option D:** Amazon Rekognition is primarily for image and video analysis, not optimized for document text extraction. While it can extract text from images, Textract is specifically designed for document processing and is more accurate for this use case.

**Key AWS Services:**

- **Amazon Textract:** Document text extraction
- **Amazon Comprehend Medical:** PHI and medical entity detection
- **Serverless architecture:** API Gateway + Lambda provides scalability and cost-efficiency

**Architecture Flow:**

1. PDF/JPEG reports uploaded via API Gateway
2. Lambda triggers Textract for text extraction
3. Extracted text sent to Comprehend Medical for PHI detection
4. Results processed in Lambda function

This solution minimizes operational overhead by leveraging fully managed AWS services that require no infrastructure management.
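The flow described in Option C could be sketched in the Lambda function roughly as follows. This is a minimal illustration, not the hospital's actual code: it assumes API Gateway passes the report as a base64-encoded request body, that each report is a JPEG or single-page PDF small enough for Textract's synchronous `detect_document_text` call, and that the extracted text fits in one Comprehend Medical `detect_phi` request. The helper names (`extract_text`, `find_phi`), the `min_score` threshold, and the optional client parameters on the handler are hypothetical and exist only to keep the example testable.

```python
import base64
import json
from typing import Any, Dict, List


def extract_text(textract: Any, document_bytes: bytes) -> str:
    """Call Textract synchronously and join the text of all LINE blocks."""
    response = textract.detect_document_text(Document={"Bytes": document_bytes})
    lines = [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]
    return "\n".join(lines)


def find_phi(comprehend_medical: Any, text: str,
             min_score: float = 0.5) -> List[Dict[str, Any]]:
    """Return PHI entities whose confidence is at least min_score (assumed cutoff)."""
    response = comprehend_medical.detect_phi(Text=text)
    return [
        {"type": entity["Type"], "text": entity["Text"], "score": entity["Score"]}
        for entity in response.get("Entities", [])
        if entity.get("Score", 0.0) >= min_score
    ]


def lambda_handler(event, context, textract=None, comprehend_medical=None):
    """Hypothetical handler: decode the upload, extract text, detect PHI."""
    if textract is None or comprehend_medical is None:
        import boto3  # imported lazily so the helpers can be tested without AWS
        textract = textract or boto3.client("textract")
        comprehend_medical = comprehend_medical or boto3.client("comprehendmedical")

    body = event.get("body") or ""
    # Assumption: binary uploads arrive base64-encoded from API Gateway.
    if event.get("isBase64Encoded"):
        document_bytes = base64.b64decode(body)
    else:
        document_bytes = body.encode("utf-8")
    if not document_bytes:
        return {"statusCode": 400, "body": json.dumps({"error": "empty document"})}

    text = extract_text(textract, document_bytes)
    entities = find_phi(comprehend_medical, text) if text else []
    return {"statusCode": 200, "body": json.dumps({"phi": entities})}
```

Neither service requires a model to be trained or hosted, which is the point of the "least operational overhead" argument. Multi-page PDFs or very long reports would likely need Textract's asynchronous API and chunked calls to Comprehend Medical, which this sketch does not cover.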
Author: LeetQuiz Editorial Team
A hospital recently deployed a RESTful API with Amazon API Gateway and AWS Lambda. The hospital uses API Gateway and Lambda to upload reports that are in PDF format and JPEG format. The hospital needs to modify the Lambda code to identify protected health information (PHI) in the reports.
Which solution will meet these requirements with the LEAST operational overhead?
A. Use existing Python libraries to extract the text from the reports and to identify the PHI from the extracted text.
B. Use Amazon Textract to extract the text from the reports. Use Amazon SageMaker to identify the PHI from the extracted text.
C. Use Amazon Textract to extract the text from the reports. Use Amazon Comprehend Medical to identify the PHI from the extracted text.
D. Use Amazon Rekognition to extract the text from the reports. Use Amazon Comprehend Medical to identify the PHI from the extracted text.