AWS Certified Solutions Architect - Associate

Explanation:

Correct Answer: C

Why Option C is correct:

Amazon Textract is purpose-built for extracting text from documents such as PDF and JPEG files, and it handles both structured and unstructured layouts.
Amazon Comprehend Medical is a managed AWS service that identifies protected health information (PHI) and other medical information in text. It is pre-trained to recognize medical terminology, drug names, conditions, and PHI elements.
Together they form a fully managed solution with minimal operational overhead: there are no models to train, no infrastructure to manage, and no custom extraction or detection logic to maintain.
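
As a rough illustration of how little code the two services need, the sketch below wraps each call in a small helper. It assumes the boto3 operations `detect_document_text` (Textract) and `detect_phi` (Comprehend Medical); the helper names `extract_text`, `find_phi`, and `make_clients`, and the `min_score` threshold, are hypothetical and not part of the question. It does not handle service size quotas, and multi-page PDFs would likely need Textract's asynchronous API instead.

```python
def extract_text(textract, document_bytes):
    """Return the text Textract detects, one detected line per row."""
    response = textract.detect_document_text(Document={"Bytes": document_bytes})
    # Textract returns PAGE, LINE, and WORD blocks; LINE blocks already
    # contain the words, so keeping only LINE avoids duplicated text.
    lines = [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]
    return "\n".join(lines)


def find_phi(comprehend_medical, text, min_score=0.5):
    """Return PHI entities whose confidence score is at least min_score.

    min_score is an illustrative threshold, not a recommended value.
    """
    response = comprehend_medical.detect_phi(Text=text)
    return [
        {"text": entity["Text"], "type": entity["Type"], "score": entity["Score"]}
        for entity in response.get("Entities", [])
        if entity["Score"] >= min_score
    ]


def make_clients():
    """Create the two service clients (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helpers above can be tested offline

    return boto3.client("textract"), boto3.client("comprehendmedical")
```

Passing the clients in as arguments is a design choice for this sketch: it lets the helpers be exercised with stand-in objects, without AWS access.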

Why other options are incorrect:

Option A: Using existing Python libraries would require significant development effort and ongoing maintenance, and the libraries may not detect PHI accurately. This option has the highest operational overhead.

Option B: Amazon SageMaker would require building, training, and deploying a custom machine learning model for PHI detection, and then maintaining it, which adds significant operational overhead.

Option D: Amazon Rekognition is built for image and video analysis rather than document text extraction. It can detect text in JPEG and PNG images, but it does not accept PDF input, whereas Textract is designed for document processing and handles both formats named in the question.

Key AWS Services:

Amazon Textract: Document text extraction
Amazon Comprehend Medical: PHI and medical entity detection
Serverless architecture: API Gateway and Lambda scale automatically and are billed by usage

Architecture Flow:

PDF and JPEG reports are uploaded through API Gateway
The Lambda function calls Textract to extract the text
The Lambda function sends the extracted text to Comprehend Medical for PHI detection
The Lambda function processes the results

This solution minimizes operational overhead by leveraging fully managed AWS services that require no infrastructure management.
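
The flow above can be sketched as a single Lambda handler. This is a minimal sketch, assuming an API Gateway Lambda proxy integration (which passes binary uploads base64-encoded in `event["body"]` with `isBase64Encoded` set) and single-page documents small enough for Textract's synchronous `detect_document_text` call. `build_handler` and the response fields `phiCount` and `phi` are hypothetical names chosen for illustration.

```python
import base64
import json


def build_handler(textract, comprehend_medical):
    """Create a Lambda-style handler bound to the given service clients."""

    def handler(event, context):
        # Decode the uploaded report from the API Gateway proxy event.
        body = event.get("body") or ""
        if event.get("isBase64Encoded"):
            data = base64.b64decode(body)
        else:
            data = body.encode("utf-8")
        if not data:
            return {"statusCode": 400, "body": json.dumps({"error": "empty upload"})}

        # Extract the text with Textract, keeping LINE blocks only.
        blocks = textract.detect_document_text(Document={"Bytes": data}).get("Blocks", [])
        text = "\n".join(b["Text"] for b in blocks if b.get("BlockType") == "LINE")

        # Send the extracted text to Comprehend Medical for PHI detection.
        entities = comprehend_medical.detect_phi(Text=text).get("Entities", []) if text else []

        # Process the results: report type and character offsets only,
        # so the response does not echo the PHI values themselves.
        phi = [
            {"type": e["Type"], "begin": e["BeginOffset"], "end": e["EndOffset"]}
            for e in entities
        ]
        return {
            "statusCode": 200,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"phiCount": len(phi), "phi": phi}),
        }

    return handler
```

In a deployed function, the handler would be created once at module load, for example `lambda_handler = build_handler(boto3.client("textract"), boto3.client("comprehendmedical"))`, so the clients are reused across invocations.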

A hospital recently deployed a RESTful API with Amazon API Gateway and AWS Lambda. The hospital uses API Gateway and Lambda to upload reports that are in PDF format and JPEG format. The hospital needs to modify the Lambda code to identify protected health information (PHI) in the reports.

Which solution will meet these requirements with the LEAST operational overhead?

Last updated: February 23, 2026 at 11:39

A. Use existing Python libraries to extract the text from the reports and to identify the PHI from the extracted text.

B. Use Amazon Textract to extract the text from the reports. Use Amazon SageMaker to identify the PHI from the extracted text.

C. Use Amazon Textract to extract the text from the reports. Use Amazon Comprehend Medical to identify the PHI from the extracted text.

D. Use Amazon Rekognition to extract the text from the reports. Use Amazon Comprehend Medical to identify the PHI from the extracted text.