
AWS Certified Data Engineer - Associate
You are working on a data pipeline that processes data from a healthcare provider. The data includes patient records with sensitive information. You have been tasked with ensuring the data quality of the patient records dataset. Describe the steps you would take to run data quality checks on the patient records dataset and explain how you would define data quality rules to ensure the data is accurate and complete.
Explanation:
Option C is correct. To ensure the quality of the patient records dataset, define data quality rules with AWS Glue DataBrew: create a new project, select the dataset, and specify rules that check the records for accuracy and completeness (for example, that required fields such as patient identifiers are never null and that dates of birth fall within a valid range). Manually inspecting each patient record (Option A) does not scale to large datasets. Writing custom validation scripts (Option B) is time-consuming and is unlikely to cover every possible data quality issue. Skipping data quality checks altogether (Option D) is not acceptable, since it leads to poor data quality and incorrect downstream analysis.
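As a rough sketch, a DataBrew ruleset like this can also be created programmatically with the boto3 DataBrew client (create_ruleset is a real API call). The dataset ARN, column names, and the exact check-expression syntax below are illustrative assumptions, not a verified recipe; confirm expression grammar against the DataBrew ruleset documentation.

```python
import boto3

databrew = boto3.client("databrew")

# Hypothetical ruleset for the patient records dataset. The ARN,
# column names, and check expressions are assumptions for illustration.
response = databrew.create_ruleset(
    Name="patient-records-quality-rules",
    TargetArn="arn:aws:databrew:us-east-1:111122223333:dataset/patient-records",
    Rules=[
        {
            # Completeness: every record needs a patient identifier.
            "Name": "patient-id-present",
            "CheckExpression": ":col1 is not null",
            "SubstitutionMap": {":col1": "`patient_id`"},
        },
        {
            # Validity: dates of birth must fall in a plausible range.
            "Name": "dob-in-range",
            "CheckExpression": ":col1 between :val1 and :val2",
            "SubstitutionMap": {
                ":col1": "`date_of_birth`",
                ":val1": "'1900-01-01'",
                ":val2": "'2024-12-31'",
            },
        },
    ],
)
print("Created ruleset:", response["Name"])
```

Once a ruleset is associated with a DataBrew profile job, it is evaluated on each run and the resulting data quality validation report flags the rules that fail, so checks on sensitive patient data can run continuously rather than as one-off inspections.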