
Answer-first summary for fast verification
Answer: Define data quality rules using AWS Glue DataBrew by creating a new project, selecting the order records dataset, and specifying rules that identify and resolve data discrepancies related to delivery times.
Option C is the correct answer. AWS Glue DataBrew is a visual data-preparation service that lets you define reusable data quality rules without writing code. By creating a new project, selecting the order records dataset, and specifying rules that flag discrepancies in delivery times (for example, missing, negative, or implausibly large values), you can maintain the integrity of the order records as they flow through the pipeline. Manually inspecting each order record (Option A) does not scale to large datasets. Writing custom scripts in AWS Glue (Option B) can work, but scripts are time-consuming to build and maintain and may miss entire classes of quality issues. Skipping data quality checks entirely (Option D) is not recommended, as it invites silent data corruption and incorrect downstream analysis.
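As a rough sketch of what such rules could look like programmatically, the snippet below builds a DataBrew ruleset for delivery-time checks using boto3's `create_ruleset` API. The ruleset name, column names (`delivery_time`, `shipped_at`, `delivered_at`), the 30-day threshold, and the dataset ARN are all illustrative assumptions, not values given in the question.

```python
# Hypothetical delivery-time quality rules in DataBrew's check-expression
# syntax; substitution variables (:col1, :val1, ...) map to columns/values.
delivery_time_rules = [
    {
        # Assumed column name: delivery time in days must not be negative.
        "Name": "delivery_time_non_negative",
        "CheckExpression": ":col1 >= :val1",
        "SubstitutionMap": {":col1": "`delivery_time`", ":val1": "0"},
    },
    {
        # Assumed threshold: flag implausibly long deliveries (> 30 days).
        "Name": "delivery_time_within_bounds",
        "CheckExpression": ":col1 <= :val1",
        "SubstitutionMap": {":col1": "`delivery_time`", ":val1": "30"},
    },
    {
        # Assumed timestamp columns: delivery cannot precede shipment.
        "Name": "delivered_after_shipped",
        "CheckExpression": ":col1 >= :col2",
        "SubstitutionMap": {":col1": "`delivered_at`", ":col2": "`shipped_at`"},
    },
]


def create_order_quality_ruleset(client, dataset_arn):
    """Attach the rules above to the order records dataset (ARN assumed)."""
    return client.create_ruleset(
        Name="order-records-delivery-checks",
        TargetArn=dataset_arn,
        Rules=delivery_time_rules,
    )


# Usage (requires AWS credentials and a real dataset ARN; not run here):
# import boto3
# resp = create_order_quality_ruleset(
#     boto3.client("databrew"),
#     "arn:aws:databrew:us-east-1:111122223333:dataset/order-records",
# )
```

Once the ruleset exists, you would run a DataBrew profile job with validation enabled against the dataset; rows that fail a rule surface in the validation report, where the discrepancies can be reviewed and resolved.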
Author: LeetQuiz Editorial Team
Your team is working on a data pipeline that processes data from a supply chain management system. The data includes order records with information about product shipments and delivery times. You have been tasked with ensuring the data quality of the order records dataset. Describe the steps you would take to run data quality checks on the order records dataset and explain how you would define data quality rules to identify and resolve data discrepancies related to delivery times.
A
Run data quality checks by manually inspecting each order record and identifying discrepancies in delivery times.
B
Use AWS Glue to run data quality checks by writing custom scripts that identify discrepancies in delivery times based on specific conditions.
C
Define data quality rules using AWS Glue DataBrew by creating a new project, selecting the order records dataset, and specifying rules to identify and resolve data discrepancies related to delivery times.
D
Ignore data quality checks and assume the delivery times are accurate.