You are working on a data pipeline that processes data from a retail company. The data includes inventory records with product information. You have been tasked with ensuring the data quality of the inventory records dataset. Describe the steps you would take to run data quality checks on the inventory records dataset and explain how you would define data quality rules to ensure the data is up-to-date and accurate.
Explanation:
Option C is the correct answer. To ensure the quality of the inventory records dataset, define data quality rules in AWS Glue DataBrew. Create a new project, select the inventory dataset, and specify rules that capture what "up-to-date and accurate" means for this data, for example that required fields are not missing, quantities are not negative, and last-updated timestamps fall within an expected window. DataBrew evaluates these rules against the dataset when a profile job runs and reports which rules pass or fail, so the checks are repeatable and scale with the data. Manually inspecting each inventory record (Option A) does not scale to large datasets. Writing custom scripts (Option B) is time-consuming to build and maintain, and the scripts may miss data quality issues that nobody thought to code for. Skipping data quality checks (Option D) lets stale or incorrect records flow downstream and leads to incorrect analysis.
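To make the idea of a data quality rule concrete, the following is a minimal Python sketch of the kind of checks such rules express. It is not the DataBrew API: the field names (sku, quantity, unit_price, last_updated), the seven-day freshness window, the sample records, and the helper functions are all hypothetical and chosen only for illustration. Each rule is a predicate over one record, and a rule passes when the share of records satisfying it meets a threshold.

```python
from datetime import datetime, timedelta


def build_rules(now, max_age):
    """Return hypothetical rules as a mapping of rule name to record predicate."""
    return {
        # Accuracy: the product identifier must be present and non-empty.
        "sku_present": lambda r: bool(r.get("sku")),
        # Accuracy: stock counts cannot be negative.
        "quantity_non_negative": lambda r: isinstance(r.get("quantity"), int)
        and r["quantity"] >= 0,
        # Accuracy: prices must be positive numbers.
        "price_positive": lambda r: isinstance(r.get("unit_price"), (int, float))
        and r["unit_price"] > 0,
        # Freshness: the record must have been updated within max_age.
        "recently_updated": lambda r: r.get("last_updated") is not None
        and now - r["last_updated"] <= max_age,
    }


def evaluate(records, rules):
    """Return the fraction of records that pass each rule."""
    total = len(records)
    if total == 0:
        return {name: 0.0 for name in rules}
    return {
        name: sum(1 for r in records if check(r)) / total
        for name, check in rules.items()
    }


def failed_rules(pass_rates, threshold=1.0):
    """Return the sorted names of rules whose pass rate is below the threshold."""
    return sorted(name for name, rate in pass_rates.items() if rate < threshold)


# Hypothetical sample data; "now" is fixed so the result is reproducible.
now = datetime(2024, 1, 10)
records = [
    {"sku": "A-1", "quantity": 5, "unit_price": 9.99, "last_updated": datetime(2024, 1, 9)},
    {"sku": "A-2", "quantity": -3, "unit_price": 4.50, "last_updated": datetime(2024, 1, 8)},
    {"sku": "", "quantity": 0, "unit_price": 2.00, "last_updated": datetime(2023, 11, 1)},
    {"sku": "A-4", "quantity": 12, "unit_price": 1.25, "last_updated": datetime(2024, 1, 10)},
]

rules = build_rules(now, max_age=timedelta(days=7))
pass_rates = evaluate(records, rules)
failed = failed_rules(pass_rates)

for name, rate in pass_rates.items():
    print(f"{name}: {rate:.0%} of records pass")
print("Failed rules:", failed)
```

In a managed tool the same rules would be declared rather than hand-coded, which is the point of Option C. The sketch only shows what each rule asserts about a record and how a pass-rate threshold turns individual checks into a pass or fail result for the dataset.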