
Answer-first summary for fast verification
Answer: Define data quality rules using AWS Glue DataBrew by creating a new project, selecting the property listings dataset, and specifying rules to ensure the data is accurate and up-to-date.
Option C is the correct answer. AWS Glue DataBrew lets you define data quality rules declaratively: create a new project, select the property listings dataset, and specify rules (for example, prices must be positive and listing dates must fall within a freshness window) that are validated automatically on every run, maintaining the integrity of the listings. Manually inspecting each property listing (Option A) does not scale to large datasets. Writing custom scripts (Option B) is time-consuming to build and maintain and may not cover all data quality issues. Ignoring data quality checks (Option D) risks poor data quality and incorrect analysis.
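As a rough sketch of what this setup could look like programmatically, the snippet below builds the request payloads that boto3's DataBrew client would accept for registering the dataset and defining a ruleset. The bucket, key, ARN, and column names are hypothetical, and the payloads are only constructed, not sent, so no AWS credentials are needed:

```python
# Sketch of AWS Glue DataBrew request payloads for data quality rules.
# All bucket names, ARNs, and column names below are hypothetical examples.

def build_dataset_request():
    # Payload for databrew.create_dataset: register the listings CSV in S3.
    return {
        "Name": "property-listings",
        "Format": "CSV",
        "Input": {
            "S3InputDefinition": {
                "Bucket": "example-real-estate-data",  # hypothetical bucket
                "Key": "listings/listings.csv",
            }
        },
    }

def build_ruleset_request(dataset_arn):
    # Payload for databrew.create_ruleset: declarative accuracy and
    # completeness checks expressed with substitution variables.
    return {
        "Name": "property-listings-quality",
        "TargetArn": dataset_arn,
        "Rules": [
            {   # Accuracy: every listing must have a positive price.
                "Name": "price-is-positive",
                "CheckExpression": ":col1 > :val1",
                "SubstitutionMap": {":col1": "`price`", ":val1": "0"},
            },
            {   # Completeness: no listing may be missing an address.
                "Name": "address-not-missing",
                "CheckExpression": ":col1 is not null",
                "SubstitutionMap": {":col1": "`address`"},
            },
        ],
    }

# With boto3 these payloads would be sent roughly as:
#   client = boto3.client("databrew")
#   client.create_dataset(**build_dataset_request())
#   client.create_ruleset(**build_ruleset_request(dataset_arn))
# and then validated by a profile job that references the ruleset
# in its ValidationConfigurations.
```

Keeping the rules in a ruleset attached to the dataset, rather than in ad-hoc scripts, is what makes Option C preferable to Option B: the checks run on every profile job without extra code to maintain.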
Author: LeetQuiz Editorial Team
Your team is working on a data pipeline that processes data from a real estate company. The data includes property listings with information about property features and prices. You have been tasked with ensuring the data quality of the property listings dataset. Describe the steps you would take to run data quality checks on the property listings dataset and explain how you would define data quality rules to ensure the data is accurate and up-to-date.
A
Run data quality checks by manually inspecting each property listing and identifying any outdated or inaccurate information.
B
Use AWS Glue to run data quality checks by writing custom scripts that identify outdated or inaccurate information in the property listings.
C
Define data quality rules using AWS Glue DataBrew by creating a new project, selecting the property listings dataset, and specifying rules to ensure the data is accurate and up-to-date.
D
Ignore data quality checks and assume the data is accurate and up-to-date.