
You are a data engineer for a large e-commerce platform that processes millions of transactions daily. The company is expanding into new markets and needs to ensure that its data is accurate, consistent, and compliant with international data protection regulations. The dataset includes customer information, transaction details, and product inventory. Given the scale of data and the need for compliance, which of the following approaches would BEST implement a data cleansing process to meet these requirements? Choose one option.
A. Use a basic SQL script to identify and remove duplicate customer records without further analysis.
B. Develop a comprehensive data validation framework that includes checks for data type consistency, removal of outliers, correction of data entry errors, and validation against international data protection standards.
C. Automatically exclude any transaction records that do not match predefined criteria, without manual review or consideration for data recovery.
D. Assign a team to manually inspect each record for accuracy and compliance, despite the volume of data.
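
For illustration only, the kinds of checks named in the options (type consistency, outlier handling, correction of entry errors, and a compliance rule) could be sketched roughly as below. Everything in it is an assumption rather than part of the question: the field names, the `EXPECTED_TYPES` schema, the median-based outlier rule with `k=10.0`, and the `consent` flag standing in for a data protection check are all hypothetical. Failing records are quarantined with reasons rather than deleted, so they remain available for review.

```python
from dataclasses import dataclass, field
from statistics import median

# Hypothetical schema: field name -> expected Python type (exact match).
EXPECTED_TYPES = {
    "customer_id": int,
    "email": str,
    "country": str,
    "amount": float,
    "consent": bool,
}


@dataclass
class ValidationResult:
    clean: list = field(default_factory=list)
    quarantined: list = field(default_factory=list)  # (record, reasons) pairs


def normalize(record):
    """Correct simple entry errors: stray whitespace and inconsistent case."""
    fixed = dict(record)
    if isinstance(fixed.get("email"), str):
        fixed["email"] = fixed["email"].strip().lower()
    if isinstance(fixed.get("country"), str):
        fixed["country"] = fixed["country"].strip().upper()
    return fixed


def type_errors(record):
    """Return one reason per field that is missing or has the wrong type."""
    reasons = []
    for name, expected in EXPECTED_TYPES.items():
        if name not in record:
            reasons.append(f"missing field: {name}")
        elif type(record[name]) is not expected:
            reasons.append(f"wrong type for {name}")
    return reasons


def outlier_threshold(amounts, k=10.0):
    """Upper bound: median + k * median absolute deviation (one simple rule)."""
    med = median(amounts)
    mad = median(abs(a - med) for a in amounts)
    return med + k * mad


def validate(records, k=10.0):
    """Split records into clean and quarantined; nothing is silently dropped."""
    result = ValidationResult()
    typed = []
    for raw in records:
        rec = normalize(raw)
        reasons = type_errors(rec)
        if reasons:
            result.quarantined.append((rec, reasons))
        else:
            typed.append(rec)

    # The threshold is computed only from records that passed the type checks.
    limit = outlier_threshold([r["amount"] for r in typed], k) if typed else None
    for rec in typed:
        reasons = []
        if rec["amount"] > limit:
            reasons.append("amount outlier")
        if not rec["consent"]:
            # Placeholder rule; real regulations require far more than a flag.
            reasons.append("no recorded consent")
        if reasons:
            result.quarantined.append((rec, reasons))
        else:
            result.clean.append(rec)
    return result
```

At the scale described in the question, the same rules would more plausibly run inside a distributed pipeline than in a single Python process; the sketch only shows how the individual checks fit together.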