
As a Microsoft Fabric Analytics Engineer Associate, you are tasked with implementing a stored procedure in a lakehouse for data validation and cleaning on a dataset containing customer information. The dataset includes fields such as customer ID, name, email, and phone number. The solution must ensure data quality while considering cost efficiency and scalability. Which of the following approaches would you choose to implement a stored procedure that best meets these requirements? (Choose one option)
A
Implement a stored procedure that only checks for null values in each field and removes any rows with missing data, focusing on minimizing processing time.
B
Implement a stored procedure that validates the format of each field, such as email and phone number, and removes any rows with invalid data, ensuring data format consistency.
C
Implement a stored procedure that checks for duplicate customer IDs and removes any duplicate rows, prioritizing data uniqueness.
D
Implement a stored procedure that performs a combination of data validation and cleaning techniques, including checking for null values, validating formats, and removing duplicates, to ensure comprehensive data quality management.
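To make the combined approach in option D concrete, here is a minimal sketch in plain Python rather than in a lakehouse stored procedure. It is an illustration only. The field names (`customer_id`, `name`, `email`, `phone`), the regular expressions, and the function `clean_customers` are all assumptions, not part of the question or of any Microsoft Fabric API. The email and phone patterns are deliberately simple and would likely need tightening for real data.

```python
import re

# Hypothetical, deliberately loose patterns (assumptions for illustration).
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
PHONE_RE = re.compile(r"^\+?\d{7,15}$")

# Assumed field names; the question only names the fields informally.
REQUIRED = ("customer_id", "name", "email", "phone")


def clean_customers(rows):
    """Return rows that pass null, format, and uniqueness checks."""
    seen_ids = set()
    cleaned = []
    for row in rows:
        # 1. Null check: drop rows with any missing or empty required field.
        if any(row.get(field) in (None, "") for field in REQUIRED):
            continue
        # 2. Format check: email must match the simple pattern.
        if not EMAIL_RE.match(row["email"]):
            continue
        # Strip common separators before validating the phone number.
        phone = re.sub(r"[\s\-().]", "", row["phone"])
        if not PHONE_RE.match(phone):
            continue
        # 3. Duplicate check: keep only the first row per customer ID.
        if row["customer_id"] in seen_ids:
            continue
        seen_ids.add(row["customer_id"])
        cleaned.append({**row, "phone": phone})
    return cleaned


sample = [
    {"customer_id": 1, "name": "Ann", "email": "ann@example.com", "phone": "+1 555-010-0000"},
    {"customer_id": 1, "name": "Ann", "email": "ann@example.com", "phone": "+1 555-010-0000"},
    {"customer_id": 2, "name": None, "email": "x@example.com", "phone": "5550100001"},
    {"customer_id": 3, "name": "Bob", "email": "bob-at-example", "phone": "5550100002"},
    {"customer_id": 4, "name": "Cy", "email": "cy@example.com", "phone": "12"},
]
result = clean_customers(sample)
```

Running the cheap null check first and the duplicate check last means a row with bad data never claims a customer ID that a later valid row could have used. That ordering is a design choice in this sketch, not a requirement stated in the question.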