
In a data transformation project, you discover a significant amount of duplicate data that needs to be resolved. Which option best describes the steps you would take to identify and resolve these duplicates, including the tools and techniques you would use?
A. Ignore duplicates as they do not affect the overall data analysis.
B. Identify duplicates using SQL queries or data profiling tools, then resolve by either removing duplicates or merging them based on specific criteria.
C. Delete the entire dataset if duplicates are found.
D. Mark duplicates without removing or merging them to keep all data intact.
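
The approach in option B (identify duplicates with SQL, then remove or merge them by a chosen rule) can be sketched as follows. This is a minimal illustration, not a prescribed method: the `customers` table, its `email` and `phone` columns, the choice of `email` as the matching key, and the rule "keep the earliest row and fill its missing phone from a duplicate" are all assumptions made for the example.

```python
import sqlite3


def find_duplicates(conn):
    """Return (email, count) for every email that appears more than once."""
    return conn.execute(
        "SELECT email, COUNT(*) FROM customers "
        "GROUP BY email HAVING COUNT(*) > 1 ORDER BY email"
    ).fetchall()


def merge_duplicates(conn):
    """Merge each duplicate group, then keep only its earliest row."""
    # Merge step: fill a missing phone from any row that shares the same email.
    conn.execute(
        """
        UPDATE customers
        SET phone = (
            SELECT MAX(d.phone) FROM customers AS d
            WHERE d.email = customers.email
        )
        WHERE phone IS NULL
        """
    )
    # Remove step: keep only the row with the smallest id for each email.
    conn.execute(
        "DELETE FROM customers WHERE id NOT IN "
        "(SELECT MIN(id) FROM customers GROUP BY email)"
    )
    conn.commit()


# Hypothetical sample data: two rows share an email, one of them lacks a phone.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT, phone TEXT)"
)
conn.executemany(
    "INSERT INTO customers (email, phone) VALUES (?, ?)",
    [
        ("ana@example.com", None),
        ("ana@example.com", "555-0100"),
        ("bo@example.com", "555-0111"),
    ],
)

duplicates_before = find_duplicates(conn)
merge_duplicates(conn)
remaining = conn.execute(
    "SELECT id, email, phone FROM customers ORDER BY id"
).fetchall()
```

In practice the matching key and the survivorship rule depend on the dataset. Exact matching on one column is the simplest case, and near-duplicates (different casing, whitespace, or spelling) would usually need normalization or a profiling tool before a query like this finds them.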