
Answer-first summary for fast verification
Answer: Data Flow activity
Option B, the Data Flow activity, is the most suitable choice for handling duplicate data in Azure Synapse Pipelines. A mapping data flow lets you build data transformation logic, including deduplication. There is no transformation literally named 'Deduplicate'; instead, duplicates are typically removed with an Aggregate transformation (group by the key columns and take the first value of each remaining column) or a Window transformation (rank rows with rowNumber() and filter to rank 1). Either pattern ensures that only unique records are loaded into the destination dataset, maintaining data integrity and quality. The other options do not transform data: the Copy Data activity (A) only moves data between stores, Execute Pipeline (C) invokes another pipeline, and Lookup (D) retrieves values for use in later activities.
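As a sketch, the Aggregate-based pattern can be expressed in mapping data flow script roughly as follows (the stream names src and DistinctRows are placeholders; hashing all columns treats fully identical rows as duplicates, and you would swap in your own key columns to dedupe on a subset):

```
src aggregate(
        // group on a hash of every column so identical rows fall in one group
        groupBy(rowKey = sha2(256, columns())),
        // for all other columns, keep the value from the first row in the group
        each(match(true()), $$ = first($$))
    ) ~> DistinctRows
```

In the Synapse Studio UI, the same result is achieved by adding an Aggregate transformation to the data flow, setting the group-by columns, and mapping the remaining columns through first().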
Author: LeetQuiz Editorial Team
You are using Azure Synapse Pipelines to ingest and transform data from multiple sources. You have a source dataset that contains duplicate records, and you need to remove these duplicates before loading the data into the destination dataset. Which of the following Azure Synapse Pipeline activities would you use to handle duplicate data?
A. Copy Data activity
B. Data Flow activity
C. Execute Pipeline activity
D. Lookup activity