
Explanation:
This scenario involves processing streaming sales transaction data in Azure Databricks using Structured Streaming. Append is the recommended output mode for the following reasons:
Matches the business logic: The requirement that "new rows will be added to adjust a sale" directly corresponds to Append mode's behavior
Minimizes duplicates: Append mode writes each new row to the sink once, whereas Complete mode rewrites the entire result table on every trigger
Efficient for streaming: Append mode is designed for continuous addition of new records without modifying existing ones
Works with aggregations: Append mode supports streaming aggregations when a watermark is defined on an event-time column; each aggregated row is then written once, after the watermark passes the end of its window
Given that sales transactions are immutable and adjustments are made by adding new rows rather than updating existing ones, Append mode is the most appropriate output mode. It directly supports the business requirement while minimizing duplicate data and maintaining efficient streaming processing.
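The difference between the three output modes can be illustrated with a small plain-Python toy model. This is a sketch only and does not use the Spark API: the `BATCHES` data and the `simulate` and `rows_written` helpers are hypothetical names invented for illustration. The Append branch models a pass-through of new rows, not watermark-finalized aggregates.

```python
from collections import defaultdict

# Hypothetical micro-batches of (sale_id, amount). The second batch adjusts
# sale 1 by adding a new row instead of updating the original one.
BATCHES = [
    [(1, 100.0), (2, 50.0)],
    [(1, -20.0), (3, 75.0)],
]


def simulate(mode, batches):
    """Return, per trigger, the rows a sink would receive in this toy model."""
    totals = defaultdict(float)  # running total per sale_id
    emitted = []
    for batch in batches:
        changed = []  # sale_ids touched in this trigger, in arrival order
        for sale_id, amount in batch:
            totals[sale_id] += amount
            if sale_id not in changed:
                changed.append(sale_id)
        if mode == "append":
            # Only the rows that arrived in this trigger are written, once.
            emitted.append(list(batch))
        elif mode == "update":
            # Only aggregate rows that changed in this trigger are written.
            emitted.append([(k, totals[k]) for k in changed])
        elif mode == "complete":
            # The whole result table is rewritten on every trigger.
            emitted.append(sorted(totals.items()))
        else:
            raise ValueError(f"unknown mode: {mode}")
    return emitted


def rows_written(mode, batches=BATCHES):
    """Count the total number of rows sent to the sink across all triggers."""
    return sum(len(rows) for rows in simulate(mode, batches))


if __name__ == "__main__":
    for mode in ("append", "update", "complete"):
        print(mode, simulate(mode, BATCHES))
```

In this model, Complete mode sends the unchanged total for sale 2 to the sink again on the second trigger, which is the kind of repeated output the question asks to minimize, while Append mode sends each incoming row exactly once.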
You are designing a streaming data solution using Azure Databricks to process sales transaction data from an online store. The solution has the following requirements:
You need to recommend a Structured Streaming output mode for the processed dataset. The solution must minimize the presence of duplicate data.
What output mode should you recommend?

A. Update
B. Complete
C. Append