
Answer-first summary for fast verification
Answer: Use Azure Databricks' built-in support for handling late data by specifying a watermark and using window operations.
Azure Databricks handles late-arriving data, which is common in real-time processing, through the watermarking support in Spark Structured Streaming. A watermark sets how far behind the latest observed event time a record may arrive and still be included in its event-time window. Records that arrive within that threshold update the window's aggregate; records that arrive later than the threshold may be dropped so that old window state can be cleaned up. Window operations group records by event time rather than arrival time, so together the two accommodate late data without discarding it at a hard cutoff (option A), waiting indefinitely (option B), or tracking timestamps by hand (option D).
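The watermark rule described above can be sketched in plain Python. This is a simplified simulation and not Spark code: the function names, the 5-minute window, and the 10-minute delay are assumptions chosen for illustration, and the watermark here advances per event, whereas Structured Streaming advances it per micro-batch. In PySpark the corresponding calls are DataFrame.withWatermark and pyspark.sql.functions.window.

```python
from collections import defaultdict

WINDOW_SECONDS = 300   # 5-minute tumbling windows (assumed value)
WATERMARK_DELAY = 600  # tolerate events up to 10 minutes late (assumed value)


def window_start(event_time):
    """Return the start of the tumbling window that contains event_time."""
    return event_time - (event_time % WINDOW_SECONDS)


def process(events):
    """Count events per event-time window, applying a watermark.

    events: event times in seconds, listed in arrival order.
    Returns (counts keyed by window start, list of dropped event times).
    """
    counts = defaultdict(int)
    dropped = []
    max_event_time = 0
    for event_time in events:
        # Watermark = latest event time seen so far minus the allowed delay.
        watermark = max_event_time - WATERMARK_DELAY
        start = window_start(event_time)
        if start + WINDOW_SECONDS <= watermark:
            # The window ended before the watermark: the event is too late.
            dropped.append(event_time)
        else:
            # On time, or late but still within the watermark threshold.
            counts[start] += 1
        max_event_time = max(max_event_time, event_time)
    return dict(counts), dropped


# Event 350 arrives late but within the threshold; event 120 arrives too late.
counts, dropped = process([100, 400, 1000, 350, 1300, 120])
print(counts, dropped)  # {0: 1, 300: 2, 900: 1, 1200: 1} [120]
```

The late event at 350 still updates its window (300 to 600) because that window had not yet fallen behind the watermark, while the event at 120 is dropped because its window (0 to 300) had.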
Author: LeetQuiz Editorial Team
You are using Azure Databricks to process streaming data in which some records arrive late. Which strategy would you use to handle the late-arriving data?
A
Set a fixed time window for data processing and discard any data that arrives after the window.
B
Configure the system to wait indefinitely for late-arriving data.
C
Use Azure Databricks' built-in support for handling late data by specifying a watermark and using window operations.
D
Manually track the timestamps of incoming data and filter out any data that is outside the expected time range.