How can you ensure accurate results in a Spark Structured Streaming job that processes time-windowed aggregates when dealing with late-arriving data?
A. Increase the state timeout duration to an arbitrarily high value to account for all possible late data.
B. Manually adjust the system clock to account for data latency before processing each micro-batch.
C. Use the watermark feature to specify a threshold for late data and update window aggregates accordingly.
D. Ignore late data, focusing only on data arriving within the expected time window.
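The correct choice is C. In Spark Structured Streaming, `withWatermark("eventTime", "10 minutes")` tells the engine how long to keep window state open: late records are still merged into their windows as long as they arrive before the watermark (max observed event time minus the threshold), and records older than the watermark are dropped so state can be finalized and cleaned up. Below is a minimal plain-Python sketch of that semantics, not Spark itself; the 10-minute window, 10-minute threshold, and event data are illustrative:

```python
from collections import defaultdict
from datetime import datetime, timedelta

def window_start(ts, size):
    # Align an event timestamp to the start of its tumbling window.
    epoch = datetime(1970, 1, 1)
    secs = (ts - epoch).total_seconds()
    return epoch + timedelta(seconds=secs - secs % size.total_seconds())

def aggregate_with_watermark(events, window=timedelta(minutes=10),
                             threshold=timedelta(minutes=10)):
    """Toy model of withWatermark(...) followed by a windowed count.

    events: iterable of (event_time, key) in arrival order.
    Events older than (max event time seen - threshold) are dropped,
    mirroring how Spark finalizes window state past the watermark.
    """
    counts = defaultdict(int)
    max_event_time = datetime.min
    dropped = []
    for ts, key in events:
        max_event_time = max(max_event_time, ts)
        watermark = max_event_time - threshold
        if ts < watermark:
            dropped.append((ts, key))  # too late: window already finalized
            continue
        counts[(window_start(ts, window), key)] += 1
    return dict(counts), dropped

t = lambda m: datetime(2024, 1, 1, 12, m)
events = [(t(0), "a"), (t(5), "a"), (t(20), "a"),  # in-order arrivals
          (t(12), "a"),   # late but within the 10-minute watermark: counted
          (t(5), "a")]    # older than the watermark (12:10): dropped
counts, dropped = aggregate_with_watermark(events)
```

The equivalent Spark code would be roughly `df.withWatermark("eventTime", "10 minutes").groupBy(window("eventTime", "10 minutes"), "key").count()`; options A and D either retain unbounded state or silently lose valid late data, which is exactly the trade-off the watermark threshold lets you tune.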