
Answer-first summary for fast verification
Answer: Checkpointing and idempotent sinks
Structured Streaming achieves end-to-end fault tolerance through **checkpointing and idempotent sinks**.

- **Checkpointing** persists the progress of the streaming query, including input stream offsets and state, so that after a failure the query restarts from the last checkpoint and no data is lost.
- **Idempotent sinks** can receive the same data multiple times without duplicating results, so the output stays consistent when data is reprocessed after a failure or retry.

Other options are incorrect because:

- **Watermarking** bounds how late data may arrive and when old state can be dropped. It does not trigger micro-batches and does not provide fault tolerance.
- **Write-ahead logging** is used by the engine together with checkpointing to record the offset range of each trigger, but the options that list it (B and D) leave out checkpointing, and B also pairs it with watermarking.
- **Failover to available nodes** is part of fault tolerance but does not by itself ensure data consistency or prevent loss without checkpointing and idempotent sinks.
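To make the interaction between the two mechanisms concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark's API: `Checkpoint`, `IdempotentSink`, `AppendOnlySink`, `run_query`, and `run_with_one_failure` are hypothetical names invented for illustration, and the "source" is just a list whose indices stand in for offsets. The failure is simulated after the sink write but before the checkpoint commit, which is the case where a batch gets replayed on restart.

```python
from dataclasses import dataclass, field


@dataclass
class Checkpoint:
    """Toy stand-in for a durable checkpoint of query progress."""
    next_offset: int = 0    # first source offset not yet committed
    next_batch_id: int = 0  # id of the next micro-batch to run


@dataclass
class IdempotentSink:
    """Keeps one result per batch id, so a replayed batch overwrites itself."""
    batches: dict = field(default_factory=dict)

    def write(self, batch_id, rows):
        self.batches[batch_id] = list(rows)

    def rows(self):
        return [r for b in sorted(self.batches) for r in self.batches[b]]


@dataclass
class AppendOnlySink:
    """Non-idempotent sink, for contrast: every write appends."""
    out: list = field(default_factory=list)

    def write(self, batch_id, rows):
        self.out.extend(rows)

    def rows(self):
        return list(self.out)


def run_query(source, checkpoint, sink, batch_size=2, crash_after_write_of=None):
    """Run micro-batches starting from the checkpointed offset.

    If crash_after_write_of is a batch id, simulate a failure after the sink
    write but before the checkpoint commit for that batch.
    """
    while checkpoint.next_offset < len(source):
        start = checkpoint.next_offset
        rows = source[start:start + batch_size]
        batch_id = checkpoint.next_batch_id
        sink.write(batch_id, rows)
        if batch_id == crash_after_write_of:
            raise RuntimeError("simulated failure before commit")
        # Commit: only now does the checkpoint advance.
        checkpoint.next_offset = start + len(rows)
        checkpoint.next_batch_id = batch_id + 1


def run_with_one_failure(sink):
    source = ["a", "b", "c", "d", "e"]  # replayable: offsets are list indices
    checkpoint = Checkpoint()
    try:
        run_query(source, checkpoint, sink, crash_after_write_of=1)
    except RuntimeError:
        pass  # the query is restarted after the failure
    run_query(source, checkpoint, sink)  # resume from the last checkpoint
    return sink.rows()


print(run_with_one_failure(IdempotentSink()))  # ['a', 'b', 'c', 'd', 'e']
print(run_with_one_failure(AppendOnlySink()))  # ['a', 'b', 'c', 'd', 'c', 'd', 'e']
```

In this sketch the checkpoint alone guarantees that nothing is skipped, but the replayed batch shows up twice in the append-only sink. Only the combination with a sink keyed by batch id yields each row exactly once, which is why option C names both mechanisms.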
Author: LeetQuiz Editorial Team
How does Structured Streaming ensure end-to-end fault tolerance?
A. Checkpointing and watermarking
B. Write-ahead logging and watermarking
C. Checkpointing and idempotent sinks
D. Write-ahead logging and idempotent sinks
E. Stream will fail over to available nodes in the cluster