
In the context of Spark Structured Streaming, you are tasked with designing a solution to handle out-of-order events for a real-time analytics application. The solution must efficiently process events based on their actual occurrence time, manage late-arriving data, and ensure scalability under high data volumes. Considering these requirements, which of the following approaches BEST addresses the challenge of out-of-order events? Choose one option.
A
Implementing processing-time windowing, which processes events based on when they arrive at the system and ignores their event timestamps.
B
Utilizing event-time processing with watermarks to handle out-of-order events by processing them according to their timestamps and defining a threshold for late data.
C
Discarding all events that arrive out of order to maintain processing efficiency and simplicity.
D
Buffering all incoming events and sorting them by their timestamps before processing, regardless of the delay this introduces.
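
For readers unfamiliar with the mechanism named in option B, the following is a minimal plain-Python sketch of event-time windowing with a watermark. It is a simplified model, not Spark's implementation: the window size, the lateness threshold, the helper names (window_start, process), and the per-event watermark update are all assumptions made for illustration. In Spark Structured Streaming the corresponding calls are withWatermark on the event-time column and window inside groupBy, and the watermark advances per micro-batch rather than per event.

```python
from collections import defaultdict

WINDOW_SECONDS = 60    # tumbling window size (assumed for illustration)
LATENESS_SECONDS = 30  # how far behind the newest event we still accept data (assumed)


def window_start(event_time):
    """Return the start of the tumbling window that contains event_time."""
    return event_time - (event_time % WINDOW_SECONDS)


def process(events):
    """Count events per (event-time window, key), dropping data that is too late.

    events: iterable of (event_time_seconds, key) tuples in ARRIVAL order,
            which may differ from event-time order.
    Returns (counts, dropped).
    """
    counts = defaultdict(int)
    dropped = []
    max_event_time = None  # newest event time observed so far

    for event_time, key in events:
        start = window_start(event_time)
        end = start + WINDOW_SECONDS

        # Watermark = newest event time seen before this event, minus the threshold.
        watermark = None if max_event_time is None else max_event_time - LATENESS_SECONDS

        if watermark is not None and end < watermark:
            # The event's window closed before the watermark: treat it as too late.
            dropped.append((event_time, key))
        else:
            # Out-of-order but within the threshold: still counted in its own window.
            counts[(start, key)] += 1

        if max_event_time is None or event_time > max_event_time:
            max_event_time = event_time

    return dict(counts), dropped


# Arrival order differs from event-time order: 50 arrives after 70, 20 after 200.
arrivals = [(10, "a"), (70, "a"), (50, "a"), (200, "a"), (20, "a")]
counts, dropped = process(arrivals)
print(counts)
print(dropped)
```

In this sketch the event at time 50 arrives after the event at time 70 but is still assigned to its own window, because that window has not yet fallen behind the watermark. The event at time 20 arrives after the event at time 200 and is dropped. The threshold is the trade-off knob: a larger value tolerates more disorder at the cost of keeping window state longer.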