
In Spark Structured Streaming, you need to implement a stream-static join that enriches streaming data with static reference data. The solution must handle late data efficiently without compromising the accuracy of the join results. Given the need for scalability and for handling out-of-order events, which of the following approaches is the BEST way to implement the stream-static join? Choose one of the four options.
A. Use the 'join' function to join the streaming DataFrame with the static DataFrame without any additional configurations.
B. Use the 'join' function with a watermark to handle late data and specify the threshold for out-of-order events.
C. Use the 'join' function with a stateful aggregation to continuously update the join results, which may increase the computational overhead.
D. Use the 'join' function with a state timeout to handle state expiration, which might lead to incomplete join results for late data.
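
As background for weighing the options, the sketch below models the mechanics of a stream-static inner join in plain Python rather than in Spark itself. It assumes the static side fits in a dictionary and that each micro-batch is joined against it independently, with no state carried between batches, which is how the Spark documentation describes stream-static joins. The names `enrich_batch` and `run_stream` and the sensor data are hypothetical, and the sketch does not model watermarks, stateful aggregations, or state timeouts.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical event shape: (event_time_seconds, key, value)
Event = Tuple[int, str, float]
Enriched = Tuple[int, str, float, str]


def enrich_batch(batch: Iterable[Event], reference: Dict[str, str]) -> List[Enriched]:
    """Inner-join one micro-batch against the static lookup table."""
    return [(t, k, v, reference[k]) for (t, k, v) in batch if k in reference]


def run_stream(batches: Iterable[List[Event]], reference: Dict[str, str]) -> List[Enriched]:
    """Process micro-batches in arrival order, keeping no state between them."""
    out: List[Enriched] = []
    for batch in batches:
        # Each batch is joined on its own; arrival order does not affect matching.
        out.extend(enrich_batch(batch, reference))
    return out


# Static reference data (assumed small enough to hold in memory).
reference = {"sensor-1": "Berlin", "sensor-2": "Oslo"}

# Two micro-batches; the event with time 101 arrives after the event with time 105.
batches = [
    [(100, "sensor-1", 20.5), (105, "sensor-2", 3.0)],
    [(101, "sensor-1", 20.7), (110, "sensor-3", 9.9)],  # sensor-3 has no reference row
]

result = run_stream(batches, reference)
for row in result:
    print(row)
```

In this model the out-of-order event at time 101 is still enriched, because matching depends only on the key and the static table, while the `sensor-3` event is dropped by the inner join because it has no reference row.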