
You manage a real-time data processing pipeline on Google Cloud Dataflow and have configured hopping windows to aggregate incoming data continuously. Some data arrives late but is not being marked as late data, which produces inaccurate aggregation results downstream. You need to ensure that late-arriving data is captured and assigned to the correct window. What should you do?
A
Use watermarks to define the expected data arrival window. Allow late data as it arrives.
B
Change your windowing function to tumbling windows to avoid overlapping window periods.
C
Change your windowing function to session windows to define your windows based on certain activity.
D
Expand your hopping window so that the late data has more time to arrive within the grouping.
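
As background for the terms used in the options, the sketch below simulates how a watermark can separate on-time data from late data when windows overlap. It is plain Python, not the Dataflow or Apache Beam API, and the constants `WINDOW_SIZE`, `HOP`, and `ALLOWED_LATENESS` as well as the functions `hopping_windows` and `classify` are hypothetical names chosen for illustration. It assumes a simplified model in which an element is late for a window once the watermark has reached that window's end, and is dropped once the watermark has also passed the allowed lateness.

```python
# Simplified model of hopping windows, a watermark, and allowed lateness.
# All values are hypothetical and expressed in seconds.
WINDOW_SIZE = 60       # length of each hopping window
HOP = 30               # a new window starts every HOP seconds
ALLOWED_LATENESS = 20  # how long a window still accepts data after it ends


def hopping_windows(event_time):
    """Return every (start, end) window that contains event_time."""
    windows = []
    # Latest window start that is not after the event.
    start = event_time - event_time % HOP
    # Walk backwards while the window still covers the event.
    while start > event_time - WINDOW_SIZE:
        windows.append((start, start + WINDOW_SIZE))
        start -= HOP
    return windows


def classify(event_time, watermark):
    """Label the element as on_time, late, or dropped for each window."""
    labels = {}
    for start, end in hopping_windows(event_time):
        if watermark < end:
            labels[(start, end)] = "on_time"
        elif watermark < end + ALLOWED_LATENESS:
            labels[(start, end)] = "late"
        else:
            labels[(start, end)] = "dropped"
    return labels
```

In this model, an element with event time 65 belongs to the windows (30, 90) and (60, 120). With the watermark at 100 it is still on time for (60, 120) but late for (30, 90), so the same element can have a different status in each overlapping window.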