
You are working on a data pipeline for a logistics company that tracks the location and status of shipments in real time. The pipeline processes data from GPS devices, sensors, and manual updates. How would you handle late-arriving data while keeping tracking accurate?
A
Ignore late-arriving data and process only the data that arrives within the expected time window.
B
Use a batch processing approach to process the data in large batches, allowing for some flexibility in handling late-arriving data.
C
Implement watermarking and allow for a certain degree of data latency to handle late-arriving data and ensure accurate tracking.
D
Disable watermarking and rely on manual intervention to handle late-arriving data.
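
To make the watermarking approach described in option C concrete, here is a minimal sketch in plain Python. It is not tied to any streaming framework. The names `Event`, `WatermarkTracker`, and `allowed_lateness` are hypothetical and chosen for illustration only. The sketch assumes each update carries an event time assigned at the source, and that the watermark is the largest event time seen so far minus the tolerated lateness.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Event:
    shipment_id: str
    event_time: int  # seconds since epoch, assigned at the source (assumption)
    status: str


@dataclass
class WatermarkTracker:
    """Hypothetical tracker: keeps the newest accepted status per shipment."""
    allowed_lateness: int  # seconds of lateness tolerated (assumption)
    max_event_time: int = 0
    latest: Dict[str, Event] = field(default_factory=dict)
    too_late: List[Event] = field(default_factory=list)

    @property
    def watermark(self) -> int:
        # Events older than this are considered too late to accept.
        return self.max_event_time - self.allowed_lateness

    def process(self, event: Event) -> bool:
        """Return True if the event was accepted, False if it was too late."""
        if event.event_time < self.watermark:
            # Keep the event for auditing or reprocessing instead of dropping it.
            self.too_late.append(event)
            return False
        self.max_event_time = max(self.max_event_time, event.event_time)
        current = self.latest.get(event.shipment_id)
        # Only overwrite the stored status with an event that is at least as new.
        if current is None or event.event_time >= current.event_time:
            self.latest[event.shipment_id] = event
        return True


tracker = WatermarkTracker(allowed_lateness=60)
tracker.process(Event("S1", 1000, "in_transit"))
tracker.process(Event("S1", 1100, "out_for_delivery"))
tracker.process(Event("S1", 1050, "in_transit"))  # late, but within 60 s: accepted
tracker.process(Event("S1", 1000, "delayed"))     # behind the watermark: set aside
```

In this sketch an out-of-order update that arrives within the lateness bound is accepted but does not overwrite a newer status, while an update behind the watermark is set aside rather than silently discarded. The value chosen for the lateness bound trades result latency against completeness and would need tuning for real devices.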