
Explanation:
Correct Answer: A
To meet a tight latency SLA (here, under 10 seconds), the primary strategy is to reduce the trigger interval, e.g. .trigger(processingTime='5 seconds'). A record can wait up to one full trigger interval before a micro-batch picks it up, so a shorter interval forces the engine to fire smaller, more frequent micro-batches. Smaller batches finish faster and are less likely to spill to disk, which often happens when a single large batch exceeds executor memory during a shuffle.
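As a minimal sketch of the adjustment described above, assuming PySpark and an already-defined streaming DataFrame: the helper name `start_with_short_trigger`, the console sink, and the checkpoint path are illustrative assumptions and do not come from the question itself.

```python
def start_with_short_trigger(stream_df, checkpoint_path, interval="5 seconds"):
    """Start a streaming query that fires a micro-batch every `interval`.

    `stream_df` is assumed to be a streaming DataFrame, for example one
    returned by spark.readStream...load(). The console sink is used only
    for illustration; a real job would write to its own sink.
    """
    return (
        stream_df.writeStream
        .format("console")                              # illustrative sink (assumption)
        .option("checkpointLocation", checkpoint_path)  # hypothetical path supplied by caller
        .trigger(processingTime=interval)               # shorter interval -> smaller micro-batches
        .start()
    )
```

Calling `start_with_short_trigger(events_df, "/tmp/checkpoints/events")` (both arguments hypothetical) would return the running query handle; the interval is a parameter so it can be tuned against observed batch durations.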
Why other options are incorrect:
Option E: Trigger.Once (or its modern replacement, Trigger.AvailableNow) is designed for batch-style processing. Scheduling a new job every 10 seconds introduces significant driver initialization overhead on every run and is not an appropriate architectural choice for continuous, low-latency workloads.
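The trade-off between a short processing-time trigger and a rescheduled one-shot job can be made concrete with back-of-the-envelope arithmetic. The sketch below is a toy model, not a Spark API: it assumes a record waits at most one interval before being picked up and then pays any startup cost plus the batch duration. All figures are hypothetical.

```python
def worst_case_latency(interval_s, batch_s, startup_s=0.0):
    """Rough upper bound on record latency in seconds (toy model).

    A record that arrives just after a batch starts waits up to
    `interval_s` for the next one, then pays `startup_s` (e.g. driver
    initialization for a freshly scheduled job) plus `batch_s`.
    """
    return interval_s + startup_s + batch_s


SLA_S = 10.0

# Hypothetical figures chosen only for illustration. The batch duration is
# held equal in both cases for simplicity.
continuous = worst_case_latency(interval_s=5.0, batch_s=3.0)
scheduled = worst_case_latency(interval_s=10.0, batch_s=3.0, startup_s=20.0)

print(f"5s trigger, long-running query: {continuous:.0f}s (meets SLA: {continuous <= SLA_S})")
print(f"job rescheduled every 10s:      {scheduled:.0f}s (meets SLA: {scheduled <= SLA_S})")
```

Under these assumed numbers, the startup cost alone pushes the rescheduled job past the SLA, regardless of how quickly the batch itself runs.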
A Structured Streaming job is failing to meet a 10-second latency SLA during peak hours. Which adjustment will most effectively ensure records are processed within the 10-second requirement?
A. Reduce the trigger interval to 5 seconds. This creates smaller, more frequent micro-batches, preventing backlog growth and reducing the risk of data spillage.
B. Extend the trigger interval to 30 seconds. By setting the interval closer to the maximum observed execution time, the engine ensures no records are dropped.
C. Decrease the trigger interval to 5 seconds. This allows idle executors to start processing the next micro-batch while longer tasks from the current batch finish.
D. Increase the number of shuffle partitions for maximum parallelism, as modifying the trigger interval requires the creation of a new checkpoint directory.
E. Utilize the Trigger.Once option and set up a Databricks job to execute the query every 10 seconds to process all backlogged records in a single batch.