
A production Structured Streaming job must process records within a 10-minute SLA. The engineering team aims to minimize cloud storage and compute costs while meeting this requirement. Which configuration change should be implemented?
A. Set the trigger interval to 3 seconds; the default trigger interval consumes too many records per batch, causing disk spills and increased storage costs.
B. Increase the number of shuffle partitions to maximize parallelism, as the trigger interval cannot be modified once the checkpoint directory is established.
C. Set the trigger interval to 10 minutes within a continuous streaming query to minimize the frequency of API calls to the source storage account.
D. Set the trigger interval to 500 milliseconds; a non-zero interval ensures the source is not queried too frequently, reducing overhead.
E. Use the Trigger.Once (or AvailableNow) option and configure a Databricks job to execute the query every 10 minutes.
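
The compute-cost side of the question can be made concrete with a rough calculation. The sketch below compares the cluster hours per day of a query that runs continuously with those of a job that starts on a schedule, processes what is available, and shuts down. The function names and the 3-minute run time are hypothetical inputs chosen for illustration. They are not figures from the question, and real run times depend on cluster startup and data volume.

```python
# Back-of-the-envelope comparison of daily compute hours.
# All inputs are hypothetical; substitute measured values for a real workload.

MINUTES_PER_DAY = 24 * 60


def daily_compute_hours_always_on() -> float:
    """A continuously running streaming query keeps its cluster up all day."""
    return 24.0


def daily_compute_hours_scheduled(interval_minutes: float, run_minutes: float) -> float:
    """Compute hours per day when a job starts once per interval and then stops.

    interval_minutes: time between scheduled job starts
    run_minutes: assumed wall-clock time per run, including cluster startup
    """
    if interval_minutes <= 0 or run_minutes <= 0:
        raise ValueError("interval_minutes and run_minutes must be positive")
    if run_minutes > interval_minutes:
        raise ValueError("each run must finish before the next one is scheduled")
    runs_per_day = MINUTES_PER_DAY / interval_minutes
    return runs_per_day * run_minutes / 60


if __name__ == "__main__":
    always_on = daily_compute_hours_always_on()
    # Assumption: a run every 10 minutes that takes about 3 minutes end to end.
    scheduled = daily_compute_hours_scheduled(interval_minutes=10, run_minutes=3)
    print(f"always on: {always_on:.1f} h/day")
    print(f"scheduled: {scheduled:.1f} h/day")
```

Under these assumed numbers the scheduled job uses well under half the compute hours of the always-on query. The gap narrows as the per-run time approaches the scheduling interval.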