
Answer-first summary for fast verification
Answer: Unlimited Retries, with 1 Maximum Concurrent Run
To keep production Structured Streaming jobs reliable and correct, configure them with **Unlimited Retries** and **1 Maximum Concurrent Run**. Unlimited retries let the job restart the query whenever it fails, and limiting the job to one concurrent run prevents overlapping runs of the same query, which preserves data consistency. Always use a new job cluster with the latest Spark version (or at least version 2.1), since queries started on Spark 2.1 or later remain recoverable after query and Spark version upgrades. Do not set a schedule or a timeout, because streaming queries are designed to run indefinitely. Notifications can be configured to alert you on failure. [Reference](https://docs.databricks.com/structured-streaming/query-recovery.html#configure-structured-streaming-jobs-to-restart-streaming-queries-on-failure)
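The settings described above can be sketched as a job definition. This is a minimal illustration, not an official template: the field names (`max_retries`, `max_concurrent_runs`, `email_notifications`, `new_cluster`) are assumed to follow the Databricks Jobs API 2.1 layout, `-1` is assumed to mean "retry indefinitely", and the job name, notebook path, email address, and cluster values are hypothetical placeholders.

```python
import json


def build_streaming_job_settings(name, notebook_path, alert_email):
    """Build a job-settings dict that follows the recommended retry policy.

    Field names are assumed to match the Databricks Jobs API 2.1 layout.
    """
    return {
        "name": name,
        # Only one run at a time, so a restart never overlaps a live run.
        "max_concurrent_runs": 1,
        # Deliberately no "schedule" and no "timeout_seconds" keys:
        # the streaming query is meant to run indefinitely.
        "email_notifications": {"on_failure": [alert_email]},
        "tasks": [
            {
                "task_key": "streaming_query",
                "notebook_task": {"notebook_path": notebook_path},
                # Assumption: -1 means "retry indefinitely".
                "max_retries": -1,
                # A new job cluster per run; all values are placeholders.
                "new_cluster": {
                    "spark_version": "<latest-runtime-version>",
                    "node_type_id": "<node-type>",
                    "num_workers": 2,
                },
            }
        ],
    }


def follows_recommended_policy(settings):
    """Check unlimited retries, one concurrent run, no schedule, no timeout."""
    tasks = settings.get("tasks", [])
    return (
        settings.get("max_concurrent_runs") == 1
        and "schedule" not in settings
        and "timeout_seconds" not in settings
        and bool(tasks)
        and all(task.get("max_retries") == -1 for task in tasks)
    )


if __name__ == "__main__":
    # Hypothetical job name, notebook path, and alert address.
    settings = build_streaming_job_settings(
        "orders-stream", "/Jobs/orders_stream", "oncall@example.com"
    )
    print(json.dumps(settings, indent=2))
    print(follows_recommended_policy(settings))
```

The resulting dictionary would typically be sent as the body of a job-creation request. The check function is only a convenience for catching a schedule or timeout that slips into the definition later.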
Author: LeetQuiz Editorial Team
What is the recommended retry policy for production Structured Streaming jobs to ensure reliability and correctness?
A. No Retries, with Unlimited Concurrent Runs
B. Unlimited Retries, with 1 Maximum Concurrent Run
C. 1 Retry, with 1 Maximum Concurrent Run
D. No Retries, with 1 Maximum Concurrent Run
E. Unlimited Retries, with Unlimited Concurrent Runs