
Answer-first summary for fast verification
Answer: Integrate Spark Streaming with an external machine learning service that continuously updates its model and thresholds, querying it for each batch of streaming data.
Of the options below, integrating Spark Streaming with an external machine learning service is the most suitable design: the service continuously retrains its model and updates its thresholds, and the streaming job queries it for each micro-batch of data. This keeps the detector adaptive to new data patterns and anomalies in real time, without manual intervention and without pausing the stream for periodic offline retraining.
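As a hedged sketch of the recommended approach (the service URL, `fetch_thresholds`, and the record field names are hypothetical, not part of any real API), the per-batch logic might look like the following: each micro-batch first asks the external service for its latest thresholds, then flags rows that exceed them.

```python
import json
from urllib import request

# Hypothetical endpoint of the external ML service; in this design the
# service retrains its model continuously and serves up-to-date thresholds.
THRESHOLD_SERVICE_URL = "http://ml-service.internal/thresholds"

def fetch_thresholds(url=THRESHOLD_SERVICE_URL):
    """Query the external service for the current per-metric thresholds."""
    with request.urlopen(url) as resp:
        return json.loads(resp.read())

def score_batch(rows, thresholds):
    """Flag rows whose value exceeds the service-provided threshold.

    `rows` is an iterable of dicts like {"metric": "cpu", "value": 0.93};
    this function is intended to run inside foreachBatch on the rows of
    one micro-batch.
    """
    anomalies = []
    for row in rows:
        limit = thresholds.get(row["metric"])
        if limit is not None and row["value"] > limit:
            anomalies.append(row)
    return anomalies

# In Spark Structured Streaming this would be wired up roughly as:
#   stream.writeStream.foreachBatch(
#       lambda df, epoch_id: sink(score_batch(
#           (r.asDict() for r in df.collect()), fetch_thresholds()))
#   ).start()
# so every micro-batch sees thresholds that reflect the service's
# most recent retraining, with no need to pause the stream.
```

Because the thresholds are fetched per batch rather than broadcast once at job start, a retrained model takes effect on the very next micro-batch.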
Author: LeetQuiz Editorial Team
How can you design an anomaly detection system in Spark Structured Streaming that adapts thresholds based on machine learning models, enabling real-time model retraining and threshold adjustment?
A
Use foreachBatch to write streaming data to a Delta Lake table, periodically pausing the stream to retrain the model and update thresholds.
B
Implement a continuous learning paradigm where streaming data is used to retrain models on a separate Spark cluster, updating the detection thresholds dynamically.
C
Batch process streaming data periodically, retrain the model offline, and update broadcast variables with new thresholds.
D
Integrate Spark Streaming with an external machine learning service that continuously updates its model and thresholds, querying it for each batch of streaming data.