How would you architect a real-time data processing pipeline using Databricks and Apache Spark Structured Streaming to efficiently handle fluctuating data volumes with minimal latency?
A
Rely exclusively on Apache Kafka for real-time data buffering and batch-process the data at fixed intervals.
B
Manually adjust the cluster size based on expected data volume increases.
C
Utilize static batch processing intervals to manage data loads predictably.
D
Implement Spark Structured Streaming with dynamic scaling and rate limiting to adjust processing based on incoming data volumes.
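Option D describes the recommended pattern: Structured Streaming handles data continuously in micro-batches, rate limiting caps how much each batch ingests so volume spikes are absorbed gracefully, and cluster autoscaling adds or removes workers as load changes. Below is a minimal PySpark sketch of that pattern; the Kafka broker address, topic name, and storage paths are placeholders, and the autoscaling itself is configured on the Databricks cluster (min/max workers) rather than in code.

```python
from pyspark.sql import SparkSession

# On Databricks a SparkSession already exists; getOrCreate() simply reuses it.
spark = SparkSession.builder.appName("realtime-pipeline").getOrCreate()

# Rate limiting: maxOffsetsPerTrigger caps how many Kafka records each
# micro-batch reads, so a surge in incoming volume is spread across
# several batches instead of overwhelming the current cluster size.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "events")                     # placeholder topic
    .option("maxOffsetsPerTrigger", 10000)             # rate limit per batch
    .load()
    .selectExpr("CAST(value AS STRING) AS payload", "timestamp")
)

# Frequent micro-batch triggers keep end-to-end latency low; the
# checkpoint location lets the stream recover exactly where it left off.
query = (
    events.writeStream
    .format("delta")                                          # Delta Lake sink on Databricks
    .option("checkpointLocation", "/tmp/checkpoints/events")  # placeholder path
    .trigger(processingTime="10 seconds")
    .start("/tmp/tables/events")                              # placeholder output path
)

query.awaitTermination()
```

With cluster autoscaling enabled, the rate limit and the trigger interval become the tuning knobs: tighten `maxOffsetsPerTrigger` to protect small clusters, or raise it once autoscaling has provisioned more workers.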