
In a scenario where you are processing sensor data from autonomous vehicles using Spark Structured Streaming, how would you design your Spark job to handle schema drift and process time series data efficiently? Additionally, describe how you would manage data across partitions and within one partition to ensure optimal performance.
A
Use Spark Structured Streaming to read data, ignore schema drift, and process data only within one partition.
B
Configure Spark Structured Streaming to read data, handle schema drift with schema evolution, and use repartitioning to manage data across partitions.
C
Set up Spark Structured Streaming to read data, ignore partitions, and use a fixed schema without handling changes.
D
Use Spark Structured Streaming to read data, focus only on schema drift, and ignore data partitioning.
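The three concerns the question raises (schema drift, distributing data across partitions, and ordering time series data within one partition) can be illustrated with a small plain-Python sketch. This is a conceptual model only, not Spark code: the function names, the field names (`vehicle_id`, `ts`, `speed`, `lidar_range`), and the sample batch are all hypothetical. In Spark itself, the roughly corresponding DataFrame operations would be `repartition` on a key column and `sortWithinPartitions`, applied under whatever schema-evolution support the chosen source and sink provide.

```python
import zlib
from collections import defaultdict


def evolve_schema(known_fields, record):
    """Return the field list extended with any new fields found in the record."""
    new_fields = [name for name in record if name not in known_fields]
    return known_fields + new_fields


def conform(record, fields):
    """Project a record onto the current schema; missing fields become None."""
    return {name: record.get(name) for name in fields}


def partition_for(vehicle_id, num_partitions):
    """Pick a partition from the key. crc32 is used because it is stable across runs."""
    return zlib.crc32(vehicle_id.encode("utf-8")) % num_partitions


def process_batch(records, known_fields, num_partitions):
    # Step 1 (schema drift): widen the schema instead of rejecting new fields.
    fields = list(known_fields)
    for record in records:
        fields = evolve_schema(fields, record)

    # Step 2 (across partitions): route all readings of one vehicle to one partition.
    partitions = defaultdict(list)
    for record in records:
        row = conform(record, fields)
        partitions[partition_for(row["vehicle_id"], num_partitions)].append(row)

    # Step 3 (within one partition): order each vehicle's readings by event time.
    for rows in partitions.values():
        rows.sort(key=lambda row: (row["vehicle_id"], row["ts"]))

    return fields, dict(partitions)


# Hypothetical micro-batch: the third record introduces a new "lidar_range" field.
batch = [
    {"vehicle_id": "car-1", "ts": 3, "speed": 12.5},
    {"vehicle_id": "car-2", "ts": 1, "speed": 8.0},
    {"vehicle_id": "car-1", "ts": 1, "speed": 10.0, "lidar_range": 42.0},
    {"vehicle_id": "car-1", "ts": 2, "speed": 11.0},
]

fields, partitions = process_batch(
    batch, ["vehicle_id", "ts", "speed"], num_partitions=4
)
```

Keying the partition on the vehicle identifier keeps each vehicle's series together, so the per-partition sort is enough to recover event-time order without a global sort.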