In a Spark Structured Streaming application that uses stateful processing, the state grows indefinitely over time. How can you bound the state size to prevent resource exhaustion?
A
Configure the streaming query to restart periodically, thereby resetting the state store and preventing unbounded growth.
B
Use the state operator to explicitly define state storage level as MEMORY_ONLY_SER, forcing old state data to be serialized and stored on disk.
C
Implement state timeout logic using mapGroupsWithState or flatMapGroupsWithState and specify a timeout duration to purge old state data.
D
Regularly checkpoint the streaming state to an external durable store and manually truncate the state store at intervals.
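
To make the state timeout idea mentioned in option C concrete, here is a minimal sketch in plain Python. It is not Spark code: `TimeoutStateStore`, `KeyState`, and `process_batch` are hypothetical names invented for this illustration, and the model assumes a simple processing-time timeout that is reset whenever a key receives new data. In Spark itself, the corresponding pieces are `GroupStateTimeout.ProcessingTimeTimeout`, `GroupState.setTimeoutDuration`, `GroupState.hasTimedOut`, and `GroupState.remove()`.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class KeyState:
    """Per-key state: a running count plus the time at which it expires."""
    count: int = 0
    expires_at: float = 0.0


@dataclass
class TimeoutStateStore:
    """Hypothetical stand-in for a keyed state store with a timeout."""
    timeout_s: float
    states: Dict[str, KeyState] = field(default_factory=dict)

    def process_batch(self, now: float, events: List[str]) -> List[Tuple[str, int]]:
        """Update state for keys in this batch, then purge timed-out keys.

        Returns the (key, final_count) pairs that were purged.
        """
        for key in events:
            state = self.states.setdefault(key, KeyState())
            state.count += 1
            # New data resets the timer (assumed to mirror setting the
            # timeout duration again on every update).
            state.expires_at = now + self.timeout_s

        # Keys whose timer has run out are emitted once and then dropped,
        # so idle keys stop occupying the store.
        expired = [(k, s.count) for k, s in self.states.items() if s.expires_at <= now]
        for key, _ in expired:
            del self.states[key]
        return expired


store = TimeoutStateStore(timeout_s=10.0)
store.process_batch(now=0.0, events=["a", "b"])
store.process_batch(now=5.0, events=["a", "a"])
purged = store.process_batch(now=12.0, events=[])
print(purged)               # key "b" was idle past its timeout
print(sorted(store.states)) # only the recently active key remains
```

The point of the sketch is that state size tracks the number of recently active keys, not every key ever seen. In Spark, a timed-out group is passed to the user function with `hasTimedOut` set, and that function decides whether to call `remove()`.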