
Answer-first summary for fast verification
Answer: Replace the `.read` line with `.readStream`.
The correct modification is to replace the `.read` line with `.readStream`, because:
- **Structured Streaming in Apache Spark** is designed for processing data streams.
- **`readStream` vs. `read`**: `spark.readStream` returns a `DataStreamReader` that produces a streaming DataFrame, whereas `spark.read` returns a `DataFrameReader` for static (batch) datasets.
- The rest of the code block already configures the `cloudFiles` source format and the schema correctly for streaming data.

With this change, the code processes incoming data as a stream.
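As a minimal sketch of the corrected block, the chain can be wrapped in a small helper. The question does not state the language, so PySpark is assumed here; the function name `streaming_read` and the parameter `data_source` are hypothetical, and the `cloudFiles` format is assumed to be available (it is provided by Databricks Auto Loader rather than open-source Spark).

```python
def streaming_read(spark, schema, data_source):
    """Build a streaming DataFrame from a cloudFiles source.

    `spark` is expected to be an active SparkSession, `schema` the schema
    of the incoming JSON records, and `data_source` the input path.
    """
    return (
        spark
        .readStream                           # was `.read` in the failing block
        .schema(schema)                       # unchanged
        .format("cloudFiles")                 # unchanged
        .option("cloudFiles.format", "json")  # unchanged
        .load(data_source)                    # unchanged
    )
```

On a real cluster, the returned DataFrame should report `isStreaming` as `True`, and no data is processed until a sink is started through `writeStream`.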
Author: LeetQuiz Editorial Team
A data engineer has developed a code block intended to perform a streaming read on a data source, but it's returning an error. The code block is as follows:

    spark
        .read
        .schema(schema)
        .format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load(dataSource)

Which of the following modifications will correctly configure the block to perform a streaming read?
A. Insert a `.stream` line immediately after the `spark` line.
B. Replace the `.read` line with `.readStream`.
C. Add a `.stream` line right after the `.read` line.
D. Place a `.stream` line following the `.load(dataSource)` line.
E. Change the `.format("cloudFiles")` line to `.format("streamingSource")`.