
Answer-first summary for fast verification
Answer: cloudfiles.format, cloudfiles.schemalocation, checkpointlocation, mergeSchema
The correct answer is C: `cloudfiles.format, cloudfiles.schemalocation, checkpointlocation, mergeSchema`. The first blank, `cloudfiles.format`, tells Auto Loader to read the incoming files as CSV. The second, `cloudfiles.schemalocation`, names the directory where Auto Loader stores the inferred schema and tracks subsequent schema changes. On the write side, `checkpointlocation` records the stream's progress so the query is fault tolerant and each file is ingested exactly once, while `mergeSchema` set to `true` lets the target table's schema evolve automatically as new columns appear over time. (The Databricks documentation writes these options as `cloudFiles.format`, `cloudFiles.schemaLocation`, and `checkpointLocation`; Spark option keys are case-insensitive, so the lowercase forms in the answer choices behave the same.)
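For concreteness, here is a sketch of the command with the blanks filled in, using the camelCase option names from the Databricks documentation. It assumes a Databricks (or equivalent Spark) runtime where `spark` is the active SparkSession; the `data_source` path and `table_name` values shown are hypothetical placeholders, as in the question.

```python
# Hypothetical values standing in for the question's data_source / table_name.
data_source = "dbfs:/location/raw/"   # input directory monitored by Auto Loader
table_name = "inventory_bronze"       # target Delta table

(spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "csv")                                # blank 1: input file format
      .option("cloudFiles.schemaLocation", "dbfs:/location/checkpoint/") # blank 2: inferred-schema store
      .load(data_source)
      .writeStream
      .option("checkpointLocation", "dbfs:/location/checkpoint/")        # blank 3: stream progress checkpoint
      .option("mergeSchema", "true")                                     # blank 4: allow schema evolution
      .table(table_name))
```

This is a configuration sketch, not a standalone script; it only runs inside a Spark session with the Databricks Auto Loader (`cloudFiles`) source available.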
Author: LeetQuiz Editorial Team
At the end of the inventory process, a file is uploaded to cloud object storage. You are tasked with building a process that ingests the data incrementally; the schema is expected to change over time, and the ingestion process should handle those changes automatically. Fill in the blanks in the following Auto Loader command so that it executes successfully:
spark.readStream.format("cloudfiles").option("_______", "csv").option("_______", "dbfs:/location/checkpoint/").load(data_source).writeStream.option("_______", "dbfs:/location/checkpoint/").option("_______", "true").table(table_name)
A. format, checkpointlocation, schemalocation, overwrite
B. cloudfiles.format, checkpointlocation, cloudfiles.schemalocation, overwrite
C. cloudfiles.format, cloudfiles.schemalocation, checkpointlocation, mergeSchema
D. cloudfiles.format, cloudfiles.schemalocation, checkpointlocation, overwrite
E. cloudfiles.format, cloudfiles.schemalocation, checkpointlocation, append