
Answer-first summary for fast verification
Answer: cloudfiles.format, cloudfiles.schemalocation, checkpointlocation, mergeSchema
The correct options to fill in the blanks, in order, are `cloudfiles.format`, `cloudfiles.schemalocation`, `checkpointlocation`, and `mergeSchema` (choice C). Here's why:
- `cloudfiles.format`: Specifies the format of the source files, here `csv`.
- `cloudfiles.schemalocation`: Specifies the directory where Auto Loader stores the inferred schema and tracks changes to it. Inference itself is automatic; this location is what lets Auto Loader detect schema changes across runs.
- `checkpointlocation`: Specifies the directory for checkpoint files, enabling recovery of the streaming query state after a failure.
- `mergeSchema`: Instructs the write to merge new columns from incoming data into the existing table schema, so the target table evolves along with the source.
Together, these options let the stream ingest new files incrementally, resume after a failure, and pick up new columns without manual schema changes.
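As an illustration, the completed command could be sketched as below. This is a minimal sketch, not the exam's reference solution: the function name `start_inventory_ingest` and the two dictionary names are hypothetical, the option names use the casing shown in the Databricks documentation (`cloudFiles.format`, `cloudFiles.schemaLocation`, `checkpointLocation`), and it calls `toTable`, the `DataStreamWriter` method in PySpark 3.1 and later, where the question writes `.table(...)`. The `cloudFiles` source is only available on Databricks, so the stream itself will not start on a plain open-source Spark session.

```python
# Reader-side options: file format and where Auto Loader keeps the inferred schema.
# The path is the one used in the question.
READ_OPTIONS = {
    "cloudFiles.format": "csv",
    "cloudFiles.schemaLocation": "dbfs:/location/checkpoint/",
}

# Writer-side options: stream checkpoint directory and schema merging on write.
WRITE_OPTIONS = {
    "checkpointLocation": "dbfs:/location/checkpoint/",
    "mergeSchema": "true",
}


def start_inventory_ingest(spark, data_source, table_name):
    """Start the Auto Loader stream.

    Assumes `spark` is an active SparkSession on a Databricks cluster.
    """
    return (
        spark.readStream
        .format("cloudFiles")      # Auto Loader source
        .options(**READ_OPTIONS)   # blanks 1 and 2
        .load(data_source)
        .writeStream
        .options(**WRITE_OPTIONS)  # blanks 3 and 4
        .toTable(table_name)
    )
```

Keeping the options in dictionaries makes the order of the four blanks easy to check against the answer choices: two on the reader, two on the writer.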
Author: LeetQuiz Editorial Team
You are tasked with building a process to incrementally ingest data from a file uploaded to cloud object storage at the end of an inventory process. The schema of the file is expected to change over time, and the ingestion process should automatically handle these changes. Fill in the blanks in the following Auto Loader command to ensure successful execution:
spark.readStream
.format("cloudfiles")
.option("_______","csv")
.option("_______", 'dbfs:/location/checkpoint/')
.load(data_source)
.writeStream
.option("_______", 'dbfs:/location/checkpoint/')
.option("_______", "true")
.table(table_name)
A. format, checkpointlocation, schemalocation, overwrite
B. cloudfiles.format, checkpointlocation, cloudfiles.schemalocation, overwrite
C. cloudfiles.format, cloudfiles.schemalocation, checkpointlocation, mergeSchema
D. cloudfiles.format, cloudfiles.schemalocation, checkpointlocation, append
E. cloudfiles.format, cloudfiles.schemalocation, checkpointlocation, overwrite