
Answer-first summary for fast verification
Answer: The schema will be evolved but the stream will fail.
Option E is correct. With cloudFiles.schemaEvolutionMode set to "addNewColumns", Auto Loader detects the new column in the incoming data and adds it to the schema, but the stream then fails with an error (such as UnknownFieldException). The schema is updated before the error is thrown, so the stream processes data with the evolved schema once you restart it. The stream does not resume on its own; you must restart it. This behavior ensures you are aware of schema changes and can handle them appropriately. Reference: https://docs.databricks.com/aws/en/ingestion/cloud-object-storage/auto-loader/schema#how-does-auto-loader-schema-evolution-work
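The restart step described above can be sketched as a small retry loop. This is a minimal illustration, not Databricks' own mechanism: `run_with_restarts`, `start_stream`, and `fake_stream` are hypothetical names, and matching on the text "UnknownFieldException" in the error message is an assumption about how the failure surfaces. The fake stream stands in for a real Auto Loader query so the sketch runs without Spark.

```python
from typing import Callable, List


def run_with_restarts(start_stream: Callable[[], None], max_restarts: int = 3) -> int:
    """Run start_stream, restarting it after schema-change failures.

    Returns the number of restarts that were needed.
    """
    restarts = 0
    while True:
        try:
            start_stream()  # assumed to block until the stream stops or fails
            return restarts
        except Exception as err:
            # Assumption: a schema-evolution failure mentions UnknownFieldException.
            is_schema_change = "UnknownFieldException" in str(err)
            if not is_schema_change or restarts >= max_restarts:
                raise  # unrelated error, or too many restarts: surface it
            restarts += 1  # the schema was already evolved, so just start again


# Hypothetical stand-in for a real stream: fails once on a new column, then succeeds.
attempts: List[int] = []


def fake_stream() -> None:
    attempts.append(1)
    if len(attempts) == 1:
        raise RuntimeError("UnknownFieldException: encountered new column 'discount'")


restart_count = run_with_restarts(fake_stream)
```

In practice the restart is often delegated to a job scheduler with automatic retries instead of being hand-written; the loop only makes the fail-then-restart sequence explicit.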
Author: LeetQuiz Editorial Team
When using Auto Loader to ingest JSON files from a cloud location with the following configuration:
spark.readStream.format("cloudFiles") \
.schema(schema) \
.option("cloudFiles.format", "json") \
.option("cloudFiles.schemaEvolutionMode", "addNewColumns") \
.load(source)
What happens if a file with an added column arrives at the source location?
A
The stream will fail and the new column will be added to the _rescued_data column.
B
The schema will evolve and the stream will continue to run.
C
The stream will fail and the schema will not be evolved.
D
The stream will continue and the new column will be ignored.
E
The schema will be evolved but the stream will fail.