
Answer-first summary for fast verification
Answer: The streaming query fails upon receiving the next batch in `stock_prices` only if the `id` column is dropped or renamed in `company_info`.
The correct answer is D. The join relies on the `id` column of `company_info`, so the query fails only if that column is dropped or renamed. The failure surfaces when the next batch arrives in `stock_prices`, not at the moment the schema changes, so A is incorrect. B is incomplete because it omits renaming, and C is too broad because schema changes that leave `id` intact do not break the join. E is incorrect because the schema of `company_info` can be altered while the query is running.
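The reasoning behind D can be sketched with a toy model in plain Python. This is not Spark code and makes no claim about Spark internals. It assumes, as the explanation does, that the static side of the join is resolved each time a micro-batch is processed. The function `left_join_batch` and the sample rows are hypothetical names and data introduced only for illustration.

```python
# Toy model (plain Python, not Spark) of a stream-static left join on a key column.
# Assumption: the static table's schema is looked up once per micro-batch.

def left_join_batch(batch, static_rows, static_columns, key="id"):
    """Left-join one micro-batch against the current snapshot of the static table."""
    # The join key must resolve against the static table's current schema.
    if key not in static_columns:
        raise KeyError(f"join column '{key}' not found in static table")
    lookup = {row[key]: row for row in static_rows}
    joined = []
    for record in batch:
        match = lookup.get(record[key], {})
        # Unmatched rows keep None for the static columns, as in a left join.
        extra = {col: match.get(col) for col in static_columns if col != key}
        joined.append({**record, **extra})
    return joined


batch = [{"id": 1, "price": 10.5}, {"id": 2, "price": 7.0}]

# Original schema: the join resolves.
company_info = [{"id": 1, "name": "Acme"}]
ok = left_join_batch(batch, company_info, ["id", "name"])

# A column is added: the next batch still joins, because `id` is intact.
company_info_v2 = [{"id": 1, "name": "Acme", "sector": "Tools"}]
still_ok = left_join_batch(batch, company_info_v2, ["id", "name", "sector"])

# `id` is renamed to `company_id`: the next batch fails on the join key.
company_info_v3 = [{"company_id": 1, "name": "Acme"}]
try:
    left_join_batch(batch, company_info_v3, ["company_id", "name"])
    failed = False
except KeyError:
    failed = True
```

In this model nothing happens when the schema changes; the error only appears on the next call to `left_join_batch`, which mirrors the "upon receiving the next batch" wording of option D.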
Author: LeetQuiz Editorial Team
A data engineer executes the following query in Databricks: `spark.readStream.format('delta').table('stock_prices').join(spark.table('company_info'), how='left', on='id').writeStream.option('checkpointLocation', '/tmp/share_details').format('delta').table('shares')`. What happens to the streaming query if a data analyst alters the schema of the `company_info` table?
A
The streaming query fails immediately upon schema change in the company_info table.
B
The streaming query fails upon receiving the next batch in stock_prices only if the id column is dropped from company_info.
C
The streaming query fails upon receiving the next batch in stock_prices.
D
The streaming query fails upon receiving the next batch in stock_prices only if the id column is dropped or renamed in company_info.
E
The schema of company_info cannot be altered while the streaming query is active.