
Explanation:
The correct answer is D. The join between stock_prices and company_info is keyed on the id column, so the streaming query fails only if id is dropped or renamed in company_info, and the failure surfaces only when the next batch arrives in stock_prices. A is incorrect because the query does not fail at the moment the schema changes. B is incomplete because it omits renaming, which removes the join key just as dropping does. C is too broad because schema changes that leave id intact do not cause a failure. E is incorrect because the schema of company_info can be altered while the query is running.
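The rule in the explanation can be sketched in plain Python. This is only a simplified model of the reasoning, not Spark or Delta code: the function names and the example column sets (name, sector, company_id) are hypothetical, and the only thing modeled is whether the join key id is still present in company_info when the next batch is processed.

```python
def join_key_resolves(static_columns, join_key="id"):
    """Return True if the join key is still a column of the static table."""
    return join_key in static_columns


def next_batch_outcome(static_columns, join_key="id"):
    """Model the outcome of the next micro-batch as 'ok' or 'fails'."""
    return "ok" if join_key_resolves(static_columns, join_key) else "fails"


# Hypothetical column sets for company_info after different schema changes.
scenarios = {
    "unrelated column added": ["id", "name", "sector"],
    "id dropped": ["name"],
    "id renamed": ["company_id", "name"],
}

for label, columns in scenarios.items():
    print(f"{label}: {next_batch_outcome(columns)}")
```

In this model, only the two scenarios that remove a column named id lead to a failure, which matches option D; a change that leaves id in place does not.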
A data engineer executes the following query in Databricks: spark.readStream.format('delta').table('stock_prices').join(table('company_info'), how = 'left', on='id').writeStream.option('checkpointLocation', '/tmp/share_details').format('delta').table('shares'). What happens to the streaming query if the schema of the company_info table is altered by a data analyst?
A
The streaming query fails immediately upon schema change in the company_info table.
B
The streaming query fails upon receiving the next batch in stock_prices only if the id column is dropped from company_info.
C
The streaming query fails upon receiving the next batch in stock_prices.
D
The streaming query fails upon receiving the next batch in stock_prices only if the id column is dropped or renamed in company_info.
E
The schema of company_info cannot be altered while the streaming query is active.