
A data engineer has joined an existing project and they see the following query in the project repository:
CREATE STREAMING LIVE TABLE loyal_customers AS SELECT customer_id
FROM STREAM(LIVE.customers) WHERE loyalty_level = 'high';
Which of the following describes why the STREAM function is included in the query?
A
The STREAM function is not needed and will cause an error.
B
The table being created is a live table.
C
The customers table is a streaming live table.
D
The customers table is a reference to a Structured Streaming query on a PySpark DataFrame.
E
The data in the customers table has been updated since its last run.
Explanation:
The correct answer is C. In Delta Live Tables SQL, wrapping a source in STREAM() reads that source incrementally as a stream instead of as a static snapshot. STREAM(LIVE.customers) therefore indicates that customers, a table defined in the same pipeline, is a streaming live table and is being consumed as a streaming source. This is required when a streaming live table uses another streaming live table as its source.
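To make the dependency concrete, here is a minimal sketch of how the two tables might sit together in one pipeline. It assumes the upstream customers table is loaded with Auto Loader; the source path and file format are hypothetical and do not come from the question.

```sql
-- Upstream: customers is itself a streaming live table.
-- ASSUMPTION: the path "/data/customers" and the "json" format are illustrative only.
CREATE OR REFRESH STREAMING LIVE TABLE customers
AS SELECT * FROM cloud_files("/data/customers", "json");

-- Downstream: STREAM() reads LIVE.customers incrementally,
-- so each update processes only rows added since the previous run.
CREATE OR REFRESH STREAMING LIVE TABLE loyal_customers
AS SELECT customer_id
FROM STREAM(LIVE.customers)
WHERE loyalty_level = 'high';
```

In this sketch, the LIVE keyword resolves customers within the same pipeline, and STREAM() is what makes the read incremental.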
Key points:
The STREAM() function is used to read from streaming live tables.
The table being created is itself a streaming live table (CREATE STREAMING LIVE TABLE).
The customers table must be a streaming live table for this syntax to work.
Option A is incorrect because the STREAM function is both valid and needed here. Option B is incorrect because, although the table being created is a live table, STREAM() describes how the source is read, so that fact does not explain why it is needed. Option D is incorrect because STREAM() refers to a streaming live table, not to a Structured Streaming query on a PySpark DataFrame. Option E is incorrect because STREAM() has nothing to do with whether the data has been updated since the last run.