Databricks Certified Data Engineer - Associate

A data engineer has joined an existing project and they see the following query in the project repository:

CREATE STREAMING LIVE TABLE loyal_customers AS SELECT customer_id
FROM STREAM(LIVE.customers) WHERE loyalty_level = 'high';

Which of the following describes why the STREAM function is included in the query?

KKeng

Last updated: January 13, 2026 at 09:01

A. The STREAM function is not needed and will cause an error.

B. The table being created is a live table.

C. The customers table is a streaming live table.

D. The customers table is a reference to a Structured Streaming query on a PySpark DataFrame.

E. The data in the customers table has been updated since its last run.

Explanation:

In Delta Live Tables SQL, the STREAM function tells the pipeline to read the source dataset as a stream rather than as a complete snapshot. Writing STREAM(LIVE.customers) therefore indicates that customers is itself a streaming live table and that the query consumes it incrementally as a streaming source. This is required when a streaming live table is defined on top of another streaming live table in the same pipeline.

Key points:

The STREAM() function reads from a streaming live table as a streaming source.
The query creates a new streaming live table (CREATE STREAMING LIVE TABLE).
The source customers table must be a streaming live table for this syntax to work.
This enables incremental processing: the new table processes only the rows that have arrived in the source table since the last update.
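
The key points above can be sketched as a small pipeline. This is a minimal illustration, not the project's actual code: the ingestion path /data/customers/ and the loyal_customer_count table are hypothetical, and the CREATE OR REFRESH form is used in place of the plain CREATE shown in the question. The statements only run inside a Delta Live Tables pipeline.

```sql
-- Upstream: customers is declared as a streaming live table.
-- The path and file format are assumptions made for this sketch.
CREATE OR REFRESH STREAMING LIVE TABLE customers
AS SELECT * FROM cloud_files("/data/customers/", "json");

-- Downstream: STREAM(...) reads customers incrementally, so each
-- pipeline update processes only rows that arrived since the last one.
CREATE OR REFRESH STREAMING LIVE TABLE loyal_customers
AS SELECT customer_id
FROM STREAM(LIVE.customers)
WHERE loyalty_level = 'high';

-- Contrast: a non-streaming live table reads its source without STREAM()
-- and is recomputed from the full contents of the source.
-- loyal_customer_count is a hypothetical name.
CREATE OR REFRESH LIVE TABLE loyal_customer_count
AS SELECT COUNT(*) AS loyal_customers
FROM LIVE.loyal_customers;
```

The difference between the last two statements is the point of the question: STREAM() appears only where the source is consumed as a stream.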

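Option C is correct because STREAM(LIVE.customers) reads customers as a streaming source, which is the case when customers is a streaming live table. Option A is incorrect because the STREAM function is both valid and required here. Option B is incorrect because, while the table being created is a live table, that does not explain why STREAM is applied to the source. Option D is incorrect because the STREAM function refers to a streaming live table in the pipeline, not to a Structured Streaming query on a PySpark DataFrame. Option E is incorrect because the STREAM function has nothing to do with whether the data has been updated since the last run.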
