
Which of the following queries is performing a streaming hop from raw data to a Bronze table?
A

```python
(spark.table("sales")
 .groupBy("store")
 .agg(sum("sales"))
 .writeStream
 .option("checkpointLocation", checkpointPath)
 .outputMode("complete")
 .table("newSales")
)
```
B

```python
(spark.table("sales")
 .filter(col("units") > 0)
 .writeStream
 .option("checkpointLocation", checkpointPath)
 .outputMode("append")
 .table("newSales")
)
```
C

```python
(spark.table("sales")
 .withColumn("avgPrice", col("sales") / col("units"))
 .writeStream
 .option("checkpointLocation", checkpointPath)
 .outputMode("append")
 .table("newSales")
)
```
D

```python
(spark.read.load(rawSalesLocation)
 .write
 .mode("append")
 .table("newSales")
)
```
E

```python
(spark.readStream.load(rawSalesLocation)
 .writeStream
 .option("checkpointLocation", checkpointPath)
 .outputMode("append")
 .table("newSales")
)
```
Explanation:
Option E is the correct answer because it demonstrates a streaming hop from raw data to a Bronze table. Here's why:
- spark.readStream.load(rawSalesLocation) reads streaming data directly from a raw source location.
- .table("newSales") writes the stream to a Bronze table.
- The readStream/writeStream pair indicates streaming processing end to end.
- .option("checkpointLocation", checkpointPath) enables fault tolerance and exactly-once processing.
- .outputMode("append") is the typical output mode for streaming writes to tables.
Options A, B, and C all read from spark.table("sales"), meaning they read from an existing table rather than raw data, and they apply transformations before writing. They therefore represent a transformation hop between tables, not a raw-to-Bronze hop.
Option D: This uses batch processing (spark.read.load and .write) rather than streaming, so it's not a streaming hop.
In the Databricks Lakehouse architecture:
A streaming hop from raw data to Bronze typically involves reading streaming data from sources (like Kafka, Event Hubs, files) and writing it directly to Bronze tables with minimal transformation, which is exactly what Option E demonstrates.
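For reference, a raw-to-Bronze streaming hop on Databricks is commonly written with Auto Loader (the `cloudFiles` source) rather than a plain `readStream.load`. The sketch below is a hypothetical example, not part of the question: `rawSalesLocation`, `checkpointPath`, the JSON file format, and the table name `bronze_sales` are all assumed placeholders, and it requires a Spark session with Delta Lake support (e.g. a Databricks cluster):

```python
# Sketch of a streaming hop from raw files to a Bronze table (assumptions:
# rawSalesLocation and checkpointPath are defined, source files are JSON,
# and this runs on a cluster with Auto Loader / Delta Lake available).
(spark.readStream
 .format("cloudFiles")                           # Auto Loader: incremental file discovery
 .option("cloudFiles.format", "json")            # raw file format (assumption)
 .load(rawSalesLocation)                         # read newly arrived files as a stream
 .writeStream
 .option("checkpointLocation", checkpointPath)   # required for fault tolerance
 .outputMode("append")                           # Bronze tables append raw records as-is
 .trigger(availableNow=True)                     # process all available data, then stop
 .table("bronze_sales")                          # write to a managed Bronze Delta table
)
```

Note the pattern matches Option E: no transformations between the streaming read and the streaming write, with a checkpoint location and append mode on the writer.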