
Ultimate access to all questions.
Deep dive into the quiz with AI chat providers.
We prepare a focused prompt with your quiz and certificate details so each AI can offer a more tailored, in-depth explanation.
Which of the following Structured Streaming queries is performing a hop from a Bronze table to a Silver table?
A
(spark.table("sales")
.groupBy("store")
.agg(sum("sales"))
.writeStream
.option("checkpointLocation", checkpointPath)
.outputMode("complete")
.table("aggregatedSales"))
(spark.table("sales")
.groupBy("store")
.agg(sum("sales"))
.writeStream
.option("checkpointLocation", checkpointPath)
.outputMode("complete")
.table("aggregatedSales"))
B
(spark.table("sales")
.agg(sum("sales"), sum("units"))
.writeStream
.option("checkpointLocation", checkpointPath)
.outputMode("complete")
.table("aggregatedSales"))
(spark.table("sales")
.agg(sum("sales"), sum("units"))
.writeStream
.option("checkpointLocation", checkpointPath)
.outputMode("complete")
.table("aggregatedSales"))
C
(spark.table("sales")
.withColumn("avgPrice", col("sales") / col("units"))
.writeStream
.option("checkpointLocation", checkpointPath)
.outputMode("append")
.table("cleanedSales"))
(spark.table("sales")
.withColumn("avgPrice", col("sales") / col("units"))
.writeStream
.option("checkpointLocation", checkpointPath)
.outputMode("append")
.table("cleanedSales"))
D
(spark.readStream.load(rawSalesLocation)
.writeStream
.option("checkpointLocation", checkpointPath)
.outputMode("append")
.table("uncleanedSales"))
(spark.readStream.load(rawSalesLocation)
.writeStream
.option("checkpointLocation", checkpointPath)
.outputMode("append")
.table("uncleanedSales"))
E
(spark.read.load(rawSalesLocation)
.writeStream
(spark.read.load(rawSalesLocation)
.writeStream
Explanation:
In Databricks Lakehouse architecture, data typically flows through three layers:
Analyzing each option:
Option A & B: These perform aggregation operations (groupBy and agg) which are more typical of Silver-to-Gold transformations (aggregating cleaned data for analytics). They don't show data cleaning or transformation from raw format.
Option C: This query performs a data transformation by computing avgPrice (dividing sales by units), which represents data cleaning and enrichment. The output table name "cleanedSales" clearly indicates it's creating a Silver-level table from potentially raw data. The append output mode suggests incremental processing of cleaned data.
Option D: This reads from a raw location and writes to "uncleanedSales", suggesting it's still in Bronze layer (raw data).
Option E: Incomplete syntax, not a valid query.
Key indicators of Bronze-to-Silver transformation:
Option C demonstrates all these characteristics with the withColumn transformation and the "cleanedSales" output table name.