
Question 31
Which of the following Structured Streaming queries is performing a hop from a Bronze table to a Silver table?
B
(spark.table("sales")
.agg(sum("sales"), sum("units"))
.writeStream
.option("checkpointLocation", checkpointPath)
.outputMode("complete")
.table("aggregatedSales"))
C
(spark.table("sales")
.withColumn("avgPrice", col("sales") / col("units"))
.writeStream
.option("checkpointLocation", checkpointPath)
.outputMode("append")
.table("cleanedSales"))
D
(spark.readStream.load(rawSalesLocation)
.writeStream
.option("checkpointLocation", checkpointPath)
.outputMode("append")
.table("uncleanedSales"))
E
(spark.read.load(rawSalesLocation)
.writeStream
Explanation:
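In the Databricks medallion architecture, Bronze tables hold raw ingested data, Silver tables hold cleaned and enriched data, and Gold tables hold aggregated data.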
Analysis of each option:
Options A and B: These perform aggregations (groupBy and agg), which are typically associated with creating Gold tables from Silver tables, not with a Bronze to Silver transformation.
Option C: This query performs data cleaning/enrichment by calculating avgPrice using withColumn(col("sales") / col("units")) and writes to a table named cleanedSales. This represents the typical Bronze to Silver transformation where raw data is cleaned and enriched.
Option D: This writes to a table named uncleanedSales, indicating it's still in the Bronze stage (raw data).
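Option E: This query starts from spark.read, which is a batch read, and is cut off after writeStream. Because writeStream can only be called on a streaming DataFrame, it does not represent a valid streaming transformation.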
The key indicators that Option C represents a Bronze to Silver hop are:
A transformation step (the withColumn operation)
A target table named cleanedSales (suggesting cleaned data)
The append output mode (appropriate for incremental data processing)
Therefore, Option C correctly represents moving from Bronze (raw data) to Silver (cleaned, processed data).
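
The difference between the row-level enrichment in Option C and the aggregation in Option B can be illustrated with a small plain-Python stand-in. This is only a sketch and not Spark code: the sample rows and the helper names to_silver and to_gold are hypothetical, chosen to mirror the withColumn and agg calls in the options.

```python
# Plain-Python stand-in for the medallion layers (illustrative only, not Spark).
# The sample rows and helper names are hypothetical.

bronze_sales = [
    {"sales": 100.0, "units": 4},
    {"sales": 90.0, "units": 3},
    {"sales": 50.0, "units": 2},
]


def to_silver(row):
    """Row-level enrichment, analogous to withColumn("avgPrice", sales / units)."""
    return {**row, "avgPrice": row["sales"] / row["units"]}


def to_gold(rows):
    """Whole-table aggregation, analogous to agg(sum("sales"), sum("units"))."""
    return {
        "sum_sales": sum(r["sales"] for r in rows),
        "sum_units": sum(r["units"] for r in rows),
    }


silver_sales = [to_silver(r) for r in bronze_sales]
gold_sales = to_gold(silver_sales)

# A new Bronze row produces exactly one new Silver row, so appending is enough.
new_row = {"sales": 60.0, "units": 3}
silver_sales.append(to_silver(new_row))

# The same new row changes the single aggregated result, which has to be
# recomputed and rewritten as a whole.
gold_after = to_gold(silver_sales)
```

Each new input row adds one independent output row in the Silver step, which is why append mode fits it, while the aggregated result changes as a whole, which is why Option B writes in complete mode.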