
Explanation:
In the Databricks Medallion Architecture, a hop from Silver to Gold typically involves aggregation: detailed, cleaned records are summarized into business-level metrics.
Let's analyze each option:
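Option A: Reads from rawSalesLocation (raw source files, not a Silver table) and writes to newSales with no transformation - this looks like ingestion of raw data into a Bronze table, not a Silver to Gold hop.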
Option B: Uses spark.read.load() instead of spark.readStream.load() - this is a batch read, not a streaming one, and the source is again the raw location rather than a Silver table.
Option C: Reads from the sales table (Silver), adds a row-level derived column (avgPrice = sales / units), and writes to newSales - no aggregation takes place, so this is a Silver to Silver transformation.
Option D: Reads from the sales table (Silver), filters out rows where units is not greater than 0, and writes to newSales - this is a Silver to Silver transformation with filtering.
Option E: Reads from the sales table (Silver), aggregates it (groupBy("store") and agg(sum("sales"))), and writes to newSales - this is a Silver to Gold transformation, because aggregation is characteristic of Gold table creation.
Key indicators of a Silver to Gold hop:
groupBy("store") with agg(sum("sales")), which collapses detailed rows into one summary row per store
outputMode("complete"), which is common for aggregated streaming results
The correct answer is E because it demonstrates the transformation from detailed Silver data to aggregated Gold data through grouping and aggregation operations.
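To make the role of outputMode("complete") more concrete, the following is a minimal plain-Python sketch, not Spark code, of how a per-store running sum behaves when the full result table is rewritten after every micro-batch. The function name run_complete_mode and the sample rows are hypothetical and exist only for illustration.

```python
from collections import defaultdict


def run_complete_mode(micro_batches):
    """Simulate a per-store running sum emitted in 'complete' style.

    This is an illustration in plain Python, not the Spark API.
    """
    totals = defaultdict(float)  # aggregation state kept across micro-batches
    emitted = []                 # one full result table per micro-batch
    for batch in micro_batches:
        for row in batch:
            totals[row["store"]] += row["sales"]
        # Complete mode: rewrite the whole aggregated table,
        # not only the rows that arrived in this micro-batch.
        emitted.append(dict(sorted(totals.items())))
    return emitted


# Hypothetical Silver rows arriving in two micro-batches.
silver_batches = [
    [{"store": "A", "sales": 10.0}, {"store": "B", "sales": 5.0}],
    [{"store": "A", "sales": 2.5}],
]

gold_snapshots = run_complete_mode(silver_batches)
for snapshot in gold_snapshots:
    print(snapshot)
# {'A': 10.0, 'B': 5.0}
# {'A': 12.5, 'B': 5.0}
```

Store A's total changes in the second micro-batch, so a previously written row has to be updated rather than appended. This is why, in Structured Streaming, an aggregation without a watermark cannot use append mode, and why Option E pairs its groupBy with outputMode("complete").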
Which of the following Structured Streaming queries is performing a hop from a Silver table to a Gold table?
A
(spark.readStream.load(rawSalesLocation) .writeStream .option("checkpointLocation", checkpointPath) .outputMode("append") .table("newSales"))
B
(spark.read.load(rawSalesLocation) .writeStream .option("checkpointLocation", checkpointPath) .outputMode("append") .table("newSales"))
C
(spark.table("sales") .withColumn("avgPrice", col("sales") / col("units")) .writeStream .option("checkpointLocation", checkpointPath) .outputMode("append") .table("newSales"))
D
(spark.table("sales") .filter(col("units") > 0) .writeStream .option("checkpointLocation", checkpointPath) .outputMode("append") .table("newSales") )
E
(spark.table("sales") .groupBy("store") .agg(sum("sales")) .writeStream .option("checkpointLocation", checkpointPath) .outputMode("complete") .table("newSales") )