
Which of the following queries is performing a streaming hop from raw data to a Bronze table?
A
(spark.table("sales")
.groupBy("store")
.agg(sum("sales"))
.writeStream
.option("checkpointLocation", checkpointPath)
.outputMode("complete")
.table("newSales")
)
B
(spark.table("sales")
.filter(col("units") > 0)
.writeStream
.option("checkpointLocation", checkpointPath)
.outputMode("append")
.table("newSales")
)
C
(spark.table("sales")
.withColumn("avgPrice", col("sales") / col("units"))
.writeStream
.option("checkpointLocation", checkpointPath)
.outputMode("append")
.table("newSales")
)
D
(spark.read.load(rawSalesLocation)
.write
.mode("append")
.table("newSales")
)
E
(spark.readStream.load(rawSalesLocation)
.writeStream
.option("checkpointLocation", checkpointPath)
.outputMode("append")
.table("newSales")
)
Explanation:
Option E is the correct answer because it is the only query that performs a streaming hop from raw data to a Bronze table. Here's why:

- spark.readStream.load(rawSalesLocation) reads streaming data directly from the raw files.
- .table("newSales") writes the stream to a Delta table (the Bronze layer).
- Using readStream and writeStream together enables continuous, incremental ingestion.
- .option("checkpointLocation", checkpointPath) provides fault tolerance by tracking streaming progress.
- .outputMode("append") is the appropriate mode for ingesting new records as they arrive.

Options A, B, and C read from an existing table (spark.table("sales")) rather than from raw data, so they are not performing the initial raw-to-Bronze ingestion. Option D reads from the raw location but uses a batch read (spark.read.load) instead of a streaming read, so it is not a streaming hop.

In the Databricks Medallion Architecture, the Bronze layer holds raw ingested data, the Silver layer holds cleaned and validated data, and the Gold layer holds aggregated, business-level data. Option E correctly implements the first step of this pattern by streaming data from raw files directly into a Bronze table.
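For reference, a minimal runnable sketch of an option-E-style Bronze ingestion. The source path, checkpoint path, file format, and schema below are all hypothetical placeholders (streaming file reads require an explicit schema); the sketch uses the open-source PySpark method toTable, which corresponds to the .table(...) call shown in the exam snippets:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("bronze-ingest").getOrCreate()

# Hypothetical locations for this sketch.
rawSalesLocation = "/mnt/raw/sales"
checkpointPath = "/mnt/checkpoints/newSales"

# Streaming hop: read raw files as a stream and append them to the
# Bronze table. Streaming file sources need an explicit schema.
raw_stream = (spark.readStream
    .format("json")                                     # assumed raw format
    .schema("store STRING, sales DOUBLE, units LONG")   # assumed schema
    .load(rawSalesLocation))

query = (raw_stream.writeStream
    .option("checkpointLocation", checkpointPath)  # fault tolerance
    .outputMode("append")                          # write only new rows
    .toTable("newSales"))                          # Bronze Delta table
```

Note that no transformations are applied between the read and the write: the Bronze hop lands raw data as-is, leaving filtering and aggregation to later Silver and Gold hops.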