The data engineering team is working with a large Delta Lake table named 'user_posts', partitioned by the 'year' column. This table serves as a streaming source for a job. The streaming query is partially shown below, with a blank to fill in:
.table("user_posts")
________________
.groupBy("post_category", "post_date")
.agg(
count("post_id").alias("posts_count"),
sum("likes").alias("total_likes")
)
.writeStream
.option("checkpointLocation", "dbfs:/path/checkpoint")
.table("posts_stats")
The team aims to delete data from the previous 2 years without violating the append-only requirement of streaming sources. Which option correctly fills the blank to ensure the table remains streamable after partition deletion?