© 2025 LeetQuiz All rights reserved.
Databricks Certified Data Engineer - Professional




A data engineer needs to correlate advertisement impressions with user clicks by joining two streaming DataFrames. The impressions stream has a 10-minute watermark set on "event_time". The current implementation is:

impressions \
  .groupBy(
    window("event_time", "5 minutes"),
    "id") \
  .count() \
  .withWatermark("event_time", "2 hours") \
  .join(clicks, expr("clickAdId = impressionAdId"), "inner")

The query performance is degrading significantly. What solution would improve its performance?
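
For background, one pattern described in the Spark Structured Streaming guide for stream-stream joins is to define a watermark on both inputs and add an event-time range constraint to the join condition, so the engine can tell when buffered rows are too old to match and can drop them from state. The following is a minimal sketch of that pattern, not a statement of the intended answer. The column names impressionTime and clickTime, the 20-minute click watermark, the 1-hour maximum click delay, and both helper function names are assumptions made for illustration; only clickAdId, impressionAdId, and the 10-minute impressions watermark come from the question.

```python
def build_join_condition(max_click_delay: str = "1 hour") -> str:
    """Build a SQL join condition with an event-time range constraint.

    Column names impressionTime and clickTime are assumed for illustration.
    """
    return (
        "clickAdId = impressionAdId AND "
        "clickTime >= impressionTime AND "
        f"clickTime <= impressionTime + interval {max_click_delay}"
    )


def join_impressions_with_clicks(impressions, clicks):
    """Join two streaming DataFrames with watermarks on both sides (sketch)."""
    # Imported lazily so build_join_condition can be used without Spark installed.
    from pyspark.sql.functions import expr

    # Watermarks tell Spark how late events may arrive on each input.
    # The 20-minute value for clicks is an assumed figure.
    impressions_wm = impressions.withWatermark("impressionTime", "10 minutes")
    clicks_wm = clicks.withWatermark("clickTime", "20 minutes")

    # The time-range constraint bounds how long each side must be buffered.
    return impressions_wm.join(clicks_wm, expr(build_join_condition()), "inner")
```

With only an equality condition, an inner stream-stream join has to keep every past row from both inputs, because any future row could still match; the watermarks plus the time range give Spark a point after which old state can be discarded.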
