Databricks Certified Data Engineer - Associate




In a data engineering project on Azure Databricks, you are working with a DataFrame named 'df' that contains transaction data in the columns 'transaction_id', 'product_id', and 'amount'. You need to analyze transaction volume, data quality, and product diversity by writing a single Spark SQL query that calculates: (1) the total number of transactions, (2) the number of transactions with a NULL 'amount' (indicating missing data), and (3) the number of unique 'product_id' values. Which of the following queries correctly accomplishes all three tasks? Choose the best option from the four provided.
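As a rough illustration of the three aggregates the question asks for, here is a minimal sketch of one query shape that would produce them. It is not one of the four options, and it makes several assumptions: the view name `transactions`, the column types, and the sample rows are all hypothetical, and Python's built-in `sqlite3` stands in for Spark so the example runs without a cluster. The aggregate expressions are standard SQL, so the same query text should behave the same way in Spark SQL once the DataFrame is registered with `df.createOrReplaceTempView("transactions")` and passed to `spark.sql(...)`.

```python
import sqlite3

# Standard SQL aggregates. The view/table name "transactions" is an assumption;
# in Databricks it would come from df.createOrReplaceTempView("transactions").
SUMMARY_QUERY = """
SELECT
    COUNT(*)                                   AS total_transactions,
    COUNT(CASE WHEN amount IS NULL THEN 1 END) AS null_amount_transactions,
    COUNT(DISTINCT product_id)                 AS unique_products
FROM transactions
"""


def summarize(rows):
    """Load (transaction_id, product_id, amount) rows into an in-memory
    SQLite table (a stand-in for the Spark view) and run the summary query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE transactions "
            "(transaction_id INTEGER, product_id TEXT, amount REAL)"
        )
        conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)", rows)
        return conn.execute(SUMMARY_QUERY).fetchone()
    finally:
        conn.close()


# Hypothetical sample data: four transactions, two with a missing amount,
# three distinct products.
sample_rows = [
    (1, "A", 10.0),
    (2, "A", None),
    (3, "B", 5.5),
    (4, "C", None),
]

result = summarize(sample_rows)
print(result)  # (4, 2, 3)
```

The distinction the question turns on is that `COUNT(*)` counts every row, while `COUNT(expr)` skips rows where the expression is NULL. The `CASE` expression yields 1 only for rows with a NULL amount, so counting it gives the number of missing values, and `COUNT(DISTINCT product_id)` counts each product once.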
