Databricks Certified Data Analyst - Associate

Get started today

Ultimate access to all questions.

Explanation:

The correct answer is B.

Explanation

The result set shows:

All detailed combinations of group_1 and group_2 (e.g., A+Y, A+Z, B+Y, B+Z).
Subtotals per group_1 (rows where group_2 is null: A null = 50, B null = 50).
No subtotals per group_2 (there is no row like null + Y or null + Z as a subtotal).
A grand total (both group_1 and group_2 are null = 100).

This pattern is exactly what GROUP BY group_1, group_2 WITH ROLLUP produces in Databricks/Spark SQL.

How ROLLUP works (hierarchical subtotals)

ROLLUP on (col1, col2) generates these grouping sets automatically:

(group_1, group_2) → detailed rows
(group_1) → subtotals by group_1 (sets group_2 = null)
() → grand total (both columns null)

It follows the left-to-right hierarchy of the columns listed. It does not create the cross-subtotal (group_2) alone (i.e., no independent group_2 subtotals with group_1 = null).

Why the other options are incorrect

A: INCLUDING NULL is not valid Databricks SQL syntax for GROUP BY.
C: Plain GROUP BY group_1, group_2 returns only the detailed combinations. It produces no subtotal or grand total rows (no null rows at all).
D: Invalid/non-standard syntax. Grouping by the tuple (group_1, group_2) again does not create subtotals or grand totals.
E: WITH CUBE generates all possible combinations (full cross-tab):
- (group_1, group_2)
- (group_1)
- (group_2) ← extra rows (null + Y, null + Z)
- ()
The provided result is missing the (group_2) subtotal rows, so it cannot be CUBE.

Quick reference for the exam

Operator	Grouping sets generated for (col1, col2)	When to use
Plain GROUP BY	Only (col1, col2)	Basic aggregation
WITH ROLLUP	(col1,col2) + (col1) + ()	Hierarchical subtotals
WITH CUBE	(col1,col2) + (col1) + (col2) + ()	All combinations / cross-tabs
GROUPING SETS	Whatever you explicitly list	Full control

The table in the question has zero original NULLs, so all null values in the output are super-aggregate markers created by ROLLUP.

Explanation:

The correct answer is B.

Explanation

The result set shows:

All detailed combinations of group_1 and group_2 (e.g., A+Y, A+Z, B+Y, B+Z).
Subtotals per group_1 (rows where group_2 is null: A null = 50, B null = 50).
No subtotals per group_2 (there is no row like null + Y or null + Z as a subtotal).
A grand total (both group_1 and group_2 are null = 100).

This pattern is exactly what GROUP BY group_1, group_2 WITH ROLLUP produces in Databricks/Spark SQL.

How ROLLUP works (hierarchical subtotals)

ROLLUP on (col1, col2) generates these grouping sets automatically:

(group_1, group_2) → detailed rows
(group_1) → subtotals by group_1 (sets group_2 = null)
() → grand total (both columns null)

It follows the left-to-right hierarchy of the columns listed. It does not create the cross-subtotal (group_2) alone (i.e., no independent group_2 subtotals with group_1 = null).

Why the other options are incorrect

A: INCLUDING NULL is not valid Databricks SQL syntax for GROUP BY.
C: Plain GROUP BY group_1, group_2 returns only the detailed combinations. It produces no subtotal or grand total rows (no null rows at all).
D: Invalid/non-standard syntax. Grouping by the tuple (group_1, group_2) again does not create subtotals or grand totals.
E: WITH CUBE generates all possible combinations (full cross-tab):
- (group_1, group_2)
- (group_1)
- (group_2) ← extra rows (null + Y, null + Z)
- ()
The provided result is missing the (group_2) subtotal rows, so it cannot be CUBE.

Quick reference for the exam

Operator	Grouping sets generated for (col1, col2)	When to use
Plain GROUP BY	Only (col1, col2)	Basic aggregation
WITH ROLLUP	(col1,col2) + (col1) + ()	Hierarchical subtotals
WITH CUBE	(col1,col2) + (col1) + (col2) + ()	All combinations / cross-tabs
GROUPING SETS	Whatever you explicitly list	Full control

The table in the question has zero original NULLs, so all null values in the output are super-aggregate markers created by ROLLUP.

Comments (1)

Suneetha KakiMar 24, 2026 2:39 AM

Unable to view queris and options C& D

LeetQuiz .Mar 24, 2026 12:06 PM

Hi Suneetha, Thanks for your comment. Originally the question options are image-based, on the leetquiz, the option is text-based. This why the options are blank. We've manually convert the image to the text to update the options correctly. Thanks for your patience and pointing out.

A data analyst is processing a complex aggregation on a table with zero null values and their query returns the following result:

group_1 group_2 sum
null null 100
null Y 70
null Z 30
A null 50
A Y 30
A Z 20
B null 50
B Y 40
B Z 10

Which of the following queries did the analyst run to obtain the above result?

group_1	group_2	sum
null	null	100
null	Y	70
null	Z	30
A	null	50
A	Y	30
A	Z	20
B	null	50
B	Y	40
B	Z	10

Exam-Like

Last updated: June 21, 2026 at 14:02

-2

SELECT 
    group_1,
    group_2,
    count(values) AS count
FROM my_table
GROUP BY group_1, group_2 INCLUDING NULL;

SELECT 
    group_1,
    group_2,
    count(values) AS count
FROM my_table
GROUP BY group_1, group_2 INCLUDING NULL;

31.2%

SELECT 
    group_1,
    group_2,
    count(values) AS count
FROM my_table
GROUP BY group_1, group_2 WITH ROLLUP;

SELECT 
    group_1,
    group_2,
    count(values) AS count
FROM my_table
GROUP BY group_1, group_2 WITH ROLLUP;

35.5%

SELECT 
    group_1,
    group_2,
    count(values) AS count
FROM my_table
GROUP BY group_1, group_2;

SELECT 
    group_1,
    group_2,
    count(values) AS count
FROM my_table
GROUP BY group_1, group_2;

Databricks Certified Data Analyst - Associate

Get started today

Explanation

How ROLLUP works (hierarchical subtotals)

Why the other options are incorrect

Quick reference for the exam

Explanation

How ROLLUP works (hierarchical subtotals)

Why the other options are incorrect

Quick reference for the exam

Comments (1)

Get started today

Comments (1)

A data analyst is processing a complex aggregation on a table with zero null values and their query returns the following result: group_1group_2sumnullnull100nullY70nullZ30Anull50AY30AZ20Bnull50BY40BZ10 Which of the following queries did the analyst run to obtain the above result?