
Answer-first summary for fast verification
Answer: A

```sql
SELECT *,
CASE WHEN age < 18
THEN 'Minor'
ELSE 'Adult'
END AS age_group
FROM df
```
Option A is correct: it uses a CASE WHEN statement, labels everyone under 18 as 'Minor', and labels everyone aged 18 or older as 'Adult'. Option E also implements the logic correctly but is more verbose. Option B returns the same labels but uses IF rather than the CASE WHEN statement the question requires. Option C tests age > 18, so an 18-year-old falls through to 'Minor'. Option D tests age = 18, so only 18-year-olds are labeled 'Adult' and everyone older is labeled 'Minor'.
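The boundary behavior of options A, C, and D can be checked with a small experiment. The sketch below is an illustration only, not part of the original answer: it uses Python's built-in sqlite3 module as a stand-in for Spark SQL, on the assumption that these CASE WHEN expressions evaluate the same way in both engines. The sample rows and the `classify` helper are hypothetical.

```python
import sqlite3

# Hypothetical sample rows; the ages bracket the boundary at 18.
rows = [(1, "Ana", 17, "F"), (2, "Ben", 18, "M"), (3, "Cy", 19, "M")]

# The CASE WHEN expressions from options A, C, and D.
options = {
    "A": "CASE WHEN age < 18 THEN 'Minor' ELSE 'Adult' END",
    "C": "CASE WHEN age > 18 THEN 'Adult' ELSE 'Minor' END",
    "D": "CASE WHEN age = 18 THEN 'Adult' ELSE 'Minor' END",
}


def classify(case_expr):
    """Return {age: age_group} for the sample rows under one CASE expression."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for Spark here
    try:
        conn.execute(
            "CREATE TABLE df (id INTEGER, name TEXT, age INTEGER, gender TEXT)"
        )
        conn.executemany("INSERT INTO df VALUES (?, ?, ?, ?)", rows)
        query = f"SELECT age, {case_expr} AS age_group FROM df ORDER BY age"
        return dict(conn.execute(query).fetchall())
    finally:
        conn.close()


results = {label: classify(expr) for label, expr in options.items()}
for label, groups in results.items():
    print(label, groups)
```

In this sketch only option A labels all three ages as the question specifies. Option C maps age 18 to 'Minor', and option D maps age 19 to 'Minor'.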
Author: LeetQuiz Editorial Team
In a data engineering project, you are working with a DataFrame named 'df' that contains columns id, name, age, and gender. The project requires categorizing individuals into 'Minor' or 'Adult' based on their age, with 'Minor' being those under 18 and 'Adult' being 18 or older. The solution must use Spark SQL and specifically a CASE WHEN statement for this categorization. Additionally, the solution must be optimized for performance and readability. Considering these requirements, which of the following Spark SQL queries correctly implements the age_group categorization using a CASE WHEN statement? (Choose one correct option.)
A
SELECT *,
CASE WHEN age < 18
THEN 'Minor'
ELSE 'Adult'
END AS age_group
FROM df
B
SELECT *,
IF(age < 18, 'Minor', 'Adult') AS age_group
FROM df
C
SELECT *,
CASE WHEN age > 18
THEN 'Adult' ELSE 'Minor'
END AS age_group
FROM df
D
SELECT *,
CASE WHEN age = 18
THEN 'Adult'
ELSE 'Minor'
END AS age_group
FROM df