Databricks Certified Data Engineer - Associate

In a data engineering project, you are tasked with categorizing individuals into specific age groups for a comprehensive analysis. The DataFrame 'df' contains columns 'id', 'name', and 'age'. The age groups are defined as follows: 'Child' (0-12 years), 'Teen' (13-19 years), 'Adult' (20-64 years), and 'Senior' (65+ years). The solution must accurately classify each individual into the correct age group and handle NULL values appropriately to avoid misclassification. Given these requirements, which of the following Spark SQL queries using the CASE/WHEN statement correctly assigns a new column 'age_group' based on the specified age ranges and ensures NULL values are handled correctly? Choose the two best options.
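
The requirements above can be sketched as a single CASE expression. This is a minimal illustration and not one of the answer options. It makes three assumptions: the table name `people` is a hypothetical stand-in for the DataFrame 'df', the label 'Unknown' for NULL ages is an invented choice (the question does not say what label NULLs should receive), and Python's built-in `sqlite3` is used in place of Spark so the snippet runs without a cluster. The CASE expression uses only standard SQL: a comparison against NULL is never true, so without an explicit `IS NULL` branch or an `ELSE`, a NULL age would match no range.

```python
import sqlite3

# Hypothetical table "people" stands in for the DataFrame 'df'.
# 'Unknown' is an assumed label for NULL or out-of-range ages.
QUERY = """
SELECT id, name, age,
       CASE
           WHEN age IS NULL THEN 'Unknown'          -- handle NULL explicitly, first
           WHEN age BETWEEN 0 AND 12 THEN 'Child'   -- BETWEEN is inclusive on both ends
           WHEN age BETWEEN 13 AND 19 THEN 'Teen'
           WHEN age BETWEEN 20 AND 64 THEN 'Adult'
           WHEN age >= 65 THEN 'Senior'
           ELSE 'Unknown'                           -- e.g. negative ages
       END AS age_group
FROM people
ORDER BY id
"""


def classify(rows):
    """Load (id, name, age) rows into an in-memory table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (id INTEGER, name TEXT, age INTEGER)")
    conn.executemany("INSERT INTO people VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


# Sample data chosen to hit each boundary plus a NULL age.
sample = [(1, "Ann", 8), (2, "Bob", 13), (3, "Cy", 64), (4, "Di", 65), (5, "Eve", None)]
groups = [row[3] for row in classify(sample)]
print(groups)
```

In Spark, the same query text could presumably be run with `spark.sql(...)` after registering the DataFrame as a temporary view (for example with `df.createOrReplaceTempView("people")`); that wiring is not shown in the question and is an assumption here.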