Databricks Certified Data Engineer - Associate




In a Databricks environment, you are working with a dataset that includes a 'customer_data' column. This column contains JSON objects with customer information, including an 'address' field that is a nested JSON object with 'city' and 'zipcode' among other details. Your task is to write a Spark SQL query that extracts the 'city' and 'zipcode' from the 'address' field and creates a new table with these extracted values. Consider the following constraints: the solution must be efficient, scalable, and must correctly handle nested JSON structures. Which of the following queries achieves this goal? Choose the best option from the four provided.




Explanation:

Option D is the correct answer because it extracts 'city' and 'zipcode' directly from the nested 'address' object inside the 'customer_data' column with a JSON path lookup. The option is written with JSON_EXTRACT; in Spark SQL the built-in function that performs this lookup is get_json_object, for example get_json_object(customer_data, '$.address.city'). The lookup is evaluated row by row as part of a simple projection, so it is efficient and scales to large datasets. Option A uses dot notation, which works on struct columns but not on JSON stored as a string (assuming that is how 'customer_data' is stored). Option B relies on JSON_TABLE, which is not a Spark SQL built-in function, and so fails to extract the specified fields. Option C applies square bracket notation, which in Spark SQL indexes arrays and maps rather than JSON strings.
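
As a rough illustration of the kind of extraction the question asks for, the sketch below shows one way it might look using Spark SQL's built-in get_json_object function (not the JSON_EXTRACT spelling named in the option). It assumes 'customer_data' is a STRING column holding JSON text; the table names `customers` and `customer_addresses` and the helper names are hypothetical, not taken from the question. The `extract_path` function is only a plain-Python model of a simple `$.a.b` path lookup, included so the behavior can be checked without a cluster.

```python
import json

# Hypothetical names: source table `customers`, target table `customer_addresses`.
# Assumption: `customer_data` is a STRING column containing JSON text.
CREATE_TABLE_SQL = """
CREATE OR REPLACE TABLE customer_addresses AS
SELECT
  get_json_object(customer_data, '$.address.city')    AS city,
  get_json_object(customer_data, '$.address.zipcode') AS zipcode
FROM customers
"""


def create_address_table(spark):
    """Run the statement on an existing SparkSession (e.g. `spark` in a notebook)."""
    return spark.sql(CREATE_TABLE_SQL)


def extract_path(json_text, path):
    """Plain-Python model of a simple '$.a.b' lookup; returns None on any miss."""
    try:
        node = json.loads(json_text)
    except (TypeError, ValueError):
        return None  # NULL or malformed input yields no value
    keys = path[2:] if path.startswith("$.") else path
    for key in keys.split("."):
        if not isinstance(node, dict) or key not in node:
            return None  # missing field anywhere along the path
        node = node[key]
    return node


# Illustrative record, invented for this sketch.
SAMPLE = '{"name": "Ada", "address": {"city": "Austin", "zipcode": "73301"}}'
city = extract_path(SAMPLE, "$.address.city")
zipcode = extract_path(SAMPLE, "$.address.zipcode")
```

In a Databricks notebook, `create_address_table(spark)` would run the statement against the session's catalog. Missing fields come back as NULL in Spark SQL, which the model mirrors by returning None.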
