
Answer-first summary for fast verification
Answer: explode
The `explode` function in Spark SQL transforms array or map columns into multiple rows, which is exactly what is needed when converting nested JSON structures into a tabular format.

**Why EXPLODE is correct:**
- When JSON data contains nested arrays, each element in the array needs to become a separate row in the resulting DataFrame.
- `explode` takes an array (or map) column and creates a new row for each element, duplicating the values of the other columns.
- This is essential for flattening hierarchical JSON structures into relational tables.
- For example, if a JSON object contains an array of items, `explode` produces one row per item while repeating the other fields.

**Why the other options are incorrect:**
- **FILTER**: Selects rows based on a condition; it does not transform nested structures into multiple rows.
- **COALESCE**: Returns the first non-null value from a list of expressions; it handles nulls, not array expansion.
- **EXTRACT**: Extracts a field (such as the year or month) from a date or timestamp; it does not convert nested JSON arrays into rows.

In Azure Databricks Spark jobs, `explode` is the standard function for this transformation when working with nested JSON data that contains arrays.
Author: LeetQuiz Editorial Team