
Answer-first summary for fast verification
Answer: Utilize a Databricks built-in function for string normalization in a Spark SQL query.
### ✅ Correct Answer: **C**

In Databricks (Spark), **built-in functions** such as `lower()` and `regexp_replace()` are:

- **Optimized by Spark's Catalyst optimizer**
- Executed in **native Spark engine code**
- Faster and more scalable than UDFs

Example:

```sql
SELECT regexp_replace(lower(text_column), ' ', '') AS normalized_text
FROM table;
```
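As a quick sanity check of what that SQL expression produces row by row, here is a minimal plain-Python sketch of the same normalization (lowercase, then remove every space character). The `normalize` helper and sample strings are illustrative only; in a real pipeline the work is done by Spark's built-in functions, not Python code:

```python
import re

def normalize(text: str) -> str:
    # Same semantics as regexp_replace(lower(text_column), ' ', ''):
    # lowercase the string, then delete every literal space.
    return re.sub(" ", "", text.lower())

samples = ["Hello World", "  Data Bricks  ", "SPARK SQL"]
print([normalize(s) for s in samples])
# → ['helloworld', 'databricks', 'sparksql']
```

Note that the pattern `' '` matches only the space character; to strip all whitespace (tabs, newlines), the SQL pattern would be `'\\s'` instead.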
Author: LeetQuiz Editorial Team
In a Databricks notebook, a data engineer is tasked with normalizing text data by converting all strings to lowercase and removing spaces. What is the most efficient method to achieve this within Databricks?
A
Manually edit each string in the dataset before loading it into Databricks.
B
Export the dataset, process it with a script outside of Databricks, and then re-import the cleaned data.
C
Utilize a Databricks built-in function for string normalization in a Spark SQL query.
D
Implement a UDF in Databricks that lowercases and trims spaces from strings, applying it via a Spark SQL query.