A data engineer needs to create a PySpark DataFrame transformation that applies a custom function to a column of integers called sales in order to return the square of each value. Which of the following code blocks correctly defines and applies a Spark UDF to accomplish this task?
Explanation:
Correct Answer: A and E. Both are correct and idiomatic. D is incorrect because it passes the return type as a class rather than an instance of that class. B is incorrect because it applies a plain Python function directly to a DataFrame column instead of wrapping it as a UDF. C is incorrect because it passes the column name as a string rather than a column object.