
A data scientist is developing a Databricks notebook that requires extensive feature engineering, such as creating new columns and applying transformations. They aim to encapsulate this feature engineering logic into a reusable component. What is the best approach to accomplish this?
A. Save the intermediate DataFrame and load it in other notebooks as needed.
B. Use the map function to apply the feature engineering transformations.
C. Define a custom PySpark UDF (User-Defined Function) and apply it to the DataFrame.
D. Create a Spark MLlib Transformer class for the feature engineering logic.