
Answer-first summary for fast verification
Answer: Define a Pandas UDF that takes the Pandas DataFrame as input, performs the feature engineering, and returns the transformed DataFrame.
To perform feature engineering on a Pandas DataFrame within a Pandas UDF in Spark, follow these steps: 1) define a Pandas UDF that takes a Pandas DataFrame as input; 2) within the UDF, perform the desired feature engineering using pandas, such as creating new features, transforming existing ones, or combining features in meaningful ways; 3) return the transformed DataFrame as the UDF's output. This lets you leverage the power and flexibility of pandas for feature engineering while Spark handles distribution across partitions. The feature engineering itself can involve any number of operations, such as scaling, normalization, or encoding. Defining these steps inside the UDF ensures they are applied consistently and efficiently across the entire dataset.
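The steps above can be sketched as follows. The feature-engineering logic is a plain pandas function, which Spark can then apply per group via `DataFrame.groupBy(...).applyInPandas(...)`; the column names (`price`, `quantity`, `category`) and the output schema are illustrative assumptions, not part of the question.

```python
import pandas as pd

# Step 1-2: a function that takes a pandas DataFrame, engineers features,
# and returns the transformed DataFrame (step 3).
def engineer_features(pdf: pd.DataFrame) -> pd.DataFrame:
    pdf = pdf.copy()
    # Combine existing features into a new one.
    pdf["revenue"] = pdf["price"] * pdf["quantity"]
    # Min-max scale `price` within the batch (guard against a zero range).
    rng = pdf["price"].max() - pdf["price"].min()
    pdf["price_scaled"] = ((pdf["price"] - pdf["price"].min()) / rng) if rng else 0.0
    # One-hot encode a categorical column.
    pdf = pd.concat([pdf, pd.get_dummies(pdf["category"], prefix="cat")], axis=1)
    return pdf

# In Spark, the same function would be applied to each group of a Spark
# DataFrame; `out_schema` is a hypothetical StructType matching the output:
#   result = spark_df.groupBy("category").applyInPandas(engineer_features,
#                                                       schema=out_schema)
```

Keeping the transformation as an ordinary pandas function makes it easy to unit-test locally before wiring it into Spark.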
Author: LeetQuiz Editorial Team
You are working on a machine learning project in Spark and need to perform feature engineering on a Pandas DataFrame within a Pandas UDF. Explain the steps you would take to implement feature engineering and provide an example of how you would apply it.
A
Define a Pandas UDF that takes the Pandas DataFrame as input, performs the feature engineering, and returns the transformed DataFrame.
B
Use the Pandas UDF to perform feature engineering on each row of the Pandas DataFrame individually.
C
Use the Pandas UDF to perform feature engineering on the entire Pandas DataFrame at once, without considering any row-specific differences.
D
Use the Pandas UDF to perform feature engineering on a subset of the Pandas DataFrame, ignoring the rest of the data.