Databricks Certified Machine Learning - Associate

In the context of feature engineering using Spark ML, explain the process of handling categorical features and encoding them for use in machine learning models. Provide a code snippet demonstrating the encoding of categorical features using Spark ML transformers.

Use the StringIndexer transformer from the pyspark.ml.feature module to convert categorical features to numerical indices.

20.0%

Use the OneHotEncoder transformer from the pyspark.ml.feature module to convert categorical features to a binary vector representation.

Use the VectorAssembler transformer from the pyspark.ml.feature module to combine categorical features into a single vector column.

16.7%

Use the Bucketizer transformer from the pyspark.ml.feature module to map categorical features to a fixed-size vector of values.

6.7%