
Answer-first summary for fast verification
Answer: imputer_model = imputer.fit(doubles_df)
An algorithm which can be fit on a data frame to produce a Transformer. For example, a learning algorithm is an Estimator which trains on a data frame and produces a model. It has a `.fit()` method because it learns (or 'fits') parameters from your data frame. The correct code snippet is: ```python from pyspark.ml.feature import Imputer imputer = Imputer(strategy='median', inputCols=impute_cols, outputCols=impute_cols) imputer_model = imputer.fit(doubles_df) imputed_df = imputer_model.transform(doubles_df) ```
Author: LeetQuiz Editorial Team
Ultimate access to all questions.
No comments yet.
What is the correct line of code to complete the snippet for imputing missing values using the median strategy in PySpark?
A
No need to add anything
B
imputer_model = imputer.fit(doubles_df)
C
imputer_model = doubles_df.fit()
D
Change the imputer constructor to most_frequent
E
Imputer(strategy=“most_frequent“, inputCols=impute_cols, outputCols=impute_cols)