Ultimate access to all questions.
A data scientist is working with a feature set that includes the following schema: customer_id STRING
, spend DOUBLE
, units INTEGER
, happiness_tier STRING
. The customer_id
column serves as the primary key. Each column in the feature set contains some missing values. The scientist plans to replace these missing values by imputing a common value for each feature. Which columns from the feature set are best suited for imputation using the most common value (mode) of the column?