As a global retailer, you are tasked with building ML models that predict customer preferences while protecting sensitive customer data. The data includes four sensitive fields: AGE, IS_EXISTING_CUSTOMER, LATITUDE_LONGITUDE, and SHIRT_SIZE. Given that the data must remain useful for model training and must comply with global data protection regulations, which of the following steps should you take with this data before it is used by the data science team? Choose the two correct options.
Explanation:
Tokenizing the fields (C) and applying differential privacy techniques (A) are the correct approaches because they protect customer data while retaining its utility for model training. Tokenization replaces each real value with a surrogate token (for example, a keyed hash), so the original value is not exposed while equal values still map to the same token. Differential privacy adds calibrated random noise to the data or to results computed from it, which limits what can be learned about any individual record while keeping the overall data distribution approximately intact. Together, these methods support compliance with data protection regulations and keep the data useful for the data science team.
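The two techniques named in the explanation can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not part of the original question: the function names (`tokenize`, `laplace_noise`, `dp_mean`), the hard-coded `SECRET_KEY`, the choice of HMAC-SHA256 as the tokenization scheme, and the Laplace mechanism as the differential privacy technique. A production pipeline would more likely use a managed de-identification service and a vetted differential privacy library.

```python
import hashlib
import hmac
import random

# Hypothetical key for illustration only; a real key belongs in a secret manager.
SECRET_KEY = b"replace-with-a-managed-secret"


def tokenize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a deterministic surrogate token for a sensitive value.

    Equal inputs produce equal tokens, so the field can still be used as a
    categorical feature, but the token cannot be reversed without the key.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # truncated here only to keep tokens short


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_mean(values, lower, upper, epsilon, rng):
    """Differentially private mean via the Laplace mechanism.

    Values are clipped to [lower, upper] so that replacing one record can
    change the mean by at most (upper - lower) / n, which is the sensitivity.
    """
    clipped = [min(max(v, lower), upper) for v in values]
    n = len(clipped)
    sensitivity = (upper - lower) / n
    true_mean = sum(clipped) / n
    return true_mean + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    print(tokenize("L"))  # same token every time for the same key and value
    print(dp_mean([23, 35, 41, 52, 67], lower=18, upper=90, epsilon=1.0, rng=rng))
```

A smaller `epsilon` gives stronger privacy but a noisier result, so the value is a trade-off between protection and model utility rather than a fixed setting.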