
You work for a global clothing retailer and have been assigned the task of ensuring that machine learning models are built in a secure and compliant manner. Customer data used in these models contains sensitive information that needs to be protected. The specific fields identified as sensitive by your data science team are AGE, IS_EXISTING_CUSTOMER, LATITUDE_LONGITUDE, and SHIRT_SIZE. What steps should you take to safeguard this sensitive data before making it available for model training?
A
Tokenize all of the fields using hashed dummy values to replace the real values.
B
Use principal component analysis (PCA) to reduce the four sensitive fields to one PCA vector.
C
Coarsen the data by putting AGE into quantiles and rounding LATITUDE_LONGITUDE into single precision. The other two fields are already as coarse as possible.
D
Remove all sensitive data fields, and ask the data science team to build their models using non-sensitive data.
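
For readers unfamiliar with the coarsening described in option C, the following is a minimal sketch of what it could look like, using only the Python standard library. The sample values, the choice of four quantiles, and the helper names `age_to_quantile` and `to_single_precision` are illustrative assumptions, not part of the question.

```python
import bisect
import statistics
import struct


def age_to_quantile(ages, n=4):
    """Map each age to a quantile bucket index in 0 .. n-1.

    The bucket count n=4 (quartiles) is an assumption for illustration.
    """
    cuts = statistics.quantiles(ages, n=n)  # returns n-1 cut points
    return [bisect.bisect_right(cuts, age) for age in ages]


def to_single_precision(x):
    """Round a Python float (double precision) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Hypothetical sample records, invented for this sketch.
ages = [18, 22, 25, 31, 38, 44, 52, 67]
lat_lon = [(37.774929123456, -122.419415987654)]

age_buckets = age_to_quantile(ages)
coarse_lat_lon = [
    (to_single_precision(lat), to_single_precision(lon)) for lat, lon in lat_lon
]

print(age_buckets)
print(coarse_lat_lon)
```

After this step, training data would carry the bucket index instead of the exact age, and coordinates with fewer significant digits than the raw values.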