
Answer-first summary for fast verification
Answer: All of the above techniques can be applied depending on the specific characteristics of the data and the model's requirements., Log scaling, which is particularly effective for data with non-uniform distributions by compressing the data range through logarithmic transformation.
Each technique mentioned has its own advantages and can be applied based on the specific needs of the model and the data characteristics. Log scaling is beneficial for non-uniform distributions, scaling to a range is suitable for uniformly distributed data, clipping helps in removing extreme outliers, and Z-score normalization provides a way to understand variability. The choice between these techniques should consider the model's performance requirements, data distribution, and compliance with budget and privacy constraints. For a comprehensive understanding, refer to Google's Normalization Guide.
Author: LeetQuiz Editorial Team
Ultimate access to all questions.
No comments yet.
Your industrial company has developed a machine learning model to predict the optimal purchase price for goods, taking into account various factors such as quantity, quality, and product-specific variables. The current model is a linear regression that performs adequately but you are tasked with improving its efficiency and accuracy. The company operates under strict budget constraints and requires the solution to be scalable across different product lines. Additionally, the solution must comply with data privacy regulations. Considering these constraints, which of the following techniques would be most effective for optimizing the model's performance? (Choose two correct options)
A
Log scaling, which is particularly effective for data with non-uniform distributions by compressing the data range through logarithmic transformation.
B
Scaling to a range, which adjusts feature values to a standardized range, such as 0 to 1 or -1 to +1, suitable for uniformly distributed data.
C
Clipping, which removes extreme outliers by ensuring data falls within a specified range, thus improving model stability.
D
Z-score normalization, which measures how many standard deviations a value is from the mean, providing a sense of variability and normalizing the data.
E
All of the above techniques can be applied depending on the specific characteristics of the data and the model's requirements.