
Answer-first summary for fast verification
Answer: Use a Relative Expression Split module to partition the data based on centroid distance.
The question asks for a model development strategy to predict a user's propensity to respond to an advertisement. The scenario states that local market segmentation models will be applied and that feature distributions differ between training and production data. Relative Expression Split fits this case because it partitions rows by a condition on a numeric column (here, centroid distance from the market segmentation), so the training and test sets can reflect the variation seen in production. This matches the community consensus (e.g., juandante's comment with 9 upvotes) and the Microsoft documentation, which describes Relative Expression Split as applying a comparison to a number column. Options C and D (Split Rows) are less suitable because Split Rows divides the data randomly or by a fixed ratio and cannot take a feature such as centroid distance into account. Option B is incorrect because distance travelled to an event is not mentioned in the scenario.
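To make the idea concrete, here is a minimal Python sketch of what a relative-expression style split does: it sends rows that satisfy a numeric comparison to one output and all remaining rows to the other. This is an illustration only, not the Azure module's implementation. The function name `relative_expression_split`, the column name `centroid_distance`, the sample rows, and the threshold of 1.0 are all assumptions made for this example.

```python
import operator

# Comparison operators supported by this sketch (an assumption, not the module's full grammar).
_OPS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
}


def relative_expression_split(rows, column, op, threshold):
    """Split rows into (matching, non-matching) by comparing a numeric column to a threshold."""
    compare = _OPS[op]
    matched, rest = [], []
    for row in rows:
        # Rows that satisfy the condition go to the first output, all others to the second.
        (matched if compare(row[column], threshold) else rest).append(row)
    return matched, rest


# Hypothetical sample data: each user's distance to the centroid of a market segment.
users = [
    {"user_id": 1, "centroid_distance": 0.4},
    {"user_id": 2, "centroid_distance": 2.7},
    {"user_id": 3, "centroid_distance": 0.9},
    {"user_id": 4, "centroid_distance": 1.0},
]

# Equivalent in spirit to a condition such as: centroid_distance < 1.0
near, far = relative_expression_split(users, "centroid_distance", "<", 1.0)
print([r["user_id"] for r in near], [r["user_id"] for r in far])
```

Unlike a random or fixed-ratio split, the partition here depends entirely on the value of the chosen feature, which is why it suits a scenario where the split must follow the segmentation.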
Author: LeetQuiz Editorial Team
You need to implement a model development strategy to predict a user's propensity to respond to an advertisement. Which technique should you use?
A. Use a Relative Expression Split module to partition the data based on centroid distance.
B. Use a Relative Expression Split module to partition the data based on distance travelled to the event.
C. Use a Split Rows module to partition the data based on distance travelled to the event.
D. Use a Split Rows module to partition the data based on centroid distance.