
Answer-first summary for fast verification
Answer: Use a Relative Expression Split module to partition the data based on centroid distance.
The question asks for a model development strategy to predict a user's propensity to respond to an advertisement. The scenario states that local market segmentation models will be applied and that feature distributions differ between training and production data. Relative Expression Split (a splitting mode of the Split Data module) fits this case because it partitions rows by a condition on a numeric column. The data can therefore be divided by a meaningful quantity, here centroid distance, which relates directly to market segments, instead of at random, which helps the training partitions reflect production conditions. This agrees with the community consensus (e.g., juandante's comment with 9 upvotes) and with Microsoft's documentation, which describes Relative Expression Split as applying a condition to a number column.
The other options are less suitable. B splits on 'distance travelled to the event,' which the scenario does not mention. C and D use Split Rows, which divides the data randomly by a specified fraction and not by an expression, so it cannot address the inconsistent distributions.
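The difference between the two splitting approaches can be illustrated with a minimal Python sketch. This is not the Azure module itself. The function names, the `centroid_distance` column, the sample rows, and the threshold of 1.0 are all hypothetical and chosen only for illustration.

```python
import random


def relative_expression_split(rows, column, threshold):
    """Partition rows by a numeric condition (hypothetical helper).

    Rows where row[column] > threshold go to the first output,
    all remaining rows go to the second.
    """
    matched = [r for r in rows if r[column] > threshold]
    rest = [r for r in rows if not (r[column] > threshold)]
    return matched, rest


def split_rows(rows, fraction, seed=0):
    """Random split by fraction (hypothetical helper), ignoring column values."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * fraction)
    return shuffled[:cut], shuffled[cut:]


# Made-up sample data; the distances are illustrative only.
rows = [
    {"user": "u1", "centroid_distance": 0.4},
    {"user": "u2", "centroid_distance": 1.7},
    {"user": "u3", "centroid_distance": 0.9},
    {"user": "u4", "centroid_distance": 2.3},
]

# Expression-based split: membership is decided by the distance value.
far, near = relative_expression_split(rows, "centroid_distance", 1.0)
print([r["user"] for r in far])   # users with distance > 1.0
print([r["user"] for r in near])  # users with distance <= 1.0

# Fraction-based split: membership is decided by chance, not by distance.
first_half, second_half = split_rows(rows, 0.5)
print(len(first_half), len(second_half))
```

In the expression-based split every row lands in a partition determined by its centroid distance, whereas the fraction-based split only controls how many rows land in each partition.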
Author: LeetQuiz Editorial Team
You need to implement a model development strategy to predict a user's propensity to respond to an advertisement. Which technique should you use?
A. Use a Relative Expression Split module to partition the data based on centroid distance.
B. Use a Relative Expression Split module to partition the data based on distance travelled to the event.
C. Use a Split Rows module to partition the data based on distance travelled to the event.
D. Use a Split Rows module to partition the data based on centroid distance.