
Explanation:
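The question asks for a 15% sample without replacement. In PySpark, DataFrame.sample accepts withReplacement, fraction, and seed; when withReplacement is omitted, sampling is done without replacement. Option E, storesDF.sample(fraction = 0.15), sets the fraction to 15% and leaves replacement at that default, so it is correct. Spark does not guarantee that the result holds exactly 15% of the rows; the fraction is an expected proportion. The other options fail for different reasons: A uses a fraction of 0.10; C uses 0.10 and also samples with replacement; B calls sampleBy, which performs stratified sampling and expects a column plus a dictionary of per-stratum fractions rather than a fraction argument; D passes no fraction at all, which sample requires.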
Which of the following code blocks returns a 15% sample of rows from the DataFrame storesDF without replacement?
A
storesDF.sample(fraction = 0.10)
B
storesDF.sampleBy(fraction = 0.15)
C
storesDF.sample(True, fraction = 0.10)
D
storesDF.sample()
E
storesDF.sample(fraction = 0.15)