
You are given a Spark DataFrame 'df' with a numerical column 'score'. Write a code snippet that computes the variance and skewness of the 'score' column using Spark SQL functions, and explain the steps involved.
A
from pyspark.sql.functions import variance, skewness
result = df.select(variance('score'), skewness('score'))
result.show()
B
result = df.selectExpr('VARIANCE(score) AS variance', 'SKEWNESS(score) AS skewness')
result.show()
C
result = df.select('score').summary('variance', 'skewness')
result.show()
D
result = df.describe()
print(result['score']['variance'], result['score']['skewness'])
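To reason about the options, it helps to know what the Spark SQL aggregates actually compute: `variance()` is an alias for `var_samp()` (sample variance, with an n - 1 denominator), and `skewness()` returns the population skewness g1 = m3 / m2^1.5, built from central moments. The sketch below is a minimal pure-Python illustration of those two formulas (it assumes these documented definitions and is not Spark code itself):

```python
def sample_variance(xs):
    # Spark's variance() is an alias for var_samp: sum of squared
    # deviations from the mean, divided by n - 1.
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def skewness(xs):
    # Spark's skewness() is the population skewness g1 = m3 / m2 ** 1.5,
    # where m_k is the k-th central moment (mean of (x - mean) ** k).
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

scores = [1.0, 2.0, 3.0, 4.0, 5.0]
print(sample_variance(scores))  # 2.5
print(skewness(scores))         # 0.0 (symmetric data has zero skew)
```

Options A and B both apply these aggregates correctly (via column functions and via SQL expressions, respectively); `DataFrame.summary()` and `DataFrame.describe()` report statistics such as count, mean, and stddev, but not variance or skewness, which is why options C and D do not work.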