
Explanation:
The correct answer is A. The withColumn method adds a column to a DataFrame and returns a new DataFrame; it takes the name of the new column as a string and a Column expression that defines the column's values. Option A passes the name "employeesPerSqft" and uses the col function to reference the existing columns numberOfEmployees and sqft, dividing one by the other. Option B is incorrect because it divides two string literals rather than two Column objects, which is not a valid operation. Options C and D are incorrect because they use the select method, which returns only the columns listed and so drops the other columns of storesDF; they also pass "employeesPerSqft" as if it were an existing column, which it is not, and C repeats the string division from B. Option E is incorrect because it passes a Column object instead of a string as the first argument to withColumn, which does not match the method signature.
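The point about option B can be checked without Spark. The sketch below assumes the code blocks are written against the Python API (PySpark), which the question does not state. The helper try_divide is hypothetical and exists only for this illustration. It shows that Python evaluates the argument expression before withColumn is ever called, so dividing two plain strings fails on its own.

```python
# Plain Python, no Spark required: arguments are evaluated before the call,
# so the "/" in option B runs on two str objects, not on Column objects.

def try_divide(left, right):
    """Return left / right, or the name of the exception type if it fails."""
    try:
        return left / right
    except TypeError as exc:
        return type(exc).__name__

# Option B's second argument: str does not define the "/" operator.
option_b_result = try_divide("numberOfEmployees", "sqft")
print(option_b_result)  # prints "TypeError"
```

Wrapping each name in col, as option A does, turns the operands into Column objects, and Column defines the division operator.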
Which of the following code blocks creates a new DataFrame with a column named employeesPerSqft calculated as the division of numberOfEmployees by sqft from the original DataFrame storesDF? The employeesPerSqft column does not exist in storesDF.
A
storesDF.withColumn("employeesPerSqft", col("numberOfEmployees") / col("sqft"))
B
storesDF.withColumn("employeesPerSqft", "numberOfEmployees" / "sqft")
C
storesDF.select("employeesPerSqft", "numberOfEmployees" / "sqft")
D
storesDF.select("employeesPerSqft", col("numberOfEmployees") / col("sqft"))
E
storesDF.withColumn(col("employeesPerSqft"), col("numberOfEmployees") / col("sqft"))