Databricks Certified Associate Developer for Apache Spark


Which of the following code blocks returns a new DataFrame where column managerName from DataFrame storesDF has had its null values replaced with the string "No Manager"?

A sample of DataFrame storesDF is below:

storeId  managerName
0        Donec Enim
1        Ultrices Fringilla
2        null
3        Magna Ac
4        null





Explanation:

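In PySpark, missing values are replaced with DataFrame.na.fill() (DataFrame.fillna() is an alias), which takes the replacement value and an optional subset of column names. Option A correctly calls na.fill("No Manager", "managerName"), passing the subset as a column-name string. Because the replacement value is a string, only string columns in the subset are affected. The other options fail: B and E call nafill, which is not a DataFrame method (the accessor is na, followed by fill), and C and D pass col("managerName"), a Column object, where the subset parameter expects column names as strings.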
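
The behavior of the correct option can be sketched in PySpark as follows. This is a minimal illustration, assuming a local Spark installation; the helper names local_spark, build_stores_df, and fill_missing_managers are hypothetical and not part of the exam question, and the rows are copied from the sample of storesDF shown earlier.

```python
# Rows copied from the sample of storesDF; None stands for a null managerName.
SAMPLE_ROWS = [
    (0, "Donec Enim"),
    (1, "Ultrices Fringilla"),
    (2, None),
    (3, "Magna Ac"),
    (4, None),
]


def local_spark():
    """Create a local SparkSession (hypothetical helper; requires PySpark and a JVM)."""
    from pyspark.sql import SparkSession  # imported lazily so the module loads without Spark

    return (
        SparkSession.builder.master("local[1]")
        .appName("na-fill-sketch")
        .getOrCreate()
    )


def build_stores_df(spark):
    """Build a small stand-in for storesDF with an explicit schema."""
    # An explicit schema is needed because some managerName values are None.
    return spark.createDataFrame(
        SAMPLE_ROWS, schema="storeId INT, managerName STRING"
    )


def fill_missing_managers(stores_df):
    """Return a new DataFrame with null managerName values replaced."""
    # The subset is given as a column-name string, as in Option A.
    # stores_df.fillna("No Manager", subset=["managerName"]) is equivalent.
    return stores_df.na.fill("No Manager", "managerName")
```

Calling fill_missing_managers(build_stores_df(local_spark())).show() displays the filled column. The input DataFrame is left unchanged, since na.fill() returns a new DataFrame rather than modifying storesDF in place.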