
Databricks Certified Associate Developer for Apache Spark
Which of the following code blocks returns a new DataFrame where column managerName from DataFrame storesDF has had its null values replaced with the string "No Manager"?

A sample of DataFrame storesDF is below:
storeId  managerName
0        Donec Enim
1        Ultrices Fringilla
2        null
3        Magna Ac
4        null
Explanation:
The correct way to replace missing values in Spark is DataFrame.na.fill(), which takes the replacement value and an optional subset of column names. Option A is correct: na.fill("No Manager", "managerName") passes the subset as a column-name string, so only the nulls in managerName are replaced. The other options fail. B and E call nafill, which is not a DataFrame method (the dot in na.fill is missing). C and D pass col("managerName"), a Column object, where the subset parameter expects a column name given as a string.