
Databricks Certified Associate Developer for Apache Spark
Which of the following code blocks correctly returns a new DataFrame with a modified storeDescription column where the prefix "Description: " has been removed from each value in the storeDescription column of DataFrame storesDF?

A sample of DataFrame storesDF is shown below:
storeId  storeDescription
0        Description: ultr...
1        Description: sagi...
2        Description: port...
3        Description: tris...
4        Description: ulla...
Explanation:
The task is to remove the "Description: " prefix from every value in the storeDescription column. Option A is incorrect because it omits the replacement argument, and option B is incorrect because it uses an invalid method call. Option C uses regexp_extract, which returns the text matched by the pattern rather than replacing it. Options D and E both call regexp_replace with three arguments: the column, the pattern, and the replacement. In PySpark, the first argument can be either a column name (string) or a Column object, so both forms are valid. The correct answers are therefore D and E.
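
The answer options themselves are not reproduced here, so the following is only a minimal sketch of what a correct regexp_replace call might look like, not the exact text of option D or E. The names PREFIX_PATTERN, remove_description_prefix and check_pattern_locally are hypothetical, and the leading "^" anchor is an added assumption that limits the match to a true prefix.

```python
import re

# Hypothetical constant: "^" anchors the match to the start of the value,
# so only a leading "Description: " is removed.
PREFIX_PATTERN = r"^Description: "


def remove_description_prefix(stores_df):
    """Return a new DataFrame with the prefix stripped from storeDescription."""
    # Imported lazily so the pattern can be checked without a Spark installation.
    from pyspark.sql.functions import col, regexp_replace

    # Three arguments: the column, the pattern, and the replacement.
    # In PySpark, passing the name "storeDescription" instead of
    # col("storeDescription") as the first argument is also accepted.
    return stores_df.withColumn(
        "storeDescription",
        regexp_replace(col("storeDescription"), PREFIX_PATTERN, ""),
    )


def check_pattern_locally(value):
    """Apply the same pattern with Python's re module as a quick sanity check."""
    return re.sub(PREFIX_PATTERN, "", value)
```

Calling remove_description_prefix(storesDF) leaves storesDF unchanged and returns a new DataFrame, since withColumn does not modify its input. Spark evaluates the pattern with Java regular expressions; for a simple literal prefix like this one, Python's re module behaves the same way, which is what check_pattern_locally relies on.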