Databricks Certified Associate Developer for Apache Spark


Which of the following code blocks correctly returns a new DataFrame with a modified storeDescription column where the prefix "Description: " has been removed from each value in the storeDescription column of DataFrame storesDF?

A sample of DataFrame storesDF is shown below:

storeId  storeDescription
0        Description: ultr...
1        Description: sagi...
2        Description: port...
3        Description: tris...
4        Description: ulla...





Explanation:

The question asks to remove the "Description: " prefix from the storeDescription column, which calls for a regular-expression replacement. Option A is incorrect because it omits the replacement argument, and Option B is incorrect because it uses an invalid method call. Option C uses regexp_extract, which returns the text matching the pattern rather than replacing it. Options D and E both call regexp_replace with three arguments: the column, the pattern, and the replacement. In PySpark, the first argument can be either a column name (string) or a Column object, so both D and E are valid. The correct answers are therefore D and E.
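As a rough illustration of the regexp_replace approach described in the explanation, the sketch below shows one way the prefix could be stripped. It is not a reproduction of options D or E, whose exact text is not shown here. The names PREFIX_PATTERN, strip_prefix_local, and strip_prefix are hypothetical helpers introduced only for this example, and the `^` anchor is an added assumption so that only a leading prefix is removed. The pure-Python function mirrors the regex behavior on a single string so the pattern can be checked without a Spark session.

```python
import re

# Assumption: anchor the pattern with ^ so only a leading prefix is removed.
PREFIX_PATTERN = r"^Description: "


def strip_prefix_local(value: str) -> str:
    """Pure-Python analogue: apply the same pattern to one string."""
    return re.sub(PREFIX_PATTERN, "", value)


def strip_prefix(storesDF):
    """Return a new DataFrame with the prefix removed from storeDescription.

    Hypothetical helper; requires pyspark, imported lazily so the
    regex logic above can be exercised without Spark installed.
    """
    from pyspark.sql.functions import col, regexp_replace

    # withColumn with an existing column name replaces that column
    # in the returned DataFrame; storesDF itself is left unchanged.
    return storesDF.withColumn(
        "storeDescription",
        regexp_replace(col("storeDescription"), PREFIX_PATTERN, ""),
    )


if __name__ == "__main__":
    # Made-up value for illustration; the sample rows above are truncated.
    print(strip_prefix_local("Description: example text"))
```

With a live Spark session, strip_prefix(storesDF) would be expected to return a DataFrame with the same storeId values and the shortened descriptions. Passing the column name as a plain string, regexp_replace("storeDescription", PREFIX_PATTERN, ""), should behave the same way in PySpark, which is the point the explanation makes about D and E.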