Databricks Certified Data Engineer - Professional

Which statement accurately describes the proper usage of `pyspark.sql.functions.broadcast`?

A

It marks a column as having low enough cardinality to properly map distinct values to available partitions, allowing a broadcast join.

8.6%

B

It marks a column as small enough to store in memory on all executors, allowing a broadcast join.

15.1%

C

It caches a copy of the indicated table on attached storage volumes for all active clusters within a Databricks workspace.

6.6%

D

It marks a DataFrame as small enough to store in memory on all executors, allowing a broadcast join.

59.2%

E

It caches a copy of the indicated table on all nodes in the cluster for use in all future queries during the cluster lifetime.

10.5%
