Which statement accurately describes the proper usage of pyspark.sql.functions.broadcast?
A
It marks a column as having low enough cardinality to properly map distinct values to available partitions, allowing a broadcast join.
B
It marks a column as small enough to store in memory on all executors, allowing a broadcast join.
C
It caches a copy of the indicated table on attached storage volumes for all active clusters within a Databricks workspace.
D
It marks a DataFrame as small enough to store in memory on all executors, allowing a broadcast join.
E
It caches a copy of the indicated table on all nodes in the cluster for use in all future queries during the cluster lifetime.