To optimize a join operation in Databricks by ensuring the smaller DataFrame is sent to all executor nodes in the cluster, which function should a data engineer use to mark the DataFrame as small enough to fit in memory on all executors?
A. pyspark.sql.functions.explode
B. pyspark.sql.functions.distribute
C. pyspark.sql.functions.broadcast
D. pyspark.sql.functions.diffuse
E. pyspark.sql.functions.shuffle
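
The function that marks a DataFrame as small enough to be copied to every executor is `pyspark.sql.functions.broadcast` (option C). Below is a minimal sketch of how it is typically applied in a join; the table names, join key, and join type are hypothetical and used only for illustration.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("broadcast-join-example").getOrCreate()

# Hypothetical tables: a large fact table and a small lookup/dimension table.
orders = spark.read.table("orders")    # large DataFrame (assumed)
regions = spark.read.table("regions")  # small DataFrame that fits in executor memory (assumed)

# broadcast() hints that `regions` should be sent in full to every executor,
# so the join can run without shuffling the large `orders` DataFrame.
joined = orders.join(broadcast(regions), on="region_id", how="inner")

joined.show()
```

The broadcast hint avoids a shuffle of the larger DataFrame, which is why it is the standard way to optimize joins against small lookup tables; the other listed functions either do not exist in `pyspark.sql.functions` (`distribute`, `diffuse`) or do something unrelated (`explode` flattens arrays, `shuffle` randomly reorders array elements).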