In the context of Pandas UDFs, explain the concept of broadcasting and its benefits when working with small control data in Spark. Provide an example of how you would use broadcasting in a Pandas UDF.
A
Broadcasting is a technique where a small dataset is replicated across all nodes in a cluster to enable efficient distributed processing.
B
Broadcasting is a technique where a small dataset is partitioned across multiple nodes in a cluster to enable parallel processing.
C
Broadcasting is a technique where a small dataset is loaded into memory on a single node and accessed by all other nodes in the cluster.
D
Broadcasting is not applicable when working with Pandas UDFs in Spark.
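The question also asks for an example. The following is a minimal sketch, not a definitive implementation, and it assumes PySpark 3.x with pandas and PyArrow installed. A small lookup table is broadcast once with `SparkContext.broadcast`, so each executor keeps a read-only copy instead of receiving the data with every task, and a Series-to-Series Pandas UDF reads it through `.value`. The names `apply_rates`, `make_rate_udf`, `demo`, the column names, the rate values, and the default of 1.0 for unknown codes are all hypothetical and chosen only for illustration.

```python
import pandas as pd


def apply_rates(codes: pd.Series, rates: dict) -> pd.Series:
    """Look up a rate for each code; unknown codes get 1.0 (an assumed default)."""
    return codes.map(rates).fillna(1.0).astype("float64")


def make_rate_udf(spark, rates: dict):
    """Broadcast `rates` and return a Pandas UDF that reads the broadcast copy."""
    # Imported here so the pure-pandas helper above works without Spark installed.
    from pyspark.sql.functions import pandas_udf

    # Replicate the small control data to the executors as a read-only variable.
    bc_rates = spark.sparkContext.broadcast(rates)

    @pandas_udf("double")
    def rate_for(codes: pd.Series) -> pd.Series:
        # .value returns the executor-local copy of the broadcast dictionary.
        return apply_rates(codes, bc_rates.value)

    return rate_for


def demo():
    """Hypothetical end-to-end run on a local Spark session."""
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[2]")
        .appName("broadcast-pandas-udf-demo")
        .getOrCreate()
    )
    df = spark.createDataFrame(
        [("US", 10.0), ("DE", 20.0), ("XX", 5.0)], ["country", "amount"]
    )
    rate_for = make_rate_udf(spark, {"US": 1.0, "DE": 1.1})
    df.withColumn("converted", df["amount"] * rate_for(df["country"])).show()
    spark.stop()
```

Calling `demo()` runs the example on a local session. Keeping the lookup logic in a plain pandas function makes it testable without a cluster, while the UDF wrapper only adds the broadcast read. This pattern suits control data that fits comfortably in executor memory; larger tables are usually better handled with a join.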