A Spark application has a 128 GB DataFrame A and a 1 GB DataFrame B. If a broadcast join were performed on these two DataFrames, which of the following best describes which DataFrame should be broadcast, and why?

Exam-Like

Either DataFrame can be broadcast. The two choices will be identical in result and efficiency.

0.0%

DataFrame B should be broadcast because it is smaller and will eliminate the need for the shuffling of itself.

47.5%

DataFrame A should be broadcast because it is larger and will eliminate the need for the shuffling of DataFrame B.

5.7%

DataFrame B should be broadcast because it is smaller and will eliminate the need for the shuffling of DataFrame A.

45.9%

DataFrame A should be broadcast because it is smaller and will eliminate the need for the shuffling of itself.

0.8%
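
For readers who want to reason about the mechanics behind the options, here is a toy model of a broadcast hash join in plain Python. It is only a sketch and not Spark's implementation: the function name broadcast_hash_join, the partition lists, and the sample rows are all hypothetical stand-ins. It assumes an inner equi-join on a single key. The point it illustrates is that a full copy of the small side is made available to every partition of the large side, so the large side's rows are joined where they already sit.

```python
# Toy model of a broadcast hash join in plain Python (not Spark itself).
# All names and data here are hypothetical, chosen only for illustration.

def broadcast_hash_join(large_partitions, small_rows):
    """Inner-join each partition of the large side against a full copy of the small side."""
    # Build one lookup table from the small side. In a cluster, this is
    # the piece that would be copied to every executor.
    lookup = {}
    for key, value in small_rows:
        lookup.setdefault(key, []).append(value)

    joined_partitions = []
    for partition in large_partitions:
        # Rows of the large side are joined in place; none of them
        # are moved to a different partition.
        joined = [
            (key, left_value, right_value)
            for key, left_value in partition
            for right_value in lookup.get(key, [])
        ]
        joined_partitions.append(joined)
    return joined_partitions


# Stand-in for the large DataFrame A, already split into two partitions.
a_partitions = [
    [(1, "a1"), (2, "a2")],
    [(2, "a3"), (3, "a4")],
]
# Stand-in for the small DataFrame B.
b_rows = [(1, "b1"), (2, "b2")]

result = broadcast_hash_join(a_partitions, b_rows)
```

In PySpark itself, a broadcast can be requested explicitly by wrapping one side with pyspark.sql.functions.broadcast, for example large_df.join(broadcast(small_df), "key"), where large_df, small_df, and the "key" column are assumed names.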

Databricks Certified Associate Developer for Apache Spark