
Explanation:
The default value of spark.sql.autoBroadcastJoinThreshold is 10 MB (10485760 bytes). Spark automatically broadcasts the smaller table in a join when its estimated size is less than or equal to this threshold. Tables larger than the threshold are not broadcast automatically.
In this case:
Table 1 (15.2 MB) → larger than 10 MB → not broadcast
Table 2 (2.6 MB) → smaller than 10 MB → broadcast
Table 3 (5.1 MB) → smaller than 10 MB → broadcast
Table 4 (11 MB) → larger than 10 MB → not broadcast
Therefore, the tables that will not be broadcast automatically are Table 1 and Table 4.
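The comparison above can be reproduced with a few lines of plain Python. This is only a sketch of the arithmetic, not Spark's own planner logic: the names `DEFAULT_THRESHOLD_BYTES`, `table_sizes_mb`, and `is_auto_broadcast` are hypothetical, and it assumes 1 MB = 1024 × 1024 bytes and that the listed sizes match the size estimates Spark would use.

```python
# Default of spark.sql.autoBroadcastJoinThreshold: 10 MB, expressed in bytes.
DEFAULT_THRESHOLD_BYTES = 10 * 1024 * 1024

# Table sizes in MB, taken from the explanation above.
table_sizes_mb = {
    "Table 1": 15.2,
    "Table 2": 2.6,
    "Table 3": 5.1,
    "Table 4": 11.0,
}


def is_auto_broadcast(size_mb, threshold_bytes=DEFAULT_THRESHOLD_BYTES):
    """Return True if a table of this size is at or under the threshold."""
    return size_mb * 1024 * 1024 <= threshold_bytes


# Collect the tables that exceed the threshold and so are not broadcast.
not_broadcast = [
    name for name, size in table_sizes_mb.items() if not is_auto_broadcast(size)
]
print(not_broadcast)  # ['Table 1', 'Table 4']
```

Passing a different `threshold_bytes` shows how the answer would change if the setting were raised or lowered from its default.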
Correct Answer: D.
More info: spark.sql.autoBroadcastJoinThreshold.
Given the default setting of spark.sql.autoBroadcastJoinThreshold, which of the following tables will not be broadcast automatically when joined with a table that is 2 GB in size?
A. Tables 1, 3 and 4
B. Tables 2 and 3
C. Tables 1, 2 and 3
D. Tables 1 and 4
E. Tables 2, 3 and 4