
Explanation:
The correct answer is B) Configure parallelism based on the number of GPU-enabled instances, so that multiple Spark tasks do not conflict over the same GPU. Key considerations include:
Executor Thread Configuration:
Parallelism Adjustment:
How to configure parallelism:
Set spark.executor.instances, spark.executor.cores, and spark.default.parallelism according to the number of GPUs.
Example Configuration:
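from pyspark.sql import SparkSession

# One executor per GPU-enabled instance (4 assumed here), one task slot each
spark = (
    SparkSession.builder.appName("my_app")
    .config("spark.executor.instances", "4")
    .config("spark.executor.cores", "1")
    .config("spark.default.parallelism", "4")
    .getOrCreate()
)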
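The settings above can also be derived from the instance count rather than hard-coded. The following is a minimal sketch, not a definitive implementation: the helper name gpu_spark_conf is hypothetical, and it assumes one GPU per instance and one concurrent task per GPU.

```python
def gpu_spark_conf(num_gpu_instances: int) -> dict:
    """Return Spark settings sized to one task per GPU-enabled instance.

    Hypothetical helper for illustration; assumes one GPU per instance.
    """
    if num_gpu_instances < 1:
        raise ValueError("need at least one GPU-enabled instance")
    return {
        "spark.executor.instances": str(num_gpu_instances),
        "spark.executor.cores": "1",  # one task slot per executor
        "spark.default.parallelism": str(num_gpu_instances),
    }


conf = gpu_spark_conf(4)
# Cap concurrent trials at the same value so no two trials share a GPU
trials_parallelism = int(conf["spark.default.parallelism"])
print(conf)
print(trials_parallelism)
```

With Hyperopt installed, the resulting value could then be passed as SparkTrials(parallelism=trials_parallelism), so the number of concurrently running trials does not exceed the number of GPU-enabled instances.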
Additional Tips:
When utilizing SparkTrials on GPU-enabled clusters, what is the recommended approach for configuring parallelism to ensure optimal performance and avoid conflicts?
A) GPU clusters automatically adjust parallelism based on SparkTrials configuration, requiring no manual intervention.
B) For GPU clusters, it's best to configure parallelism to match the number of GPU-enabled instances, preventing task conflicts.
C) GPU clusters utilize maximum parallelism by default, eliminating the need for specific configuration adjustments.
D) Optimal performance on GPU clusters is achieved by using multiple executor threads per node for enhanced parallelism.