What is the most efficient method to handle medium-sized datasets (~100MB) in Hyperopt with SparkTrials, and why?
A. Use Databricks Runtime 7.0 ML or above for optimized handling of medium-sized datasets.
B. Broadcast the dataset explicitly using Spark and load it back onto workers via the broadcast variable in the objective function.
C. Save the dataset to DBFS and load it back onto workers using the DBFS local file interface.
D. Load the dataset on the driver and reference it directly from the objective function.