Why are iterator UDFs preferred for handling large datasets in Spark? Provide a detailed explanation and an example of how you would implement an iterator UDF in Spark.
A. Iterator UDFs are preferred for large datasets because they allow for lazy evaluation of data, reducing memory usage.
B. Iterator UDFs are preferred for large datasets because they enable parallel processing of data, improving performance.
C. Iterator UDFs are preferred for large datasets because they provide better error handling and debugging capabilities.
D. Iterator UDFs are not preferred for large datasets in Spark.