Databricks Certified Machine Learning - Associate


In a big data environment, you are tasked with implementing a UDF that processes a very large dataset. Which type of Pandas UDF would you prefer and why?




Explanation:

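Iterator Pandas UDFs (type-hinted as `Iterator[pd.Series] -> Iterator[pd.Series]`) are preferred for large datasets because the function receives its input as a stream of batches and yields one result per batch, so only a single batch needs to be held in memory at a time. Expensive setup, such as loading a model, can also run once before the loop instead of once per batch, which improves performance. Both properties matter in big data scenarios where the entire dataset does not fit into memory. The batch size is governed by the Spark setting `spark.sql.execution.arrow.maxRecordsPerBatch`.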
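
The batch-at-a-time pattern can be sketched in plain pandas, as a minimal illustration rather than a definitive implementation. The names `load_scale_factor` and `scale_batches` are hypothetical, and the constant factor stands in for whatever expensive setup a real UDF would perform. The function body is exercised locally on a list of Series; the Spark registration is shown only in comments and assumes `pyspark` is installed and a DataFrame `df` with a double column `x` exists.

```python
from typing import Iterator

import pandas as pd


def load_scale_factor() -> float:
    """Hypothetical stand-in for expensive setup, e.g. loading a model."""
    return 2.0


def scale_batches(batches: Iterator[pd.Series]) -> Iterator[pd.Series]:
    # Setup runs once per call, before any batch is consumed.
    factor = load_scale_factor()
    # Each batch is processed and yielded before the next one is read,
    # so only one batch is held in memory at a time.
    for batch in batches:
        yield batch * factor


# Local check without a cluster: feed the function an iterator of small Series.
chunks = [pd.Series([1.0, 2.0]), pd.Series([3.0])]
result = pd.concat(scale_batches(iter(chunks)), ignore_index=True)
print(result.tolist())  # [2.0, 4.0, 6.0]

# On Spark (assumption: pyspark available, `df` has a double column "x"):
# from pyspark.sql.functions import pandas_udf
# scale_udf = pandas_udf(scale_batches, returnType="double")
# df.select(scale_udf("x")).show()
```

Keeping the logic in an ordinary generator function makes it testable on a laptop, while Spark infers the iterator UDF type from the same type hints when the function is wrapped with `pandas_udf`.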