Databricks Certified Data Engineer - Professional


You are tasked with optimizing the storage of a PySpark DataFrame on disk. Discuss how you would control the size of individual part-files when writing the DataFrame to disk. Explain the importance of this control and how it affects query performance.
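One way to approach the sizing part of this question is to work backwards from a target part-file size. The sketch below is a minimal illustration, not a definitive method: the 128 MiB target, the helper names `target_partition_count` and `max_records_per_file`, and the inputs `estimated_bytes`, `avg_row_bytes`, and `output_path` are all assumptions made for the example. The on-disk size of a Parquet file also depends on compression and encoding, so an estimate taken from in-memory or source size is only a starting point.

```python
import math

# Assumed target size per part-file; a tunable value, not a fixed rule.
DEFAULT_TARGET_BYTES = 128 * 1024 * 1024


def target_partition_count(total_bytes: int,
                           target_file_bytes: int = DEFAULT_TARGET_BYTES) -> int:
    """Estimate how many partitions give part-files of roughly target_file_bytes.

    Spark writes at least one file per non-empty partition, so the partition
    count at write time bounds the number of part-files from below.
    """
    if total_bytes < 0 or target_file_bytes <= 0:
        raise ValueError("total_bytes must be >= 0 and target_file_bytes > 0")
    return max(1, math.ceil(total_bytes / target_file_bytes))


def max_records_per_file(avg_row_bytes: int,
                         target_file_bytes: int = DEFAULT_TARGET_BYTES) -> int:
    """Estimate a row cap per file from an assumed average row size."""
    if avg_row_bytes <= 0 or target_file_bytes <= 0:
        raise ValueError("avg_row_bytes and target_file_bytes must be > 0")
    return max(1, target_file_bytes // avg_row_bytes)


# Hypothetical usage with PySpark (not executed here; df, estimated_bytes,
# avg_row_bytes and output_path are placeholders):
#
#   n = target_partition_count(estimated_bytes)
#   (df.repartition(n)                      # full shuffle, evenly sized partitions
#      .write
#      .option("maxRecordsPerFile", max_records_per_file(avg_row_bytes))
#      .mode("overwrite")
#      .parquet(output_path))
```

`repartition(n)` shuffles the data into `n` roughly even partitions, while `coalesce(n)` only merges existing partitions and avoids a full shuffle, at the cost of possibly uneven files. The `maxRecordsPerFile` writer option caps rows per file, so an oversized partition is split across several files. The control matters for query performance because many small files add listing, file-open, and task-scheduling overhead, while a few very large files limit read parallelism.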