Discuss the impact of improper data partitioning on Spark query performance. Provide examples of how 'smalls' (tiny files, scanning overhead, over-partitioning) can cause performance problems, and suggest strategies to mitigate these issues. Include a code snippet demonstrating how to optimize partitioning in a Spark DataFrame.
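
As a starting point for the requested snippet, here is a minimal sketch of one common mitigation for tiny files: choose the number of partitions from the data size, so that each output file lands near a target size, then repartition before writing. It assumes PySpark and Parquet output. The 128 MiB target, the helper names `target_partition_count` and `write_compacted`, and the idea that the caller already knows the approximate dataset size in bytes are all illustrative assumptions, not part of the question.

```python
import math

# Assumed target of roughly 128 MiB per output file; tune for your storage layer.
TARGET_FILE_BYTES = 128 * 1024 * 1024


def target_partition_count(total_bytes, target_file_bytes=TARGET_FILE_BYTES):
    """Return a partition count that gives files of about target_file_bytes each."""
    if total_bytes <= 0:
        return 1
    return max(1, math.ceil(total_bytes / target_file_bytes))


def write_compacted(df, total_bytes, path):
    """Hypothetical helper: rewrite a Spark DataFrame as fewer, larger files.

    df is expected to be a pyspark.sql.DataFrame; total_bytes is the caller's
    estimate of the dataset size.
    """
    n = target_partition_count(total_bytes)
    # repartition(n) shuffles rows into n roughly even partitions, so the
    # write produces about n files instead of thousands of tiny ones.
    df.repartition(n).write.mode("overwrite").parquet(path)
```

For example, an estimated 1 GiB dataset gives `target_partition_count(1024 ** 3) == 8`, so `write_compacted` would produce about eight files. When only reducing the partition count, `df.coalesce(n)` avoids a full shuffle, at the cost of possibly uneven file sizes. For over-partitioning on disk, the usual advice is to pass only low-cardinality columns to `partitionBy`.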