In a data processing pipeline, you have identified a skew in the data distribution across different partitions. How would you handle this skew to ensure balanced processing and avoid hotspots in your distributed system?
A. Increase the number of partitions to distribute the data more evenly.
B. Implement custom partitioning logic that redistributes the skewed data across existing partitions.
C. Ignore the skew and let the system handle it naturally.
D. Use a salting technique to add a random element to the data keys to distribute the load more evenly.
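
To make the salting option concrete, here is a minimal sketch in plain Python, not tied to any particular framework. It assumes a toy byte-sum partitioner, four partitions, four salt values, and a `#` separator between key and salt; all function names and constants are hypothetical and chosen only for illustration. A hot key is spread over several salted keys, partial sums are computed per salted key, and a second stage strips the salt to recover the per-key totals.

```python
import random
from collections import Counter

NUM_PARTITIONS = 4  # assumed partition count (illustrative)
NUM_SALTS = 4       # assumed number of distinct salt values (illustrative)


def partition_of(key, num_partitions=NUM_PARTITIONS):
    # Toy deterministic partitioner: byte sum modulo the partition count.
    # A real system would use its own hash partitioner.
    return sum(key.encode("utf-8")) % num_partitions


def add_salt(key, rng, num_salts=NUM_SALTS):
    # Append a random salt so one logical key maps to several physical keys.
    return f"{key}#{rng.randrange(num_salts)}"


def strip_salt(salted_key):
    # Recover the original key by dropping the trailing "#<salt>" suffix.
    return salted_key.rsplit("#", 1)[0]


def partition_loads(keys):
    # Count how many records land on each partition.
    return Counter(partition_of(k) for k in keys)


def two_stage_sum(salted_records):
    # Stage 1: partial sums per salted key (this work is spread out).
    partial = Counter()
    for salted_key, value in salted_records:
        partial[salted_key] += value
    # Stage 2: strip the salt and combine the few partial sums per key.
    totals = Counter()
    for salted_key, subtotal in partial.items():
        totals[strip_salt(salted_key)] += subtotal
    return totals


# Skewed input: one hot key dominates the data set.
records = [("hot", 1)] * 1000 + [("a", 1)] * 10 + [("b", 1)] * 10
rng = random.Random(0)  # seeded so the run is repeatable
salted_records = [(add_salt(k, rng), v) for k, v in records]

before = partition_loads(k for k, _ in records)
after = partition_loads(k for k, _ in salted_records)
totals = two_stage_sum(salted_records)

print("max partition load before salting:", max(before.values()))
print("max partition load after salting:", max(after.values()))
print("per-key totals:", dict(totals))
```

The trade-off is the extra aggregation stage: results are only correct once the salt has been stripped and the partial results combined. For a join, the other side of the join would also have to be replicated once per salt value so that every salted key still finds its match.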