
You are working with a dataflow that is processing large volumes of data from multiple sources. The dataflow is experiencing performance issues due to data skew. Describe how you would identify data skew in the dataflow and what steps you would take to resolve it. Consider both data-level and configuration-level optimizations.
A. Use dynamic partitioning to distribute data processing tasks.
B. Increase the number of partitions to improve parallelism.
C. Use a random partitioning key to balance data across partitions.
D. Use a partitioning key based on commonly queried columns.
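
To make the skew scenario concrete, the following is a minimal sketch rather than any particular engine's API. It assumes a hypothetical dataset in which one key (`customer_42`) accounts for 90% of the records, an assumed count of 8 partitions, and CRC32 as a stand-in for the engine's hash partitioner. It identifies skew by comparing the largest partition with the mean partition size, then shows how the same records spread out under a random partitioning key.

```python
import random
import zlib
from collections import Counter

NUM_PARTITIONS = 8  # assumed partition count, chosen only for illustration


def hash_partition(key, num_partitions=NUM_PARTITIONS):
    """Map a string key to a partition with a stable hash (CRC32 as a stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_sizes(assignments, num_partitions=NUM_PARTITIONS):
    """Count how many records were assigned to each partition index."""
    counts = Counter(assignments)
    return [counts.get(p, 0) for p in range(num_partitions)]


def skew_ratio(sizes):
    """Largest partition divided by the mean size; 1.0 means perfectly even."""
    return max(sizes) / (sum(sizes) / len(sizes))


rng = random.Random(0)  # seeded so the run is repeatable

# Hypothetical skewed input: one hot key plus a long tail of other keys.
keys = ["customer_42"] * 9000 + [
    f"customer_{rng.randrange(100)}" for _ in range(1000)
]

# Partitioning on the skewed key sends every hot-key record to one partition.
by_key = partition_sizes(hash_partition(k) for k in keys)

# Partitioning on a random key ignores the record's value entirely.
by_random = partition_sizes(rng.randrange(NUM_PARTITIONS) for _ in keys)

print("hash on skewed key:", by_key, "ratio:", round(skew_ratio(by_key), 2))
print("random key:        ", by_random, "ratio:", round(skew_ratio(by_random), 2))
```

In this sketch, raising `NUM_PARTITIONS` alone would not split the hot key, because every record with the same key still hashes to the same partition. The trade-off of a random key is that records sharing a key no longer sit together, so per-key aggregations or joins typically need a second combining stage.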