
Explanation:
Option B is the correct approach for scaling a data pipeline that processes streaming data in real-time. Adding more nodes to the cluster helps distribute the processing load, allowing the pipeline to handle increased data volumes without performance degradation. While options A, C, and D can also contribute to performance improvements, they do not address the need for scaling out the processing capacity as effectively as adding more nodes.
Ultimate access to all questions.
You are working on a data pipeline that processes streaming data in real-time. The pipeline has started to experience performance issues due to the increasing volume of data. What strategies would you employ to scale the pipeline and maintain its performance?
A
Increase the processing power of the existing nodes in the cluster.
B
Add more nodes to the cluster to distribute the processing load.
C
Implement data sampling to reduce the volume of data processed by the pipeline.
D
Optimize the data processing algorithms to improve their efficiency.
No comments yet.