Google Professional Data Engineer


You are managing a Google Cloud Dataflow streaming pipeline that currently uses a Google Cloud Pub/Sub subscription as its data source. You need to deploy an updated version of this Dataflow pipeline, but you know that this new version will not be compatible with the existing pipeline version. Despite this incompatibility, it is crucial to ensure that no data is lost during the transition to the updated pipeline. How should you proceed to achieve this?




Explanation:

The correct answer is A. The key requirement is that no data is lost, and stopping the old job with the Drain option satisfies it. When a job is drained, Dataflow stops reading new messages from the Pub/Sub subscription but finishes processing the data it has already pulled, so in-flight elements are written out before the job stops. Messages that were never read remain unacknowledged in the subscription, where the new pipeline can pick them up once it starts. Options C and D cancel the old pipeline, which stops it immediately and can result in data loss. Option B relies on a transform mapping for an in-place update, which does not by itself make an incompatible version acceptable as a replacement for the running job.
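As a rough illustration of the drain step, the sketch below builds the `gcloud dataflow jobs drain` command for a running job. The helper names `drain_command` and `drain_job`, and the job ID and region values, are hypothetical and used only for illustration; the function defaults to a dry run, so nothing is executed unless `dry_run=False` is passed.

```python
import subprocess
from typing import List


def drain_command(job_id: str, region: str) -> List[str]:
    """Build the gcloud invocation that drains a running Dataflow job.

    job_id and region are placeholders supplied by the caller.
    """
    return ["gcloud", "dataflow", "jobs", "drain", job_id, f"--region={region}"]


def drain_job(job_id: str, region: str, dry_run: bool = True) -> List[str]:
    """Return the drain command, running it only when dry_run is False."""
    cmd = drain_command(job_id, region)
    if not dry_run:
        # Requires an installed and authenticated gcloud CLI.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Hypothetical job ID and region, shown as a dry run.
    print(" ".join(drain_job("my-old-job-id", "us-central1")))
```

Once the old job has finished draining, the updated pipeline would be launched as a new job reading from the same subscription, so messages left unacknowledged during the switch are delivered to it.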