
Answer-first summary for fast verification
Answer: Add a try... catch block to your DoFn that transforms the data, and use a sideOutput to create a PCollection that can be stored to Pub/Sub later.
The correct answer is D. Adding a try... catch block to the DoFn that transforms the data, and using a sideOutput to collect erroneous records into a separate PCollection, allows the failing data to be reprocessed without creating a bottleneck in the pipeline. Keeping the erroneous records in their own PCollection also makes them easy to debug, analyze, and replay as needed. In contrast, writing to Pub/Sub directly from inside the DoFn (option C) adds blocking I/O to every failing element, which can throttle the pipeline. Option A, simply filtering out erroneous data, discards the failing records entirely and therefore risks data loss. Option B, extracting erroneous rows from logs, handles the failures outside the pipeline rather than within it, making reliable reprocessing much harder.
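The routing logic behind option D can be sketched in plain Python. In a real Apache Beam pipeline the same pattern lives inside a DoFn with tagged outputs (for example `pvalue.TaggedOutput` in the Python SDK, or `TupleTag` with `withOutputTags` in Java); here the Beam machinery is replaced by two ordinary lists so the control flow is easy to follow. The record schema and `transform` function are illustrative assumptions, not part of the question.

```python
def transform(record):
    """Stand-in for the transform step; raises on malformed input."""
    return {"user": record["user"], "amount": float(record["amount"])}

def process(records):
    """Route each record to the main output or the error side output."""
    main_output, error_output = [], []
    for record in records:
        try:
            main_output.append(transform(record))
        except (KeyError, ValueError, TypeError) as exc:
            # Instead of failing the whole bundle, capture the bad record
            # together with the error so it can be published to Pub/Sub
            # later and reprocessed once the issue is fixed.
            error_output.append({"record": record, "error": str(exc)})
    return main_output, error_output

good, bad = process([
    {"user": "a", "amount": "10.5"},
    {"user": "b", "amount": "not-a-number"},  # triggers ValueError
])
print(good)  # one successfully transformed record
print(bad)   # one captured failure, ready for a dead-letter topic
```

In Beam itself, `bad` would be the side-output PCollection, and a downstream `WriteToPubSub` transform (rather than I/O inside the DoFn) would publish it, which is exactly why D avoids the bottleneck that C introduces.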
Author: LeetQuiz Editorial Team
Your team has been tasked with developing and maintaining Extract, Transform, and Load (ETL) processes within your company’s data infrastructure. Currently, you have a Dataflow job that is experiencing failures due to errors in the input data. To improve the reliability and resilience of this ETL pipeline, including the ability to reprocess all the failing data, what steps should you take?
A
Add a filtering step to skip these types of errors in the future, extract erroneous rows from logs.
B
Add a try... catch block to your DoFn that transforms the data, extract erroneous rows from logs.
C
Add a try... catch block to your DoFn that transforms the data, write erroneous rows to Pub/Sub directly from the DoFn.
D
Add a try... catch block to your DoFn that transforms the data, use a sideOutput to create a PCollection that can be stored to Pub/Sub later.