
Answer-first summary for fast verification
Answer: C — Add a try... catch block to your DoFn that transforms the data, write erroneous rows to PubSub directly from the DoFn.
## Explanation

Option C is the correct answer because:

- **Try-catch block in DoFn**: handles errors during data transformation gracefully without stopping the entire pipeline.
- **Write erroneous rows to PubSub directly**: ensures that all failing data is captured in a durable message queue for later reprocessing.
- **Reprocessing capability**: PubSub allows you to replay messages, enabling reprocessing of failed data.
- **Reliability**: the pipeline continues processing valid data while capturing errors separately.

**Why the other options are less optimal:**

- **Option A**: filtering skips errors but doesn't capture them for reprocessing.
- **Option B**: extracting erroneous rows from logs is unreliable and difficult to reprocess systematically.
- **Option D**: side outputs are a sound error-handling technique, but storing to PubSub "later" adds complexity and a window for potential data loss.

This approach keeps the pipeline reliable while retaining the ability to reprocess all failing data.
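The try-catch-and-publish pattern described above can be sketched in plain Python. This is a minimal, framework-free sketch: `ParseRowFn.process` mimics a Beam DoFn's element-wise method, and `DeadLetterPublisher` is a hypothetical stand-in for a Pub/Sub publisher client, not the real `google-cloud-pubsub` API.

```python
import json


class DeadLetterPublisher:
    """Hypothetical stand-in for a Pub/Sub publisher client."""

    def __init__(self):
        self.published = []

    def publish(self, payload: bytes):
        # A real client would publish to a topic; here we just record it.
        self.published.append(payload)


class ParseRowFn:
    """Mimics a Beam DoFn: transform each element, capture failures."""

    def __init__(self, publisher):
        self.publisher = publisher

    def process(self, element: str):
        try:
            row = json.loads(element)  # the transformation that may fail
            yield {"id": row["id"], "value": float(row["value"])}
        except (ValueError, KeyError) as exc:
            # Write the erroneous row to Pub/Sub directly from the DoFn,
            # so it can be replayed and reprocessed later.
            self.publisher.publish(
                json.dumps({"raw": element, "error": str(exc)}).encode("utf-8")
            )


publisher = DeadLetterPublisher()
fn = ParseRowFn(publisher)
inputs = ['{"id": 1, "value": "2.5"}', "not json", '{"id": 2}']
good = [out for e in inputs for out in fn.process(e)]
print(good)                      # one valid row parsed
print(len(publisher.published))  # two erroneous rows captured for replay
```

The key property the answer relies on is that a failing element never raises out of `process`, so the pipeline keeps running while every bad row lands in a durable, replayable queue.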
Author: LeetQuiz.
NO.43 Your team is responsible for developing and maintaining ETLs in your company. One of your Dataflow jobs is failing because of some errors in the input data, and you need to improve the reliability of the pipeline (including the ability to reprocess all failing data). What should you do?
A. Add a filtering step to skip these types of errors in the future, extract erroneous rows from logs.
B. Add a try... catch block to your DoFn that transforms the data, extract erroneous rows from logs.
C. Add a try... catch block to your DoFn that transforms the data, write erroneous rows to PubSub directly from the DoFn.
D. Add a try... catch block to your DoFn that transforms the data, use a sideOutput to create a PCollection that can be stored to PubSub later.