
Your team has been tasked with developing and maintaining Extract, Transform, and Load (ETL) processes within your company’s data infrastructure. Currently, you have a Dataflow job that is experiencing failures due to errors in the input data. To improve the reliability and resilience of this ETL pipeline, including the ability to reprocess all the failing data, what steps should you take?
A
Add a filtering step to skip these types of errors in the future, and extract the erroneous rows from logs.
B
Add a try...catch block to your DoFn that transforms the data, and extract the erroneous rows from logs.
C
Add a try...catch block to your DoFn that transforms the data, and write the erroneous rows to Pub/Sub directly from the DoFn.
D
Add a try...catch block to your DoFn that transforms the data, and use a sideOutput to create a PCollection of the erroneous rows that can be written to Pub/Sub later.
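
For readers unfamiliar with the pattern described in option D, the idea of catching a per-element failure and routing the failing input to a separate collection (often called a dead-letter output) can be sketched in plain Python. This is only an illustration and does not use the Apache Beam API: `parse_row`, `process_with_dead_letter`, and the sample rows are hypothetical names and data invented for the example, and Python's `try`/`except` stands in for the try...catch block. In an actual Beam pipeline the second collection would be produced with the SDK's additional (tagged) outputs rather than a Python list.

```python
from typing import Callable, Iterable, List, Tuple


def parse_row(row: str) -> dict:
    """Hypothetical transform: parse a 'name,amount' row into a record."""
    name, amount = row.split(",")  # raises ValueError if there is no single comma
    return {"name": name.strip(), "amount": float(amount)}  # raises ValueError on bad number


def process_with_dead_letter(
    rows: Iterable[str],
    transform: Callable[[str], dict],
) -> Tuple[List[dict], List[Tuple[str, str]]]:
    """Apply `transform` to each row, keeping failures instead of crashing.

    Returns (main_output, dead_letter_output). The dead-letter output keeps
    the original row plus the error text, so the row can be reprocessed later.
    """
    main_output: List[dict] = []
    dead_letter_output: List[Tuple[str, str]] = []
    for row in rows:
        try:
            main_output.append(transform(row))
        except Exception as exc:  # broad on purpose: any bad row is diverted
            dead_letter_output.append((row, repr(exc)))
    return main_output, dead_letter_output


# Hypothetical sample input: two valid rows and two malformed ones.
rows = ["alice, 10.5", "bob, not-a-number", "malformed", "carol, 3"]
good, failed = process_with_dead_letter(rows, parse_row)
```

Because the failing rows are preserved verbatim in their own collection, they can be written to a durable sink in a later step and replayed once the parsing logic or the source data is fixed, while the valid rows continue through the pipeline.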