
Ultimate access to all questions.
You are currently testing a Google Cloud Dataflow pipeline designed to ingest and transform text files. These text files are compressed using gzip, and any errors encountered during the process are written to a dead-letter queue. Additionally, this pipeline makes use of SideInputs to join data. However, you have observed that the pipeline is taking longer to finish than anticipated. What steps should you take to expedite the Dataflow job?
A
Switch to compressed Avro files.
B
Reduce the batch size.
C
Retry records that throw an error.
D
Use CoGroupByKey instead of the SideInput.