
Answer-first summary for fast verification
Answer: Set up a Kafka Connect bridge between Kafka and Pub/Sub, then write a Dataflow pipeline that reads the data from Pub/Sub and writes it to BigQuery.
Option D is the correct answer. Setting up a Kafka Connect bridge between Kafka and Pub/Sub, then writing a Dataflow pipeline that reads from Pub/Sub and writes to BigQuery, provides lower latency and higher throughput. Pub/Sub acts as a buffer and automatically handles the scaling and reliability of the streaming data, which reduces the processing burden on the pipeline and minimizes latency compared with having the pipeline read directly from the on-premises Kafka cluster.
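As a rough sketch of the bridge side, the Kafka-to-Pub/Sub hop can be configured with Google's Pub/Sub Kafka sink connector (`CloudPubSubSinkConnector` from the pubsub-kafka-connector project). The connector, project, and topic names below are placeholders, not values from the question:

```properties
# Sink connector that forwards records from an on-prem Kafka topic to Pub/Sub.
# All names and values here are illustrative placeholders.
name=kafka-to-pubsub-bridge
connector.class=com.google.pubsub.kafka.sink.CloudPubSubSinkConnector
tasks.max=4
# On-prem Kafka topic to ingest
topics=events
# Destination project and Pub/Sub topic, read downstream by the Dataflow pipeline
cps.project=my-gcp-project
cps.topic=events-ingest
```

The Dataflow pipeline then subscribes to the Pub/Sub topic and streams the messages into BigQuery, so no pipeline worker ever has to reach back across the interconnect to Kafka.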
Author: LeetQuiz Editorial Team
Your infrastructure team has established an interconnect link between Google Cloud and the on-premises network, providing a direct and secure connection. You are tasked with designing a high-throughput streaming pipeline to ingest real-time data from an Apache Kafka cluster that is hosted within the on-premises environment. The goal is to store this data in BigQuery, Google Cloud's fully managed, serverless data warehouse, while ensuring minimal latency in the data transfer and storage process. What steps should you take to achieve this objective?
A
Set up a Kafka Connect bridge between Kafka and Pub/Sub. Use a Google-provided Dataflow template to read the data from Pub/Sub and write the data to BigQuery.
B
Use a proxy host in the VPC in Google Cloud connecting to Kafka. Write a Dataflow pipeline that reads data from the proxy host and writes the data to BigQuery.
C
Use Dataflow to write a pipeline that reads the data from Kafka and writes the data to BigQuery.
D
Set up a Kafka Connect bridge between Kafka and Pub/Sub. Write a Dataflow pipeline that reads the data from Pub/Sub and writes the data to BigQuery.