
Answer-first summary for fast verification
Answer: Set up a Kafka cluster on GCE VM instances and configure the on-prem cluster to mirror the topics to the GCE cluster. Then use either a Dataproc cluster or Dataflow job to read from Kafka and write to GCS.
The recommended approach is to set up a Kafka cluster on Google Compute Engine (GCE) VM instances and configure the on-prem cluster to mirror its topics to the GCE cluster, typically using Kafka's built-in MirrorMaker tooling. Mirroring replicates the data into the Google Cloud environment without installing any Kafka Connect plugins on the on-prem cluster, which satisfies that constraint directly. Once the data lands in the GCE-based Kafka cluster, a scalable processing service such as Dataproc or Dataflow can read from Kafka and write to Google Cloud Storage (GCS), making it available for analysis in BigQuery or further processing.
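As a rough sketch, the mirroring step could be configured with MirrorMaker 2 (which ships with Apache Kafka). The cluster aliases, bootstrap addresses, and topic pattern below are illustrative placeholders, not values from the question:

```properties
# mm2.properties -- MirrorMaker 2 configuration sketch.
# Aliases and bootstrap servers are hypothetical examples.
clusters = onprem, gce

# On-prem source cluster and GCE destination cluster
onprem.bootstrap.servers = kafka-onprem-1:9092,kafka-onprem-2:9092
gce.bootstrap.servers = kafka-gce-1.c.my-project.internal:9092

# Replicate only in the on-prem -> GCE direction
onprem->gce.enabled = true
gce->onprem.enabled = false

# Mirror the web application log topics (regex on topic names)
onprem->gce.topics = weblogs.*

replication.factor = 3
```

MirrorMaker 2 is started with `bin/connect-mirror-maker.sh mm2.properties` and runs its Kafka Connect machinery inside its own process, so it can be launched on the GCE side; the on-prem brokers are only consumed from, and nothing is installed on them.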
Author: LeetQuiz Editorial Team
How can you replicate web application log data from an on-prem Apache Kafka cluster to Google Cloud for analysis in BigQuery and Cloud Storage, without deploying Kafka Connect plugins?
A
Install the Pub/Sub Kafka connector on the on-prem Kafka cluster and configure Pub/Sub as a Sink connector. Then use a Dataflow job to read from Pub/Sub and write to GCS.
B
Create a Kafka cluster on GCE VM instances with the Pub/Sub Kafka connector configured as a Sink connector. After that, use either a Dataproc cluster or Dataflow job to read from Kafka and write to GCS.
C
Set up a Kafka cluster on GCE VM instances and configure the on-prem cluster to mirror the topics to the GCE cluster. Then use either a Dataproc cluster or Dataflow job to read from Kafka and write to GCS.
D
Install the Pub/Sub Kafka connector on the on-prem Kafka cluster and configure Pub/Sub as a Source connector. Then use a Dataflow job to read from Pub/Sub and write to GCS.