Google Professional Data Engineer

You are responsible for implementing security best practices in your data pipeline system. Currently, you run jobs manually with Project Owner privileges. You want to automate these jobs, which take nightly batch files containing non-public information from Google Cloud Storage, process them with a Spark Scala job on a Google Cloud Dataproc cluster, and write the results to Google BigQuery. What steps should you take to securely and efficiently automate this workload?