Your organization is in the process of updating its on-premises data strategy. The current infrastructure includes:

- Apache Hadoop clusters that process multiple large datasets, using the on-premises Hadoop Distributed File System (HDFS) for data replication.
- Apache Airflow, which orchestrates hundreds of ETL pipelines with thousands of job steps.
You need to establish a new architecture in Google Cloud that can run your existing Hadoop workloads while requiring minimal changes to your current orchestration processes. What should you do?
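For context on what "minimal orchestration changes" looks like in practice: a common pattern for this scenario is running the Hadoop workloads on Dataproc and the existing Airflow DAGs on Cloud Composer (Google's managed Airflow), so that each on-premises Hadoop job step becomes a Dataproc Hadoop job submission. Below is a minimal, hypothetical sketch of the job-spec shape that Dataproc's `jobs.submit` API (and Airflow's `DataprocSubmitJobOperator`) accepts; the project, cluster, and `gs://` paths are illustrative assumptions, not values from the question.

```python
# Sketch: building a Dataproc Hadoop job spec for a lifted-and-shifted
# Hadoop step. HDFS paths are assumed to move to Cloud Storage (gs://).
# All names below are hypothetical examples.

def dataproc_hadoop_job(project_id, cluster_name, main_jar_uri, args):
    """Return a Dataproc Hadoop job spec as a plain dict.

    This dict shape is what the Dataproc jobs.submit API expects and
    what an Airflow DAG would pass to DataprocSubmitJobOperator's
    `job` parameter.
    """
    return {
        "reference": {"project_id": project_id},
        "placement": {"cluster_name": cluster_name},
        "hadoop_job": {
            "main_jar_file_uri": main_jar_uri,
            "args": list(args),
        },
    }

# Hypothetical usage for one migrated ETL step:
job = dataproc_hadoop_job(
    project_id="my-project",                 # assumption
    cluster_name="migrated-hadoop-cluster",  # assumption
    main_jar_uri="gs://my-bucket/etl/wordcount.jar",  # former HDFS jar, now on GCS
    args=["gs://my-bucket/input/", "gs://my-bucket/output/"],
)
print(job["placement"]["cluster_name"])
```

Because the job spec mirrors the original Hadoop invocation (same jar, same arguments, only storage URIs changed), the surrounding Airflow DAG structure can stay largely as-is, which is the point of the "minimal alterations" constraint.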