
You are planning to transition an existing on-premises Hadoop infrastructure to Cloud Dataproc. The current setup mainly utilizes Hive, and the data is stored in Optimized Row Columnar (ORC) format. All ORC files have already been transferred to a Cloud Storage bucket. In order to enhance performance, you need to duplicate certain data into the cluster’s local Hadoop Distributed File System (HDFS). What are two methods to begin working with Hive on Cloud Dataproc? (Choose two.)
A. Run the gsutil utility to transfer all ORC files from the Cloud Storage bucket to HDFS. Mount the Hive tables locally.
B. Run the gsutil utility to transfer all ORC files from the Cloud Storage bucket to any node of the Dataproc cluster. Mount the Hive tables locally.
C. Run the gsutil utility to transfer all ORC files from the Cloud Storage bucket to the master node of the Dataproc cluster. Then run the Hadoop utility to copy them into HDFS. Mount the Hive tables from HDFS (see the shell sketch after the options).
D. Leverage the Cloud Storage connector for Hadoop to mount the ORC files as external Hive tables. Replicate the external Hive tables to native ones (see the HiveQL sketch after the options).
E. Load the ORC files into BigQuery. Leverage the BigQuery connector for Hadoop to mount the BigQuery tables as external Hive tables. Replicate the external Hive tables to native ones.
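A minimal sketch of the flow described in option C, run on the cluster's master node. The bucket name gs://example-bucket, the staging path /tmp/orc, the HDFS warehouse path, and the two-column table schema are all hypothetical stand-ins, not values from the question.

# Pull the ORC files from Cloud Storage onto the master node's local disk.
gsutil -m cp -r gs://example-bucket/orc/orders /tmp/orc/

# Copy them from local disk into HDFS with the Hadoop utility.
hadoop fs -mkdir -p /user/hive/warehouse/orders
hadoop fs -put /tmp/orc/orders/* /user/hive/warehouse/orders/

# Mount a native Hive table over the HDFS location (hypothetical schema).
hive -e "CREATE TABLE orders (order_id BIGINT, amount DOUBLE)
         STORED AS ORC
         LOCATION '/user/hive/warehouse/orders';"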
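And a minimal HiveQL sketch of option D. On Dataproc the Cloud Storage connector is preinstalled, so Hive can read gs:// paths directly; the bucket path and two-column schema below are hypothetical.

-- External table backed by the ORC files still sitting in Cloud Storage.
CREATE EXTERNAL TABLE orders_gcs (order_id BIGINT, amount DOUBLE)
STORED AS ORC
LOCATION 'gs://example-bucket/orc/orders/';

-- Native table stored on the cluster's local HDFS.
CREATE TABLE orders_hdfs (order_id BIGINT, amount DOUBLE)
STORED AS ORC;

-- Replicate the external table into the native one for faster local reads.
INSERT OVERWRITE TABLE orders_hdfs
SELECT * FROM orders_gcs;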