
Answer-first summary for fast verification
Answer: Use a single COPY command to load the data into the Redshift cluster.
Option D is CORRECT. A single COPY command is the most efficient way to load data into Amazon Redshift: the command is optimized for parallel ingestion and can read many files from Amazon S3 at once. Given one COPY command that points at multiple files, Redshift automatically splits the work across the slices of every cluster node, maximizing throughput and minimizing load time. By contrast, issuing multiple concurrent COPY commands (option A) forces Redshift to serialize the loads, and row-by-row INSERT statements (option C) are far slower than a bulk COPY.
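As a sketch of the correct approach, the snippet below builds a single COPY command that targets an S3 prefix covering all input files and issues it through a standard PostgreSQL-style connection (e.g. psycopg2). The bucket name, IAM role ARN, and table name are hypothetical placeholders, not values from the question.

```python
# Sketch: one COPY command lets Redshift divide the files among all
# node slices and load them in parallel. Bucket, role ARN, and table
# names below are hypothetical placeholders.
COPY_SQL = """
COPY sales_fact
FROM 's3://example-bucket/fact-files/'  -- prefix matching every input file
IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftLoadRole'
FORMAT AS CSV
GZIP;
""".strip()

def load_fact_table(conn):
    """Run the single COPY; Redshift parallelizes across files internally."""
    with conn.cursor() as cur:
        cur.execute(COPY_SQL)
    conn.commit()
```

The key design point mirrors the answer: there is exactly one COPY statement, and parallelism comes from Redshift fanning the file list out across slices, not from the client issuing many commands.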
Author: Ritesh Yadav
Question 18/58
A company is using Amazon Redshift to build a data warehouse solution. The company is loading hundreds of files into a fact table that is in a Redshift cluster.
The company wants the data warehouse solution to achieve the greatest possible throughput. The solution must use cluster resources optimally when the company loads data into the fact table.
Which solution will meet these requirements?
A
Use multiple COPY commands to load the data into the Redshift cluster.
B
Use S3DistCp to load multiple files into Hadoop Distributed File System (HDFS). Use an HDFS connector to ingest the data into the Redshift cluster.
C
Use a number of INSERT statements equal to the number of Redshift cluster nodes. Load the data in parallel into each node.
D
Use a single COPY command to load the data into the Redshift cluster.