
Explanation:
Amazon Redshift Spectrum performs best with columnar file formats (such as Parquet or ORC) because it reads only the columns that a query references, which minimizes the amount of data scanned. Partitioning the data on the most common query predicates lets Spectrum skip partitions that cannot match the filter, which further reduces the data scanned, improves performance, and lowers cost. The correct answers are therefore B and C.
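The effect of the two techniques can be illustrated with a rough back-of-the-envelope model. The sketch below is not a Redshift API; the `Dataset` class, the `bytes_scanned` function, and all of the numbers are hypothetical, and the model assumes equally sized columns and partitions while ignoring compression, file splitting, and metadata overhead.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Toy description of a table stored in S3 (all numbers are illustrative)."""
    total_bytes: int      # total size of the data
    num_columns: int      # number of columns, assumed equally sized
    num_partitions: int   # number of partitions, assumed equally sized


def bytes_scanned(ds, columns_needed, partitions_matched, columnar, partitioned):
    """Estimate bytes read by one query under a simple uniform-size model."""
    scanned = ds.total_bytes
    if partitioned:
        # Partition pruning: only partitions matching the predicate are read.
        scanned = scanned * partitions_matched // ds.num_partitions
    if columnar:
        # Column pruning: only the columns the query references are read.
        scanned = scanned * columns_needed // ds.num_columns
    return scanned


# Hypothetical table: 1 TB, 50 columns, one partition per day for a year.
ds = Dataset(total_bytes=1_000_000_000_000, num_columns=50, num_partitions=365)

# Hypothetical query: touches 5 columns and filters on 7 days of data.
row_unpartitioned = bytes_scanned(ds, 5, 7, columnar=False, partitioned=False)
columnar_partitioned = bytes_scanned(ds, 5, 7, columnar=True, partitioned=True)

print("row format, unpartitioned:", row_unpartitioned)
print("columnar, partitioned:    ", columnar_partitioned)
```

Under these assumptions the row-oriented, unpartitioned layout reads the whole table, while the columnar, partitioned layout reads only the matching fraction of partitions and columns. Real savings depend on the actual column sizes, compression, and data distribution.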
Question 42:
A company is building an analytics solution. The solution uses Amazon S3 for data lake storage and Amazon Redshift for a data warehouse. The company wants to use Amazon Redshift Spectrum to query the data that is in Amazon S3. Which actions will provide the FASTEST queries? (Choose two.)
A. Use gzip compression to compress individual files to sizes that are between 1 GB and 5 GB.
B. Use a columnar storage file format.
C. Partition the data based on the most common query predicates.
D. Split the data into files that are less than 10 KB.
E. Use file formats that are not splittable.