
Explanation:
Amazon S3 (EMRFS) provides a highly reliable and cost-effective persistent data store compared to HDFS, separating compute and storage. Graviton instances provide better price performance for Spark workloads on Amazon EMR, making them the most cost-effective compute choice for core and task nodes. Using Spot Instances for primary (master) nodes is not recommended for high reliability because if the primary node is interrupted, the cluster terminates.
Ultimate access to all questions.
Question 4 A company is planning to use a provisioned Amazon EMR cluster that runs Apache Spark jobs to perform big data analysis. The company requires high reliability. A big data team must follow best practices for running cost-optimized and long-running workloads on Amazon EMR. The team must find a solution that will maintain the company's current level of performance. Which combination of resources will meet these requirements MOST cost-effectively? (Select TWO.)
A
Use Hadoop Distributed File System (HDFS) as a persistent data store.
B
Use Amazon S3 as a persistent data store.
C
Use x86-based instances for core nodes and task nodes.
D
Use Graviton instances for core nodes and task nodes.
E
Use Spot Instances for all primary nodes.
No comments yet.