
Answer-first summary for fast verification
Answer: When they are working interactively with a small amount of data
Single-node clusters are ideal for interactive work with small datasets: with only a driver and no worker nodes, they have lower overhead and can start faster than multi-node clusters. The other options describe workloads they do not fit:
- B. Automated reports that must refresh as quickly as possible: multi-node clusters can spread the work across workers for better performance.
- C. SQL within Databricks SQL: queries there run on SQL warehouses, which are typically multi-node for better performance.
- D. Automatic scaling with larger data: a single-node cluster has no workers to add, so it cannot scale out.
- E. Manual reports over large amounts of data: these need distributed processing across multiple nodes.
Single-node clusters are cost-effective for development, testing, and small-scale interactive work where the data fits within the memory of a single machine.
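To make the single-node idea concrete, here is a minimal sketch of a request body of the kind the Databricks Clusters API accepts for a single-node cluster. The helper name `single_node_cluster_spec` and the placeholder runtime and node-type strings are illustrative assumptions, not values from the question. The setting that matters is `num_workers: 0` together with the single-node Spark profile; treat the exact `spark.master` value as an assumption and check it against the current Databricks documentation.

```python
import json


def single_node_cluster_spec(cluster_name, spark_version, node_type_id):
    """Build a cluster request body with a driver and no workers.

    Hypothetical helper for illustration; it only assembles a dict and
    does not call any Databricks endpoint.
    """
    return {
        "cluster_name": cluster_name,
        "spark_version": spark_version,  # placeholder, supply a real runtime version
        "node_type_id": node_type_id,    # placeholder, supply a real node type
        "num_workers": 0,                # no workers: Spark runs on the driver only
        "spark_conf": {
            "spark.databricks.cluster.profile": "singleNode",
            "spark.master": "local[*]",  # assumed value; run Spark locally on the driver
        },
        "custom_tags": {"ResourceClass": "SingleNode"},
    }


spec = single_node_cluster_spec("interactive-dev", "<runtime-version>", "<node-type>")
print(json.dumps(spec, indent=2))
```

Because there are no workers, every Spark task runs on the driver, which is why this setup only makes sense when the data fits comfortably on one machine.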
Author: Keng Suppaseth
Which of the following describes a scenario in which a data engineer will want to use a single-node cluster?
A. When they are working interactively with a small amount of data
B. When they are running automated reports to be refreshed as quickly as possible
C. When they are working with SQL within Databricks SQL
D. When they are concerned about the ability to automatically scale with larger data
E. When they are manually running reports with a large amount of data