In the context of data visualization for machine learning projects, you are tasked with selecting the most effective plot type to visualize the distribution of a single continuous variable. The dataset in question is large, with a wide range of values, and you are particularly interested in understanding the underlying distribution's shape, central tendency, and spread. Given these requirements, which of the following visualization techniques would you choose? (Choose one correct option.)
Explanation:
A histogram is the best choice among these options for visualizing the distribution of a single continuous variable, especially in a large dataset. It divides the value range into bins and displays the number of observations in each bin, which reveals the shape of the distribution (skew, number of peaks), where the values concentrate, and how widely they spread. The picture depends on the bin width, so it is worth trying more than one setting. A box plot (A) summarizes the distribution through the median and quartiles and flags outliers, but it cannot show details of the shape such as multiple peaks. A line plot (B) is suited to trends across an ordered variable such as time, not to distributions. A scatter plot (D) explores the relationship between two variables, not the distribution of a single one.
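The difference between a histogram and a box-plot-style summary can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the synthetic two-peaked sample, the bin count of 12, and the helper names `histogram_counts` and `five_number_summary` are assumptions made for the example, not part of the original question.

```python
import random
import statistics


def histogram_counts(values, num_bins):
    """Return (edges, counts) for equal-width bins spanning min..max of values."""
    lo, hi = min(values), max(values)
    if lo == hi:
        # All values identical: a single degenerate bin holds everything.
        return [lo, hi], [len(values)]
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        # Clamp the index so the maximum value falls in the last bin.
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(num_bins + 1)]
    return edges, counts


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the numbers a box plot is built from."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, median, q3, max(values)


random.seed(0)
# Hypothetical data: two overlapping groups, giving a two-peaked distribution.
sample = [random.gauss(-3, 1) for _ in range(5000)]
sample += [random.gauss(3, 1) for _ in range(5000)]

edges, counts = histogram_counts(sample, num_bins=12)
summary = five_number_summary(sample)

# Text rendering of the histogram: one row per bin, bar length ~ frequency.
scale = max(counts) / 40
for left_edge, count in zip(edges, counts):
    print(f"{left_edge:7.2f} | {'#' * round(count / scale)}")

print("five-number summary:", [round(x, 2) for x in summary])
```

The printed bars show two separate peaks with a dip between them, while the five-number summary alone gives no hint that the data has more than one peak. In practice a plotting library would draw the same bins graphically; the binning logic is the part that matters here.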