
In the context of data visualization for machine learning projects, you are tasked with selecting the most effective plot type to visualize the distribution of a single continuous variable. The dataset in question is large, with a wide range of values, and you are particularly interested in understanding the underlying distribution's shape, central tendency, and spread. Given these requirements, which of the following visualization techniques would you choose? (Choose one correct option.)
A
Box plot: This plot summarizes the distribution through the median and quartiles, and can also highlight outliers. However, because it reduces the data to a handful of summary statistics, it can hide features of the distribution's shape, such as multiple modes.
B
Line plot: While excellent for showing trends over time or across categories, this plot type is not suitable for visualizing the distribution of a single continuous variable.
C
Histogram: By dividing the variable's range into bins and counting the observations per bin, this plot effectively illustrates the distribution's shape, central tendency, and spread, making it ideal for large datasets.
D
Scatter plot: This plot is best used for examining the relationship between two numerical variables, not for visualizing the distribution of a single variable.
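
To make the binning step described in option C concrete, here is a minimal, dependency-free sketch. The helper names `histogram_counts` and `render`, the choice of 20 equal-width bins, and the synthetic log-normal sample are all assumptions made for illustration; they are not part of the question.

```python
import random
import statistics


def histogram_counts(values, num_bins):
    """Count observations in num_bins equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("all values are identical; nothing to bin")
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        # The maximum value lands on the last edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(num_bins + 1)]
    return edges, counts


def render(edges, counts, max_width=40):
    """Draw one text bar per bin, scaled so the tallest bin is max_width wide."""
    peak = max(counts)
    lines = []
    for left, count in zip(edges, counts):
        bar = "#" * round(count / peak * max_width)
        lines.append(f"{left:10.2f} | {bar} {count}")
    return "\n".join(lines)


# Synthetic, right-skewed sample (hypothetical data, for illustration only).
rng = random.Random(0)
sample = [rng.lognormvariate(0.0, 0.75) for _ in range(10_000)]

edges, counts = histogram_counts(sample, num_bins=20)
print(render(edges, counts))
print("median:", round(statistics.median(sample), 3))
```

The printed bars show where the values concentrate, how far they spread, and whether the distribution has a long tail. With matplotlib installed, `plt.hist(sample, bins=20)` would draw comparable counts as a chart. The number of bins is a judgment call: too few bins blur the shape, while too many make it noisy.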