
You are preparing a dataset for machine learning and need to choose the most effective way to visualize its distribution and detect potential outliers. The dataset contains numerical values on varying scales, and the chosen method must also support comparison across different groups within the dataset. Given the need for a method that is computationally efficient and easy to interpret, which of the following visualization techniques would you choose? (Choose one correct option)
A
Scatter plot, as it directly shows the relationship between two variables and can highlight outliers.
B
Line graph, to track changes over intervals or time and identify anomalies.
C
Box plot, to display the distribution of the dataset, identify outliers, and compare distributions across groups.
D
Histogram, to show the frequency distribution of a single variable and detect skewness.
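
For reference, the outlier rule that a box plot conventionally draws (points beyond 1.5 times the interquartile range from the quartiles) can be sketched in a few lines. This is a minimal illustration, not part of the original question: the function names `quartiles` and `box_plot_summary`, the group names, and the sample values are all hypothetical, and the quartiles use simple linear interpolation, which is only one of several common conventions.

```python
def quartiles(values):
    """Return (Q1, median, Q3) using linear interpolation between sorted points."""
    data = sorted(values)

    def percentile(p):
        pos = (len(data) - 1) * p           # fractional index into the sorted data
        lo = int(pos)
        hi = min(lo + 1, len(data) - 1)
        return data[lo] + (data[hi] - data[lo]) * (pos - lo)

    return percentile(0.25), percentile(0.5), percentile(0.75)


def box_plot_summary(values, k=1.5):
    """Compute the statistics a box plot displays, flagging points outside the fences."""
    q1, med, q3 = quartiles(values)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return {"q1": q1, "median": med, "q3": q3, "outliers": outliers}


# Hypothetical groups on very different scales (illustrative data only).
groups = {
    "group_a": [10, 12, 11, 13, 12, 11, 40],
    "group_b": [100, 105, 98, 102, 101, 99, 103],
}

# One summary per group, so the groups can be compared side by side.
summaries = {name: box_plot_summary(vals) for name, vals in groups.items()}
for name, summary in summaries.items():
    print(name, summary)
```

Each group is reduced to a handful of numbers regardless of its size, which is why this kind of summary stays cheap to compute and easy to compare across groups. Plotting libraries such as matplotlib draw the same statistics with `boxplot`, whose default whisker factor is also 1.5.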