
When the output variable is a binary categorical, a common way to evaluate the model is through calculations based on a confusion matrix: a 2 x 2 table that cross-tabulates the predicted against the actual outcomes. The table has four elements: true positives, false negatives, false positives, and true negatives. From these four counts we can derive several performance metrics, the most common of which are described in the statements below (short Python sketches illustrating the calculations follow statement D):
A
The ROC curve plots the true positive rate on the y-axis against the false positive rate on the x-axis, and the points on the curve emerge from varying the decision threshold.
B
The AUC summarizes how effective the model has been in separating the data points into the two classes, with a higher AUC implying better discrimination, and it can be used to compare competing models.
C
One possible application of the ROC curve and the AUC would be in comparing models used to determine whether a loan application should be accepted or rejected.
D
An AUC of 1 would indicate a perfect fit, whereas a value of 0.5 would correspond to an entirely random set of predictions and therefore a model with no predictive ability.
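
As a minimal sketch of the calculations behind the stem, the following Python snippet counts the four confusion-matrix elements for some toy labels and model scores, then derives accuracy, precision, recall (the true positive rate), and the false positive rate. The data and the 0.5 decision threshold are illustrative assumptions, not taken from the question.

```python
# Minimal sketch: confusion-matrix counts and common derived metrics.
# The labels, scores, and 0.5 threshold are illustrative assumptions.

actual = [1, 1, 1, 0, 0, 0, 1, 0]                  # true binary outcomes
scores = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1, 0.8, 0.3]  # model probabilities

threshold = 0.5
predicted = [1 if s >= threshold else 0 for s in scores]

tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)

accuracy  = (tp + tn) / (tp + fn + fp + tn)
precision = tp / (tp + fp)   # of predicted positives, fraction correct
recall    = tp / (tp + fn)   # true positive rate (sensitivity)
fpr       = fp / (fp + tn)   # false positive rate

print(f"TP={tp} FN={fn} FP={fp} TN={tn}")
print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} FPR={fpr:.2f}")
```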
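Statements A, B, and D can likewise be illustrated by sweeping the decision threshold to trace the ROC curve and integrating under it with the trapezoidal rule to obtain the AUC. This is a sketch using the same assumed toy data; in practice one would typically call a library routine such as scikit-learn's roc_auc_score.

```python
# Sketch: trace the ROC curve by varying the decision threshold,
# then estimate the AUC with the trapezoidal rule.
# The labels and scores are illustrative assumptions.

actual = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1, 0.8, 0.3]

def roc_point(threshold):
    """Return (FPR, TPR) at a given decision threshold."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(a == 1 and p == 1 for a, p in zip(actual, pred))
    fn = sum(a == 1 and p == 0 for a, p in zip(actual, pred))
    fp = sum(a == 0 and p == 1 for a, p in zip(actual, pred))
    tn = sum(a == 0 and p == 0 for a, p in zip(actual, pred))
    return fp / (fp + tn), tp / (tp + fn)

# Sweep thresholds from the highest score downward so the curve
# runs from (0, 0) to (1, 1).
thresholds = sorted(set(scores), reverse=True)
points = [(0.0, 0.0)] + [roc_point(t) for t in thresholds]

# Trapezoidal rule: sum the areas of trapezoids under the curve.
auc = sum((x2 - x1) * (y1 + y2) / 2
          for (x1, y1), (x2, y2) in zip(points, points[1:]))

print(points)              # (FPR, TPR) pairs traced by the sweep
print(f"AUC = {auc:.3f}")  # 1.0 is a perfect fit, 0.5 is random
```

A perfectly random model would produce an ROC curve hugging the diagonal from (0, 0) to (1, 1), giving an AUC of about 0.5, which is the benchmark against which statement D's interpretation is read.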