Explanation
For classification problems such as identifying the material type shown in an image, the confusion matrix is the most appropriate tool for evaluating model performance.
Why the Confusion Matrix is Correct:
- Classification Problem: The task involves classifying images into different material types (categorical classes), which is a classification problem.
- Comprehensive Evaluation: A confusion matrix provides a complete picture of classification performance by showing, for each class:
  - True Positives (TP)
  - True Negatives (TN)
  - False Positives (FP)
  - False Negatives (FN)
- Derived Metrics: From a confusion matrix, you can calculate the standard performance metrics (see the sketch after this list):
  - Accuracy
  - Precision
  - Recall (Sensitivity)
  - F1-Score
  - Specificity
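A minimal sketch of this workflow, assuming scikit-learn is available; the material-type labels and predictions below are purely illustrative, not taken from the original question:

```python
# Hypothetical example: confusion matrix and derived metrics for a
# 3-class material-type classifier (0 = metal, 1 = plastic, 2 = glass).
from sklearn.metrics import (
    confusion_matrix,
    accuracy_score,
    precision_score,
    recall_score,
    f1_score,
)

# Illustrative ground-truth labels and model predictions
y_true = [0, 1, 2, 2, 1, 0, 1, 2]
y_pred = [0, 1, 2, 1, 1, 0, 2, 2]

# Rows are actual classes, columns are predicted classes
print(confusion_matrix(y_true, y_pred))

# Metrics derived from the same predictions (macro-averaged over classes)
print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred, average="macro"))
print("Recall   :", recall_score(y_true, y_pred, average="macro"))
print("F1-score :", f1_score(y_true, y_pred, average="macro"))
```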
Why Other Options are Incorrect:
- B. Correlation matrix: Measures relationships between variables; it does not evaluate classification predictions.
- C. R² score: Used for regression problems to measure how well the model explains variance in a continuous target.
- D. Mean squared error (MSE): Used for regression problems to measure the average squared difference between predicted and actual continuous values (see the contrast sketch after this list).
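For contrast, a minimal sketch (again assuming scikit-learn, with made-up numbers) of where R² and MSE do apply: continuous regression targets, not categorical class labels:

```python
# Hypothetical regression example: continuous targets (e.g., a measured
# material property) are the natural fit for MSE and R², unlike class labels.
from sklearn.metrics import mean_squared_error, r2_score

y_true = [2.5, 0.0, 2.1, 7.8]
y_pred = [3.0, -0.5, 2.0, 8.0]

print("MSE:", mean_squared_error(y_true, y_pred))
print("R² :", r2_score(y_true, y_pred))
```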
Key Takeaway:
For classification models, the confusion matrix and its derived metrics (precision, recall, F1-score) are the standard evaluation tools, while regression metrics (R², MSE) are inappropriate for categorical classification tasks.