

Confusion Matrix

What is the Confusion Matrix?

A confusion matrix is a table used to measure the performance of a machine learning classification model. It cross-tabulates the model's predictions against the classes to which the records actually belong, so it shows both how many records the model categorized correctly and where its errors fall. Metrics such as accuracy, precision, and recall can be computed directly from its cells, and repeating the tabulation at different decision thresholds yields the points of a ROC curve. In the layout described here, the matrix rows represent the actual labels in the labeled dataset being evaluated, and the matrix columns represent the predicted labels.


Examples of Confusion Matrix

A confusion matrix helps measure performance when an algorithm's output can be classified as positive or negative (yes or no). In this binary case the table has four cells, each of which represents a unique combination of predicted and actual values. The four possible results are as follows:

  • True Positive (TP): A positive prediction was made and the actual value was positive. The proportion of actual positives that are true positives, TP/(TP+FN), is called sensitivity.

  • True Negative (TN): A negative prediction was made and the actual value was negative. The proportion of actual negatives that are true negatives, TN/(TN+FP), is called specificity.

  • False Positive (FP): Although the prediction was positive, the actual value was negative. It is frequently referred to as a Type I error.

  • False Negative (FN): Although the prediction was negative, the actual value was positive. It is sometimes referred to as a Type II error.
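
The four outcomes can be tallied with a few lines of code. The following is a minimal sketch in plain Python, assuming binary labels where 1 means positive; the function name confusion_counts and the sample labels are hypothetical and chosen only for illustration.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (tp, tn, fp, fn) for two equal-length lists of binary labels."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted positive, actually positive
        elif p != positive and a != positive:
            tn += 1  # predicted negative, actually negative
        elif p == positive and a != positive:
            fp += 1  # predicted positive, actually negative (Type I error)
        else:
            fn += 1  # predicted negative, actually positive (Type II error)
    return tp, tn, fp, fn


# Hypothetical labels for ten records (1 = positive, 0 = negative).
actual = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]

tp, tn, fp, fn = confusion_counts(actual, predicted)
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")  # TP=3 TN=5 FP=1 FN=1
```

The four counts always sum to the number of records, which is a quick sanity check on any confusion matrix.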


Why is the Confusion Matrix Important?

Confusion matrices reveal when a model consistently confuses two classes, making it easier to judge how reliable the model's results are likely to be. They summarize the effectiveness of a classification model and enable business users to identify which data their model might be unable to categorize accurately. This knowledge is important when applying insights or predictions from the model to real-world business decisions.

For instance, the consequences differ significantly between a model predicting that a credit opportunity will default when it would not have (a false positive) and a model predicting no default for a loan that the lender then advances and that does default (a false negative). If the confusion matrix shows that the model is likely to produce false negatives on the loan dataset, the user should tune the current model manually or use an alternative one.


How is the Confusion Matrix Used?

A confusion matrix is used to evaluate the performance of classification models. The four primary metrics derived from it are described below.

Accuracy: It is the most commonly used metric for evaluating a machine learning model, but it can mislead on imbalanced data. For instance, if 70% of examples are negative and just 30% are positive, a model that always predicts negative will have an accuracy score of 70% without identifying a single positive. Accuracy is calculated as (TP+TN)/(TP+FP+FN+TN).

Precision: It is defined as the ratio of true positives to all positives predicted by the machine learning model. Precision is expressed as TP/(TP+FP). It estimates the likelihood that a positive prediction is correct.

Recall: Recall, also called sensitivity, is the ratio of true positives to the number of actual positive outcomes. The recall formula is TP/(TP+FN). It measures the model's ability to find the actual positives in the input.

F1 Score: The F1 score is the harmonic mean of precision and recall. It serves as a single indicator that combines the two. Because it accounts for both false positives and false negatives, it is well suited to imbalanced datasets. It can be calculated using the formula 2(p*r)/(p+r), where p stands for precision and r for recall.
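
The four formulas above can be applied directly to the cell counts. The sketch below is one possible way to do so in plain Python; the function name classification_metrics, the sample counts, and the choice to return 0.0 when a denominator is zero are assumptions made for illustration.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    Assumption: any ratio with a zero denominator is reported as 0.0.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * (precision * recall) / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical model: 3 TP, 5 TN, 1 FP, 1 FN.
model = classification_metrics(tp=3, tn=5, fp=1, fn=1)
print(model)

# Hypothetical baseline that always predicts negative on a dataset with
# 70 negatives and 30 positives: accuracy is 0.7, but recall and F1 are 0.
baseline = classification_metrics(tp=0, tn=70, fp=0, fn=30)
print(baseline)
```

The baseline case shows why accuracy should be read alongside precision, recall, and F1 when one class is much more common than the other.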


Confusion Matrix vs Other Technologies & Methodologies

Confusion matrix vs correlation matrix

A confusion matrix is a specific table layout that allows visualization of the performance of an algorithm, typically a supervised learning algorithm. A correlation matrix is a table showing correlation coefficients between variables.

Confusion matrix vs cost matrix

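A confusion matrix records how many predictions of each kind were correct or incorrect, and accuracy, the ratio of correct predictions to the total number of predictions, is derived from it. A cost matrix specifies the relative cost of each kind of outcome, so that some errors can be weighted more heavily than others.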
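
To illustrate how the two fit together, the sketch below multiplies each confusion-matrix count by a per-outcome cost. The cost values (a false negative costing five times a false positive), the two sets of counts, and the function name total_cost are hypothetical and would need to come from the business problem in practice.

```python
def total_cost(counts, costs):
    """Sum of count * cost over the four outcomes (TP, TN, FP, FN)."""
    return sum(counts[outcome] * costs[outcome] for outcome in counts)


# Hypothetical cost matrix: correct predictions are free,
# a false negative is five times as costly as a false positive.
costs = {"TP": 0.0, "TN": 0.0, "FP": 1.0, "FN": 5.0}

# Two hypothetical models with the same accuracy (8 correct out of 10).
model_a = {"TP": 3, "TN": 5, "FP": 1, "FN": 1}
model_b = {"TP": 4, "TN": 4, "FP": 2, "FN": 0}

print(total_cost(model_a, costs))  # 6.0
print(total_cost(model_b, costs))  # 2.0
```

Both models are equally accurate, but under these assumed costs the second is cheaper because it makes no false negatives.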

Confusion matrix vs AUC

AUC is a single number that shows how successful a model is at separating positive and negative classes across all thresholds. A confusion matrix is not itself a single metric; rather, it provides insight into the individual predictions at one threshold.
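
The relationship can be sketched in code: each threshold produces one confusion matrix, each matrix contributes one (false positive rate, true positive rate) point, and AUC is the area under the curve through those points. The example below is a plain-Python illustration, assuming labels of 0 and 1, at least one record of each class, and a score at or above the threshold counting as a positive prediction; the function names and sample scores are hypothetical.

```python
def roc_points(actual, scores):
    """Return (fpr, tpr) points, one per distinct threshold, highest first.

    Assumes labels are 0/1 and that both classes are present.
    """
    positives = sum(actual)
    negatives = len(actual) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a == 1)
        fp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a list of (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical labels and predicted probabilities for six records.
actual = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(actual, scores)
auc = area_under_curve(points)
print(round(auc, 4))  # 0.8889
```

In this sample, 8 of the 9 possible positive/negative pairs are ranked correctly, which matches the computed area of 8/9.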



Confusion Matrix Terms to Understand 

Precision - The fraction of positive predictions that are actually positive. Put another way, it shows how many of the model's positive outputs were correct out of all the positive predictions it made. It indicates how trustworthy a positive prediction is. It is most useful in situations where a false positive is costlier than a false negative.

Recall - The fraction of actual positive values that the model correctly predicted as positive.

Accuracy - The ratio of the number of correct predictions made by the classifier to the total number of predictions it made. It explains how frequently the model predicts the correct output.

F-Measure - When one model has low precision but high recall and another the reverse, it is hard to compare them. To get around this, an F-score can be used, which evaluates recall and precision simultaneously.

Null error rate - The null error rate indicates how frequently a model would be incorrect if it always predicted the majority class. It serves as a baseline against which a classifier can be compared.

Receiver Operating Characteristic (ROC) Curve - This graph shows the classifier’s performance across all possible thresholds. It plots the true positive rate on the y-axis against the false positive rate on the x-axis.

Area Under the Curve (AUC) - It measures a binary classification model’s ability to distinguish between the classes. The higher the AUC, the more likely it is that the model assigns a higher predicted probability to an actual positive value than to an actual negative value.

Misclassification rate - Also called the error rate, it explains how often the model gives wrong predictions. It is calculated as the ratio of the number of incorrect predictions to the total number of predictions made by the classifier.

Cohen’s Kappa - This shows how well the classifier did in comparison to how well it would have done by chance alone. In other words, a model will have a high Kappa score if its accuracy is substantially higher than the agreement expected by chance.
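
Several of the terms above can be computed from the same four counts. The sketch below is a plain-Python illustration for the binary case, using the standard definition of Cohen's Kappa as (observed agreement - expected agreement) / (1 - expected agreement); the function name baseline_and_kappa, the sample counts, and the decision to return 0.0 when the expected agreement equals 1 are assumptions.

```python
def baseline_and_kappa(tp, tn, fp, fn):
    """Return misclassification rate, null error rate and Cohen's Kappa.

    Assumes a binary problem with at least one record.
    """
    n = tp + tn + fp + fn
    actual_pos, actual_neg = tp + fn, tn + fp
    pred_pos, pred_neg = tp + fp, tn + fn

    accuracy = (tp + tn) / n
    misclassification_rate = (fp + fn) / n

    # Error rate of always predicting the majority class:
    # every minority-class record would be wrong.
    null_error_rate = min(actual_pos, actual_neg) / n

    # Agreement expected by chance, from the row and column totals.
    expected = (actual_pos * pred_pos + actual_neg * pred_neg) / (n * n)
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0

    return {"misclassification_rate": misclassification_rate,
            "null_error_rate": null_error_rate,
            "kappa": kappa}


# Hypothetical counts: 3 TP, 5 TN, 1 FP, 1 FN.
result = baseline_and_kappa(tp=3, tn=5, fp=1, fn=1)
print(result)
```

With these counts the classifier is wrong 20% of the time, whereas always predicting the majority class would be wrong 40% of the time, and the Kappa score is about 0.58.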


Benefits of Confusion Matrix

The following are ways a confusion matrix can be beneficial.

  • It details the classifier’s errors as well as the kinds of errors that are occurring.

  • It shows the ways in which a classification model is confused when it makes predictions.

  • It helps overcome the drawbacks of relying solely on classification accuracy.

  • It is especially useful in situations where one class dominates the others and the classification problem is heavily imbalanced.

  • Recall, precision, specificity, and accuracy can all be calculated directly from the confusion matrix, and the AUC-ROC curve can be built from confusion matrices computed at different thresholds.


A confusion matrix is an effective tool for evaluating a classification model. For the data that is fed into the model, it gives precise insight into which classes have been correctly or incorrectly classified.


H2O Driverless AI includes classification metric plots. The Confusion Matrix is one of the included metric plots. In the Confusion Matrix graph, the threshold value defaults to 0.5. For binary classification experiments, users can specify a different threshold value. The threshold selector is available after clicking on the Confusion Matrix and opening the enlarged view. When you specify a value or change the slider value, Driverless AI automatically computes a diagnostic Confusion Matrix for that given threshold value.
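
As a rough illustration of why the threshold matters, the sketch below recomputes the four counts for the same predicted probabilities at two different thresholds. It is plain Python, not Driverless AI's implementation or API; the function name confusion_at_threshold, the sample scores, and the rule that a score at or above the threshold counts as a positive prediction are all assumptions.

```python
def confusion_at_threshold(actual, scores, threshold=0.5):
    """Return counts of TP, TN, FP, FN after thresholding the scores.

    Assumption: a score >= threshold is treated as a positive prediction.
    """
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, s in zip(actual, scores):
        predicted_positive = s >= threshold
        if predicted_positive and a == 1:
            counts["TP"] += 1
        elif predicted_positive and a == 0:
            counts["FP"] += 1
        elif not predicted_positive and a == 0:
            counts["TN"] += 1
        else:
            counts["FN"] += 1
    return counts


# Hypothetical labels and predicted probabilities for six records.
actual = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

at_default = confusion_at_threshold(actual, scores, threshold=0.5)
at_strict = confusion_at_threshold(actual, scores, threshold=0.75)
print(at_default)  # {'TP': 3, 'TN': 2, 'FP': 1, 'FN': 0}
print(at_strict)   # {'TP': 2, 'TN': 3, 'FP': 0, 'FN': 1}
```

Raising the threshold in this sample removes the false positive at the price of one false negative, which is the kind of trade-off a threshold selector lets a user explore.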