
Confusion Matrix

What is a Confusion Matrix?

A confusion matrix is a performance measurement technique for machine learning classification. The counts it contains can be used to compute metrics such as recall, precision, and accuracy, and, when tabulated across many thresholds, the ROC curve that underlies AUC. If you train a classification model on a dataset, the resulting confusion matrix shows how accurately the model categorized each record and where it made errors. The matrix rows represent the actual labels contained in the training dataset, and the matrix columns represent the model's predicted labels.
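As a quick illustration, scikit-learn's confusion_matrix function follows the same convention (actual labels in rows, predicted labels in columns). The toy labels below are invented for the example.

    # A minimal sketch using scikit-learn; the labels are made up for illustration.
    from sklearn.metrics import confusion_matrix

    y_true = [0, 1, 0, 1, 1, 0, 1, 0]  # actual labels from the dataset
    y_pred = [0, 1, 0, 0, 1, 1, 1, 0]  # labels predicted by the model

    # Rows are actual labels, columns are predicted labels.
    print(confusion_matrix(y_true, y_pred))
    # [[3 1]
    #  [1 3]]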

Examples of a Confusion Matrix

Below are definitions of the terms true positive, true negative, false positive, and false negative, followed by a sketch that counts each outcome.

  • True positive: You predicted a positive outcome and the prediction is true.
  • True negative: You predicted a negative outcome and the prediction is true.
  • False positive: You predicted a positive outcome and the prediction is false.
  • False negative: You predicted a negative outcome and the prediction is false.
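A minimal sketch that tallies the four outcomes by hand; the y_true and y_pred lists are made up for illustration.

    y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # actual labels
    y_pred = [1, 0, 0, 1, 1, 0, 1, 0]  # predicted labels

    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # predicted positive, actually positive
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # predicted negative, actually negative
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # predicted positive, actually negative
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # predicted negative, actually positive

    print(f"TP={tp} TN={tn} FP={fp} FN={fn}")  # TP=3 TN=3 FP=1 FN=1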

Why is a Confusion Matrix Important?

A confusion matrix can help determine how accurate a model's outcomes are likely to be by exposing cases where the model repeatedly confuses two classes. It evaluates the performance of a classification model, allowing the user to determine which data the model may be unable to classify correctly.

Confusion Matrix FAQs

What is a confusion matrix for binary classification?

A confusion matrix for binary classification is a two-by-two table formed by counting how often each of the four possible outcomes of a binary classifier occurs. These outcomes are usually denoted true positive, false positive, true negative, and false negative.
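From those four counts you can derive the metrics mentioned earlier. A minimal sketch, with the counts assumed for illustration:

    # All counts below are assumed for illustration.
    tp, tn, fp, fn = 3, 3, 1, 1

    accuracy = (tp + tn) / (tp + tn + fp + fn)  # fraction of all predictions that are correct
    precision = tp / (tp + fp)                  # fraction of predicted positives that are correct
    recall = tp / (tp + fn)                     # fraction of actual positives that are found

    print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
    # accuracy=0.75 precision=0.75 recall=0.75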

What is TP TN FP FN?

These acronyms stand for true positive (TP), true negative (TN), false positive (FP), and false negative (FN).

Is a confusion matrix only for binary classification?

No. A confusion matrix is commonly used for binary classification, but it also applies to targets that can take more than two values, such as High/Medium/Low or Cat/Lion/Tiger.
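In the multi-class case the matrix simply gains one row and one column per class. A minimal sketch with invented labels:

    from sklearn.metrics import confusion_matrix

    labels = ["Low", "Medium", "High"]
    y_true = ["Low", "Medium", "High", "Medium", "Low", "High"]
    y_pred = ["Low", "High", "High", "Medium", "Low", "Medium"]

    # One row and one column per class, in the order given by `labels`.
    print(confusion_matrix(y_true, y_pred, labels=labels))
    # [[2 0 0]
    #  [0 1 1]
    #  [0 1 1]]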

H2O.ai and Confusion Matrix

In H2O, the actual results display in the rows and the predictions display in the columns; correct predictions are highlighted in yellow. In the example below, 0 was predicted correctly 902 times, while 8 was predicted correctly 822 times and 0 was predicted as 4 once.

[Figure: Training Metrics - Confusion Matrix]

H2O Driverless AI includes classification metric plots, one of which is the Confusion Matrix. In the Confusion Matrix graph, the threshold value defaults to 0.5. For binary classification experiments, users can specify a different threshold value; the threshold selector becomes available after clicking on the Confusion Matrix to open the enlarged view. When you specify a value or change the slider value, Driverless AI automatically computes a diagnostic Confusion Matrix for that threshold.
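To see why the threshold matters, here is a minimal sketch (not the Driverless AI API) that recomputes the matrix at two thresholds over made-up probabilities.

    from sklearn.metrics import confusion_matrix

    y_true = [0, 0, 1, 1, 1, 0]
    y_prob = [0.2, 0.6, 0.4, 0.8, 0.7, 0.3]  # predicted probability of the positive class

    for threshold in (0.5, 0.35):
        # A record is predicted positive when its probability clears the threshold.
        y_pred = [int(p >= threshold) for p in y_prob]
        print(threshold, confusion_matrix(y_true, y_pred).tolist())
    # 0.5 [[2, 1], [1, 2]]
    # 0.35 [[2, 1], [0, 3]]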

Confusion Matrix vs Other Technologies & Methodologies

Confusion matrix vs correlation matrix

A confusion matrix is a specific table layout that allows visualization of the performance of an algorithm, typically a supervised learning algorithm. A correlation matrix is a table showing correlation coefficients between variables.
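To make the contrast concrete, a correlation matrix is computed from raw variable values and involves no predictions at all. A minimal sketch with invented numbers:

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0])
    y = np.array([2.1, 3.9, 6.2, 8.1])

    # Each cell is the correlation coefficient between a pair of variables;
    # off-diagonal values near 1 indicate strong positive linear correlation.
    print(np.corrcoef(x, y))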

Confusion matrix vs cost matrix

A confusion matrix counts prediction outcomes, from which accuracy, the ratio of correct predictions to the total number of predictions, is computed. A cost matrix is used to specify the relative cost of each kind of misclassification, so that some errors can be penalized more heavily than others.
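The two can be combined: element-wise multiplying a confusion matrix by a cost matrix gives a total misclassification cost. A minimal sketch, with all counts and costs assumed:

    import numpy as np

    cm = np.array([[50, 10],   # rows: actual class, columns: predicted class
                   [ 5, 35]])

    # Assumed costs: a false negative (actual 1, predicted 0) is five times
    # as costly as a false positive; correct predictions cost nothing.
    cost = np.array([[0, 1],
                     [5, 0]])

    total_cost = (cm * cost).sum()  # element-wise product, then sum
    print(total_cost)  # 10*1 + 5*5 = 35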

Confusion matrix vs AUC

AUC measures how well a model separates the positive and negative classes across all possible thresholds. A confusion matrix is not a single summary metric; rather, it provides insight into the individual predictions at one chosen threshold.
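A minimal sketch of the difference, reusing the made-up scores from the threshold example above: AUC needs only the raw probabilities, with no threshold chosen.

    from sklearn.metrics import roc_auc_score

    y_true = [0, 0, 1, 1, 1, 0]
    y_prob = [0.2, 0.6, 0.4, 0.8, 0.7, 0.3]

    # AUC summarizes ranking quality over all thresholds, whereas a
    # confusion matrix describes predictions at one fixed threshold.
    print(roc_auc_score(y_true, y_prob))  # 0.888...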
