AUC-ROC

What is AUC-ROC?

AUC-ROC is a performance metric used in machine learning to evaluate binary classification models. The full form of AUC-ROC is Area Under the Receiver Operating Characteristic Curve. This metric measures a model's ability to distinguish between the positive and negative classes. The ROC curve plots the True Positive Rate (TPR) against the False Positive Rate (FPR) at different classification thresholds, and AUC-ROC is the area under that curve. The metric ranges from 0 to 1: a value of 0.5 corresponds to random guessing, values close to 1 indicate a model that separates the classes well, and values below 0.5 indicate a model whose predictions are systematically inverted.

How AUC-ROC Works

When we build a binary classification model, it produces a probability between 0 and 1 for each observation. The threshold for classifying an observation as positive or negative is usually set to 0.5, which means that any observation with a probability greater than 0.5 is classified as positive, and any observation with a probability less than or equal to 0.5 is classified as negative. However, this threshold can be adjusted to change the tradeoff between false positives and false negatives based on the specific needs of the business.
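As a rough sketch of this idea, the snippet below (using scikit-learn on a synthetic dataset, purely for illustration; the model and variable names are assumptions, not part of any specific system) shows how the same predicted probabilities yield different labels as the threshold moves:

    # Minimal sketch: turning predicted probabilities into class labels at
    # different thresholds (synthetic data and logistic regression are
    # placeholders for any classifier that outputs probabilities).
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=1000, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    proba = model.predict_proba(X_test)[:, 1]  # probability of the positive class

    pred_default = (proba > 0.5).astype(int)  # default threshold
    pred_strict = (proba > 0.8).astype(int)   # fewer positives: fewer false positives, more false negatives
    pred_lenient = (proba > 0.2).astype(int)  # more positives: more false positives, fewer false negatives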
For each threshold, we plot the true positive rate (TPR = TP / (TP + FN)) against the false positive rate (FPR = FP / (FP + TN)); the resulting curve is called the receiver operating characteristic (ROC) curve. AUC-ROC is the area under this curve, which ranges from 0 to 1 and provides a single number summarizing the model's performance across all possible thresholds.
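A minimal sketch of computing the ROC curve and its area with scikit-learn, reusing the fitted model, test labels, and predicted probabilities from the snippet above (those names are illustrative assumptions):

    from sklearn.metrics import roc_curve, roc_auc_score

    fpr, tpr, thresholds = roc_curve(y_test, proba)  # one (FPR, TPR) point per threshold
    auc_roc = roc_auc_score(y_test, proba)           # area under that curve, between 0 and 1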

Terms closely related to AUC-ROC

Key terms that are closely related to AUC-ROC include:

  • Receiver Operating Characteristic (ROC) curve

  • True Positive Rate (TPR)

  • False Positive Rate (FPR)

  • Precision

  • Recall

AUC-ROC vs PR curve

Another performance metric used in machine learning for binary classification models is the Precision-Recall (PR) curve. While AUC-ROC works well for models with balanced classes, the PR curve is more informative when the class distribution is highly imbalanced, because precision and recall both focus on the positive class and ignore true negatives. The PR curve plots precision (positive predictive value) against recall (sensitivity) at different classification thresholds, and the area under the PR curve (AUC-PR) summarizes the model's performance across all possible thresholds.
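A minimal sketch of the PR curve and its area, again assuming the y_test and proba variables from the earlier snippets:

    from sklearn.metrics import precision_recall_curve, auc, average_precision_score

    precision, recall, pr_thresholds = precision_recall_curve(y_test, proba)
    auc_pr = auc(recall, precision)                         # area under the PR curve
    avg_precision = average_precision_score(y_test, proba)  # closely related single-number summary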

When to use AUC-ROC and when to use accuracy?

Accuracy is a good general-purpose measure for evaluating classification models. However, when the class distribution is imbalanced, AUC-ROC is usually the better performance metric. Accuracy can be misleading in such cases: for example, a model that always predicts the majority class achieves 95% accuracy on a 95/5 split while being useless on the minority class. AUC-ROC is more robust because it estimates the model's ability to rank positives above negatives across all possible classification thresholds. A short illustration of this pitfall follows.
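The sketch below (synthetic 95/5 class split; a baseline that always predicts the majority class, purely for illustration) shows accuracy looking strong while AUC-ROC reveals that the model has no discriminative power:

    from sklearn.datasets import make_classification
    from sklearn.dummy import DummyClassifier
    from sklearn.metrics import accuracy_score, roc_auc_score
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

    baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
    acc = accuracy_score(y_test, baseline.predict(X_test))                # around 0.95, looks good
    auc_val = roc_auc_score(y_test, baseline.predict_proba(X_test)[:, 1])  # 0.5, i.e. no better than chance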