
H2O.ai WIKI

Classification

What is Classification?

Classification is one of the primary uses of data science and machine learning. It is the systematic grouping of observations into categories, as when scientists sort plants and animals into taxonomic groups.

Examples of Classification

Below are four types of classification, with brief examples.

Binary Classification: refers to classification tasks that have two class labels, such as:

  • Email spam detection: spam or not
  • Churn prediction: churn or not

Multi-Class Classification: refers to classification tasks that have more than two class labels, such as:

  • Plant species classification
  • Face classification

Multi-Label Classification: refers to classification tasks that have two or more class labels, where one or more class labels may be predicted for each example.
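The difference between binary, multi-class, and multi-label tasks shows up in the shape of the target. The following is a minimal sketch with made-up labels; the helper `to_indicator` and all label names are illustrative assumptions, not part of any H2O API.

```python
# Hypothetical targets showing how the label shape differs by task type.

# Binary: each example has exactly one of two labels.
binary_targets = ["spam", "not spam", "spam"]

# Multi-class: each example has exactly one label out of more than two.
multiclass_targets = ["setosa", "versicolor", "virginica"]

# Multi-label: each example may carry one or more labels at once.
multilabel_targets = [
    {"beach", "sunset"},
    {"mountain"},
    {"beach", "mountain", "sunset"},
]


def to_indicator(label_sets, classes):
    """Encode each set of labels as a 0/1 vector, one column per class."""
    return [[1 if c in labels else 0 for c in classes] for labels in label_sets]


classes = ["beach", "mountain", "sunset"]
indicator = to_indicator(multilabel_targets, classes)
for row in indicator:
    print(row)
```

A multi-label model is then typically asked to predict one 0/1 value per column, rather than a single label per example.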

Imbalanced Classification: refers to classification tasks where the examples are unequally distributed across the classes, such as:

  • Medical diagnostic tests
  • Fraud detection
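To make the imbalance concrete, the sketch below measures the class distribution of a hypothetical fraud data set and derives per-class weights. The weighting rule used here (number of examples divided by number of classes times the class count) is one common heuristic and an assumption of this example, not something the text above prescribes.

```python
from collections import Counter


def class_distribution(labels):
    """Return the fraction of examples that fall in each class."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency (a common heuristic)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Hypothetical data: 98 legitimate transactions and 2 fraudulent ones.
labels = ["legit"] * 98 + ["fraud"] * 2

distribution = class_distribution(labels)
weights = balanced_class_weights(labels)

print(distribution)
print(weights)
```

With a split like this, a model that always answers "legit" would be right 98% of the time while never detecting fraud, which is why accuracy alone can mislead on imbalanced tasks.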

Why is Classification Important?

Machine learning classification has many business applications. Take bank loans and the likelihood that a borrower will default. To estimate the chance of default, the bank determines which of two classes the applicant most resembles, based on shared characteristics: the defaulter class or the non-defaulter class. This classification helps the bank understand how likely the person is to default on the loan and adjust its risk assessment accordingly.

Classification FAQs

How many types of classification are there?

There can be many. However, four common types of classification are:

  • Binary Classification
  • Multi-Class Classification
  • Multi-Label Classification
  • Imbalanced Classification

What is classification in machine learning?

Machine learning classification refers to a predictive modeling problem where a class label is predicted for a given example of input data.

What is data science classification?

Classification in data science refers to a process that tags and categorizes any kind of data so that it can be better understood and later analyzed.

What are classification algorithms?

A classification algorithm is a function that maps input features to a class label. A linear classifier, for example, weighs the input features so that the output is positive for one class and negative for the other.
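The weighting described above corresponds to a linear classifier. The sketch below is a minimal illustration under that assumption; the weights, bias, and feature values are made up, and other families of algorithms (decision trees, for instance) reach a class label in different ways.

```python
def decision_score(features, weights, bias):
    """Weighted sum of the input features plus a bias term."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def classify(features, weights, bias):
    """Assign the positive class when the score is above zero."""
    return "positive" if decision_score(features, weights, bias) > 0 else "negative"


# Hypothetical, hand-picked parameters; a real model would learn these from data.
weights = [1.5, -2.0]
bias = -0.5

example_a = [2.0, 0.5]
example_b = [0.2, 1.0]

print(classify(example_a, weights, bias))
print(classify(example_b, weights, bias))
```

Training a linear classifier amounts to choosing the weights and bias so that the sign of the score matches the known labels as often as possible.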

H2O.ai and Classification

H2O-3 calculates regression metrics for classification problems. Additional evaluation metrics are also available for classification models.

H2O Driverless AI is currently targeting common regression, binomial classification, and multinomial classification applications including loss-given-default, probability of default, customer churn, campaign response, fraud detection, anti-money-laundering, and predictive asset maintenance models.


Classification vs Other Methodologies

Classification vs regression

Classification is the task of predicting a discrete class label. Regression is the task of predicting a continuous quantity.
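The distinction lies in the type of the target rather than in the inputs. As a rough illustration, the sketch below applies the same one-nearest-neighbor rule to a discrete target and to a continuous one. The data, the variable names, and the choice of nearest neighbor are all assumptions made for this example.

```python
def nearest_index(query, points):
    """Index of the training point closest to the query (1-D distance)."""
    return min(range(len(points)), key=lambda i: abs(points[i] - query))


def predict(query, points, targets):
    """Return the target of the nearest training point."""
    return targets[nearest_index(query, points)]


# Hypothetical training data: annual income in thousands.
incomes = [20.0, 35.0, 60.0, 90.0]

# Classification target: a discrete class label.
defaulted = ["yes", "yes", "no", "no"]

# Regression target: a continuous quantity (loss as a percentage of the loan).
loss_percent = [5.2, 3.1, 0.4, 0.0]

predicted_label = predict(40.0, incomes, defaulted)
predicted_value = predict(40.0, incomes, loss_percent)

print(predicted_label)  # a class label
print(predicted_value)  # a number
```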

Classification vs clustering

Classification assigns objects to predefined classes. Clustering identifies similarities between objects and groups them by the characteristics they share, which also differentiate them from other groups of objects. These groups are known as clusters.
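The contrast can be sketched on one-dimensional data. In the classification half, the class names come with the data; in the clustering half, the points are unlabeled and the groups emerge from the data. Nearest-centroid classification and a small k-means loop are used here purely as illustrative stand-ins, and the data and starting centres are made up.

```python
def mean(values):
    return sum(values) / len(values)


def nearest_centroid_classify(x, class_centres):
    """Classification: pick the predefined class whose centre is closest."""
    return min(class_centres, key=lambda label: abs(class_centres[label] - x))


def kmeans_1d(points, centres, iterations=10):
    """Clustering: repeatedly assign points to the nearest centre, then re-centre."""
    groups = [[] for _ in centres]
    for _ in range(iterations):
        groups = [[] for _ in centres]
        for p in points:
            idx = min(range(len(centres)), key=lambda i: abs(centres[i] - p))
            groups[idx].append(p)
        # Keep the old centre if a group happens to be empty.
        centres = [mean(g) if g else c for g, c in zip(groups, centres)]
    return groups, centres


# Classification: labels are given in advance.
labelled = {"small": [1.0, 1.2, 0.8], "large": [5.0, 5.3, 4.9]}
class_centres = {label: mean(values) for label, values in labelled.items()}
predicted = nearest_centroid_classify(4.0, class_centres)

# Clustering: the same values with the labels removed.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
groups, centres = kmeans_1d(points, centres=[0.0, 6.0])

print(predicted)
print(groups)
```

The clusters carry no names; any meaning such as "small" or "large" has to be attached afterward by whoever interprets them.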

Classification vs prediction

Classification is the process of identifying the category, or class label, to which a new observation belongs. Prediction is the process of estimating missing or unavailable numerical data for a new observation.

Classification Resources