
H2O.ai WIKI

Machine Learning Algorithms

What are Machine Learning Algorithms?

A machine learning algorithm is a procedure that an artificial intelligence system uses to learn patterns from input data, most often in order to predict output values for new inputs. The two main tasks in supervised machine learning are classification and regression, while the main tasks in unsupervised machine learning are clustering, dimensionality reduction, and anomaly detection.

What are Common Machine Learning Algorithms?

Below is a list of common machine learning algorithms:

  • Linear Regression

  • Logistic Regression

  • Decision Tree

  • Support Vector Machines

  • Naive Bayes

  • Nearest Neighbors

  • K-Means Clustering

  • Dimensionality Reduction (e.g. Principal Component Analysis)

  • Random Forest

  • Gradient Boosting

  • Neural Networks

  • Ensembles of ML algorithms

What are Machine Learning Algorithms Used For?

Machine learning algorithms analyze data and produce output values such as predictions. During training, each algorithm adjusts its parameters to optimize a specific loss function, so its performance improves as training progresses. More data usually improves performance further. The final state of the algorithm (the model) contains key insights about the data and forms the basis of the artificial intelligence.
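To make the idea of optimizing a loss function concrete, here is a minimal sketch of gradient descent on a mean squared error loss for a one-parameter model y = w * x. The data, the learning rate, and the function names are assumptions chosen for illustration, not part of any H2O API.

```python
# Minimal sketch: gradient descent on mean squared error for y = w * x.
# The data, learning rate, and step count are illustrative assumptions.

def mse(w, xs, ys):
    """Mean squared error of the one-parameter model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient(w, xs, ys):
    """Derivative of the mean squared error with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

def train(xs, ys, learning_rate=0.01, steps=200):
    """Return the learned weight and the loss recorded before each update."""
    w = 0.0
    history = []
    for _ in range(steps):
        history.append(mse(w, xs, ys))
        w -= learning_rate * gradient(w, xs, ys)
    return w, history

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # generated by y = 2 * x
w, history = train(xs, ys)
print(f"learned w = {w:.3f}")
print(f"loss went from {history[0]:.3f} to {history[-1]:.6f}")
```

The recorded loss shrinks at every step, which is the sense in which the algorithm improves over time; the learned value of w is the model.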

Why are Machine Learning Algorithms Important?

Algorithms are at the core of machine learning solutions. Data scientists use them as building blocks for more efficient problem-solving. A major benefit of machine learning processes is that they save time and cost because they can run with little or no human intervention.

Machine Learning Algorithms FAQs

How are machine learning algorithms developed and used?

Machine learning algorithms are the building blocks for obtaining insights from data. Many of them are based on common statistical principles, and many build on previously existing algorithms. Their ability to generalize with speed and accuracy is what determines their success in the field. For a given use case, certain ML algorithms are better suited than others. For example, for churn prediction, a supervised algorithm like gradient boosting might work best, while for document summarization, a neural network trained on a large corpus of similar text is often a better fit.

How do I build a machine learning model for a given problem?

Here are some steps to consider that can help guide your build:

  • Determine the goal of your model (Classification? Regression? Outlier detection?)

  • Access historical and current data (check for data leakage)

  • Choose the right algorithms (GBMs? Neural Networks? Linear models?)

  • Choose the right loss functions and metrics (Robust to outliers?)

  • Create baseline models and establish a validation scheme

  • Debug the models, inspect residuals (is there leakage?)

  • Create internal leaderboards, compare various models

  • Validate and visualize your results (use a test set, A/B testing)

  • Deploy your model to production
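Several of the steps above (creating a baseline, establishing a validation scheme, and comparing models on a leaderboard) can be sketched in a few lines of plain Python. The synthetic data, the holdout fraction, and all function names here are assumptions for illustration; a real project would use its own data and likely a library such as H2O or scikit-learn.

```python
import random
from statistics import mean

def train_validation_split(rows, validation_fraction=0.25, seed=0):
    """Shuffle rows reproducibly and hold out a fraction for validation."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_valid = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_valid:], shuffled[:n_valid]

def mean_absolute_error(predictions, targets):
    """A metric that is less sensitive to outliers than squared error."""
    return mean(abs(p - t) for p, t in zip(predictions, targets))

def fit_mean_baseline(train_rows):
    """Baseline: always predict the mean target seen in training."""
    target_mean = mean(y for _, y in train_rows)
    return lambda x: target_mean

def fit_line(train_rows):
    """Candidate: least-squares line through the training data."""
    xs = [x for x, _ in train_rows]
    x_bar = mean(xs)
    y_bar = mean(y for _, y in train_rows)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in train_rows)
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x

# Synthetic data (assumption): y = 3x + 1 plus a little uniform noise.
rng = random.Random(42)
rows = [(float(x), 3.0 * x + 1.0 + rng.uniform(-0.5, 0.5)) for x in range(40)]

train_rows, valid_rows = train_validation_split(rows)
valid_targets = [y for _, y in valid_rows]

leaderboard = {}
for name, fit in [("mean baseline", fit_mean_baseline), ("linear model", fit_line)]:
    model = fit(train_rows)
    predictions = [model(x) for x, _ in valid_rows]
    leaderboard[name] = mean_absolute_error(predictions, valid_targets)

for name, score in sorted(leaderboard.items(), key=lambda item: item[1]):
    print(f"{name}: validation MAE = {score:.3f}")
```

Scoring every candidate on the same held-out rows is what makes the comparison fair. A model that cannot beat the mean baseline is a signal to revisit the data or the features before trying more complex algorithms.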

Which algorithm is best for machine learning?

The answer to the question varies depending on many factors, including:

  • Size, quality, and nature of data (both target and explanatory variables)

  • How interpretable the final model needs to be

  • The available computational time

  • The urgency of the task

  • What you intend to do with the data

H2O AI Cloud supports the following supervised algorithms:

H2O supports the following unsupervised algorithms:

Machine Learning Algorithms vs Other Technologies & Methodologies

Machine learning algorithms vs models

A machine learning algorithm is a process that is run on data to create a machine learning model. A model represents what was learned by the algorithm.
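The distinction can be sketched in code. In the example below, the function is the algorithm (a procedure run on data) and the object it returns is the model (what was learned). It uses a nearest-centroid classifier on one feature; the class and function names and the data are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CentroidModel:
    """The model: what the algorithm learned (one mean value per class)."""
    centroids: dict

    def predict(self, x):
        # Return the label whose centroid lies closest to x.
        return min(self.centroids, key=lambda label: abs(self.centroids[label] - x))

def fit_nearest_centroid(samples):
    """The algorithm: a procedure that turns labeled data into a model."""
    grouped = {}
    for value, label in samples:
        grouped.setdefault(label, []).append(value)
    centroids = {label: sum(values) / len(values) for label, values in grouped.items()}
    return CentroidModel(centroids)

training_data = [(1.0, "low"), (2.0, "low"), (3.0, "low"),
                 (8.0, "high"), (9.0, "high"), (10.0, "high")]

model = fit_nearest_centroid(training_data)
print(model.centroids)
print(model.predict(2.5), model.predict(7.0))
```

Running the same algorithm on different data would produce a different model, while the algorithm itself stays unchanged.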

Machine learning algorithms vs artificial intelligence

Machine learning is a subset of artificial intelligence.

Machine learning algorithms vs deep learning algorithms

Machine learning is a branch of artificial intelligence whose algorithms parse data, learn from that data, and then apply what they have learned to new or unseen data to make predictions that inform decisions.

A traditional machine learning algorithm can be as simple as linear regression. For example, imagine you want to predict your income from your years of higher education. The first step is to define the form of the model, e.g. income = a + b * years of education, where a is the intercept and b is the slope. Next, give the algorithm a set of training data. Then, let the algorithm fit the line, e.g. through an ordinary least squares (OLS) regression, which chooses a and b to minimize the sum of squared errors. Now, you can give the fitted model some new data, e.g. your own years of higher education, and let it predict your income.
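The income example can be sketched with the closed-form OLS solution for a single feature. The numbers below are made up for illustration (income in thousands), and the function names are assumptions rather than any library's API.

```python
def fit_ols(xs, ys):
    """Closed-form ordinary least squares for one feature.

    Returns (intercept, slope) minimizing the sum of squared errors.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_bar - slope * x_bar
    return intercept, slope

def predict_income(intercept, slope, years):
    """Apply the fitted line to a new input."""
    return intercept + slope * years

# Made-up training data: years of higher education and income (thousands).
years_of_education = [0.0, 2.0, 4.0, 6.0, 8.0]
income = [31.0, 39.0, 51.0, 59.0, 70.0]

intercept, slope = fit_ols(years_of_education, income)
print(f"income = {intercept:.1f} + {slope:.1f} * years of education")
print(f"predicted income for 5 years: {predict_income(intercept, slope, 5.0):.1f}")
```

The training step produces two numbers, the intercept and the slope. Prediction only needs those two numbers and the new input.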

Deep learning is a subfield of machine learning that stacks computations in layers to form an artificial neural network (ANN). Each layer transforms the output of the previous one, which lets the network learn increasingly abstract features directly from data and make predictions with little manual feature engineering. The layered structure is loosely inspired by how the human mind draws conclusions.
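As a rough illustration of what "layered" means, the sketch below runs a forward pass through a tiny two-layer network. The weights are hand-picked assumptions (no training happens here) so that the network computes XOR, a function that a single linear layer cannot represent.

```python
def relu(values):
    """Rectified linear unit: negative activations are clipped to zero."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum per output unit plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

# Hand-picked parameters (assumption): in practice these are learned from data.
HIDDEN_WEIGHTS = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_BIASES = [0.0, -1.0]
OUTPUT_WEIGHTS = [[1.0, -2.0]]
OUTPUT_BIASES = [0.0]

def forward(inputs):
    """Pass the inputs through the hidden layer, then the output layer."""
    hidden = relu(dense(inputs, HIDDEN_WEIGHTS, HIDDEN_BIASES))
    return dense(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES)[0]

for pair in [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]:
    print(pair, forward(list(pair)))
```

The hidden layer builds intermediate features from the raw inputs and the output layer combines them. Deep networks repeat this pattern across many layers and learn the weights during training.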
