H2O AutoML
Scalable AutoML in H2O-3 Open Source
Overview
AutoML, or Automatic Machine Learning, is the process of automating algorithm selection, feature generation, hyperparameter tuning, iterative modeling, and model assessment. AutoML makes it easy to train and evaluate machine learning models. Automating these repetitive tasks lets people focus on the data and the business problems they are trying to solve.
H2O Open Source AutoML
- Train the best model in the least amount of time to save human hours, using a simple interface in R, Python, or a web GUI.
- Reduce the need for machine learning expertise by cutting the time spent writing code by hand.
- Improve the performance of machine learning models.
- Increase reproducibility and establish a baseline for scientific research or applications.
- Scale training on large data sets across clusters (Hadoop, Spark, Kubernetes).
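As a rough illustration of the Python interface mentioned in the list above, the following is a minimal sketch, assuming the `h2o` Python package and a Java runtime are installed. The helper names `predictor_columns` and `run_automl`, the CSV path, and the target column are hypothetical, and treating the target as categorical is an assumption made only for this example.

```python
def predictor_columns(columns, target):
    """Return every column name except the target (hypothetical helper)."""
    return [name for name in columns if name != target]


def run_automl(csv_path, target, max_models=10, seed=1):
    """Train H2O AutoML on a CSV file and return the AutoML object and a test frame."""
    # Imported here so the file can be loaded without the h2o package;
    # h2o.init() also needs a Java runtime to start a local cluster.
    import h2o
    from h2o.automl import H2OAutoML

    h2o.init()
    frame = h2o.import_file(csv_path)

    # Assumption: the target is categorical, so this is a classification task.
    frame[target] = frame[target].asfactor()
    train, test = frame.split_frame(ratios=[0.8], seed=seed)

    # max_models caps the number of base models; seed aids reproducibility.
    aml = H2OAutoML(max_models=max_models, seed=seed)
    aml.train(
        x=predictor_columns(frame.columns, target),
        y=target,
        training_frame=train,
    )

    # The leaderboard ranks all trained models; aml.leader is the top one.
    print(aml.leaderboard.head())
    return aml, test
```

A call such as `run_automl("data.csv", "label")` (both arguments are placeholders) would start a local cluster, train up to ten base models, and print the top of the leaderboard.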
Features of AutoML
- Automatic data preprocessing: Imputation, one-hot encoding, standardization.
- Trains random grids of a wide variety of H2O models using efficient, carefully constructed hyperparameter spaces.
- Tunes individual models using cross-validation.
- Stacked Ensembles are trained to maximize model performance.
- All models are available and ranked by various metrics in the Leaderboard.
- Models can be automatically explained using the H2O Explainability module.
- Models can be easily exported to use in production.
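To make the ideas of random grids, cross-validation, and a leaderboard more concrete, here is a toy sketch in plain Python. It is not H2O's implementation: the k-nearest-neighbors model, the synthetic data, the hyperparameter space, and every function name are assumptions chosen only to keep the example small and self-contained.

```python
import random
from statistics import mean


def knn_predict(train, x, k, weighted):
    """Predict y at x from the k nearest (x, y) training pairs."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    if not weighted:
        return mean(y for _, y in nearest)
    # Inverse-distance weights; the small constant avoids division by zero.
    weights = [1.0 / (abs(px - x) + 1e-9) for px, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)


def cv_mse(data, k_folds, **params):
    """Mean squared error averaged over k_folds interleaved held-out folds."""
    fold_errors = []
    for i in range(k_folds):
        valid = data[i::k_folds]
        train = [p for j, p in enumerate(data) if j % k_folds != i]
        fold_errors.append(
            mean((knn_predict(train, x, **params) - y) ** 2 for x, y in valid)
        )
    return mean(fold_errors)


def random_search(data, space, n_models, k_folds=5, seed=1):
    """Sample n_models random configurations and rank them by CV error."""
    rng = random.Random(seed)
    rows = []
    for model_id in range(n_models):
        params = {name: rng.choice(values) for name, values in space.items()}
        rows.append(
            {"model_id": model_id, "params": params,
             "mse": cv_mse(data, k_folds, **params)}
        )
    # Lower error ranks higher, as on a leaderboard sorted by one metric.
    return sorted(rows, key=lambda row: row["mse"])


# Synthetic data: a noisy quadratic (purely illustrative).
rng = random.Random(0)
data = [(i / 10, (i / 10) ** 2 + rng.gauss(0, 0.05)) for i in range(50)]
space = {"k": [1, 3, 5, 9, 15], "weighted": [True, False]}

leaderboard = random_search(data, space, n_models=8)
for row in leaderboard:
    print(row)
```

Random sampling may draw the same configuration more than once; a real system would typically deduplicate, cover several model families, and add ensembles on top of the ranked models.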
No Code AutoML
In addition to the R and Python interfaces to H2O AutoML, the web GUI exposes all H2O-3 parameters, including those for AutoML, through simple point-and-click selection.