October 18th, 2021
New Features Now Available with the Latest Release of the H2O AI Cloud 21.10
Category: H2O AI Cloud, H2O Release
By: Mary Beth Moore
The Makers here at H2O.ai have been busy building new features and enhancing capabilities across our AI platform. Designed to support our core mission of democratizing AI, these additions to our platform simplify the ability to make AI you can trust, operate it efficiently and innovate with ready-made AI applications.
Launched in January of 2021, the H2O AI Cloud brings all of H2O.ai’s products together on a single, unified platform. The H2O AI Cloud gives data scientists H2O.ai’s end-to-end suite of data science tools for rapid model development, deployment, and scaling in the form of AI applications. The latest H2O AI Cloud release encompasses end-to-end upgrades for model development, machine learning operations, AI applications and overall platform performance.
Make AI you can trust.
H2O.ai has built AI to do AI. Our experts in data science and engineering have taken best practices and delivered them with sophisticated automation to give users more accuracy, speed and transparency in the model building process. The latest release includes:
- Built on top of the latest stable versions of all major open-source packages. Benefit from the latest versions of Python, RAPIDS/cuML, PyTorch, TensorFlow, H2O, XGBoost, LightGBM, datatable, sklearn, pandas, and many more packages. Gain full control over them and any other Python package with our built-in custom recipe architecture.
- Per-Feature Control. Disable feature engineering and feature selection for certain columns in your dataset, and pass them as-is to the model. Satisfy your compliance requirements.
- Shapley values in production. For each prediction in production, get the contributions of each feature towards the predicted values. Obtain mathematically consistent reason codes for improved actionability and business insights. Shapley values are available in original feature space and in feature-engineered feature space.
- Automatic label assignment. Reduce error rates and save time with automatic labeling that predicts the class for every scored record, in addition to returning the per-class probabilities. Binary classification is done with an optimized threshold, with your choice of metric for optimization (F1, F2, F0.5 or MCC).
- Improved handling of imbalanced multiclass datasets. Improve accuracy for imbalanced datasets with new scoring metrics for multiclass problems. Give rare classes weight closer to that of frequent classes in problems where recall and precision for instances of rare classes matter to the business.
- Unsupervised machine learning additions. Immediately get new insights on your unlabeled data with unsupervised techniques such as clustering to automatically group topics, outlier detection to identify irregularities in your data, and dimensionality reduction to reduce model overfitting and complexity.
- Leaderboard for forecasting. Save time getting an optimized forecasting model with a new leaderboard mode specific to time series experiments. Automatically design and run multiple experiments that vary settings such as the train-test gap and forecast horizon to help with model selection.
- New pre-built cards and components. The goal of the Wave SDK is to allow data scientists and developers to build applications without knowledge of UX design or front-end languages. With this in mind, new components and cards are added regularly to create the exact experience needed for end users. This release adds the persona for showing and referencing an avatar, the side panel for easily showing a user extra details, support for changing the width of form items, an icon-only button, and the ability to have inline checkboxes. New cards and components come from H2O developers, customer Wave developers, and the open source community.
- Explainable AI additions:
  - Natural language processing: Understand how token importance varies between classes with newly added multinomial support. Understand the impact of tokens on outcomes with LOCO (leave one covariate out) 2.0 and Vectorizer + Linear Model (VLM). Calculate the average outcome of a model when a text token is included versus not included with Partial Dependence for Text Tokens. Select between TF-IDF and the newly added VLM when generating tokens for Surrogate models.
  - Time-series support: Time series models now work with various MLI explainers, including Sensitivity Analysis, Disparate Impact Analysis, Partial Dependence/Individual Conditional Expectation, Naïve and Kernel Shapley, Surrogate Models and all feature importance techniques. Navigate through time series model explanations in a brand new UI with user experience as a top priority.
  - Improved Shapley user experience: Clearly understand the explanations and contributions from every feature in a model.
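To make the automatic label assignment item above more concrete, here is a minimal sketch of how a binary decision threshold could be chosen by optimizing F1, F2, F0.5 or MCC. It is an illustration only, not the platform's implementation: the function names (`confusion_counts`, `fbeta`, `mcc`, `best_threshold`, `assign_labels`) are hypothetical, and the search simply tries every distinct predicted probability as a candidate threshold.

```python
import math
from functools import partial


def confusion_counts(y_true, y_prob, threshold):
    """Count TP, FP, TN, FN when predicting class 1 for prob >= threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def fbeta(tp, fp, tn, fn, beta=1.0):
    """F-beta score; beta=1 gives F1, beta=2 gives F2, beta=0.5 gives F0.5."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; 0.0 when it is undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def best_threshold(y_true, y_prob, metric):
    """Try each distinct probability as a threshold; keep the best score."""
    best_t, best_score = None, float("-inf")
    for t in sorted(set(y_prob)):
        score = metric(*confusion_counts(y_true, y_prob, t))
        if score > best_score:  # ties keep the lowest threshold
            best_t, best_score = t, score
    return best_t, best_score


def assign_labels(y_prob, threshold):
    """Turn per-record probabilities into predicted class labels."""
    return [1 if p >= threshold else 0 for p in y_prob]


if __name__ == "__main__":
    y_true = [0, 0, 1, 1]
    y_prob = [0.1, 0.4, 0.35, 0.8]
    f2 = partial(fbeta, beta=2.0)  # the metric to optimize is the caller's choice
    threshold, score = best_threshold(y_true, y_prob, f2)
    print(threshold, score, assign_labels(y_prob, threshold))
```

Passing the metric as a function keeps the search independent of the metric, so `fbeta`, a `partial` with a different `beta`, or `mcc` can be swapped in without touching the threshold search itself.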
Operate AI efficiently.
In addition to many quality-of-life upgrades in our machine learning operations (MLOps) capabilities, we now support library- and framework-agnostic model management, as well as data science team analytics for internal transparency and governance, two core functionalities for the largest data science teams.
- MLOps Upgrades. Management and support for third-party models is now available, including scikit-learn, PyTorch, TensorFlow, XGBoost, LightGBM and more. Drag and drop models to deploy H2O.ai and third-party models in a single click. Measure and analyze drift, feature importance, model performance and ground truth over time with a brand new model monitoring app.
- Custom Recipe Management. Choose from every custom recipe ever written, or delete old ones. Get fine-grained version control for custom recipes for data preparation, feature engineering, model training and custom metrics and KPIs.
- Experiment Import/Export. Share, exchange and back up experiments between instances and colleagues, for improved collaboration and speed of development.
- Platform Performance API. Visualize and monitor system usage with a publicly available API that provides platform metrics for resource monitoring and autoscaling of H2O AI Cloud multi-node clusters.
- Hardware updates. Easily scale workloads with support for the unprecedented compute and network acceleration of Ampere-based NVIDIA GPUs through the use of the latest CUDA runtime.
- Software updates. Improve speed and accuracy of machine learning pipelines with updates to Python, Torch and TensorFlow.
- High Performance Computing Integrations. Full NVIDIA RAPIDS integration is now enabled.
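As an illustration of what measuring drift over time can look like, the sketch below computes the Population Stability Index (PSI) between a reference sample of a feature and a more recent sample. PSI is a commonly used drift statistic chosen here for the example; the source does not say which statistics the model monitoring app uses, and the function name `psi` and its parameters are hypothetical.

```python
import math


def psi(reference, current, n_bins=10, eps=1e-6):
    """Population Stability Index between two non-empty numeric samples.

    Bins are equal-width over the range of the reference sample; values in
    the current sample that fall outside that range go to the edge bins.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant data

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clip to the edge bins
            counts[idx] += 1
        # Floor at eps so empty bins do not produce log(0) or division by 0.
        return [max(c / len(values), eps) for c in counts]

    ref = proportions(reference)
    cur = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


if __name__ == "__main__":
    training = [i / 100 for i in range(100)]   # feature values at training time
    production = [v + 0.5 for v in training]   # the same feature, shifted
    print("no drift:", psi(training, training))
    print("shifted :", psi(training, production))
```

Identical samples give a PSI of zero, and the value grows as the two distributions diverge. Any cutoff for raising a drift alert is a judgment call that depends on the feature and the use case.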
Innovate with ready-made applications.
The H2O AI Cloud AppStore provides access to more than 40 AI applications spanning industry use cases, AI for Good initiatives and data science best practices. The most recent additions to the H2O AppStore are focused on making data science tasks easier to complete and manage.
- AutoDoc App. Provides easy storage and management of automated model documentation for H2O.ai and scikit-learn models.
- Model Validation App. Check for robustness of models with backtesting, adversarial analysis, drift and more.
- Text Labelling App. Automated text labelling for text classification and named entity recognition.