February 3rd, 2022
H2O.ai releases new H2O MLOps features that improve the explainability, flexibility, and configuration of machine learning workflows.
Category: H2O AI Cloud, MLOps
By: Abhishek Mathur
H2O.ai now provides data scientists and machine learning (ML) engineers even more powerful features that give greater control, governance, and scalability within their machine learning workflow – all available on our H2O AI Cloud. Now, H2O MLOps enables you to:
Deploy model explanations in production
Explainability is core to understanding ML model behavior and to deepening adoption of the model within an organization. As a market leader in Explainable AI and Machine Learning Interpretability, H2O.ai provides the most comprehensive suite of tools for explanations at model training time. Now, we are bringing the same powerful capabilities to models that have been deployed to serving infrastructure. For each scoring request, customers can receive the Shapley values alongside the prediction, indicating how much each feature contributed to that prediction.
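To make "how much impact each feature had" concrete, here is a minimal sketch of an exact Shapley value computation in plain Python. It is not H2O's implementation: the `toy_model` scoring function, the input row, and the all-zeros baseline are made up for illustration, and the brute-force enumeration of feature subsets only scales to a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one row `x`, relative to a `baseline` row.

    A feature that is "absent" from a coalition takes its baseline value.
    Cost grows as 2**n in the number of features, so this is for illustration only.
    """
    n = len(x)

    def value(subset):
        # Features in `subset` come from x; all others come from the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(len(others) + 1):
            # Standard Shapley weight for coalitions of size k.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                s = set(s)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    # Hypothetical scoring function with one interaction term.
    return 2.0 * z[0] + 1.0 * z[1] + 0.5 * z[0] * z[2]


x = [1.0, 2.0, 4.0]          # the request being explained (made-up values)
baseline = [0.0, 0.0, 0.0]   # reference row (an assumption of this sketch)
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The contributions add up to the difference between the prediction for `x` and the prediction for the baseline, which is what lets a per-request explanation be read as a breakdown of the score.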
Configure infrastructure for deployments
Our customers have some of the most sophisticated IT departments in the world. Accordingly, they want greater control over the infrastructure on which their workloads run, especially for machine learning, where workloads can be large and mission critical. Now, customers are able to configure the following parameters within their Kubernetes cluster:
- Node Affinity and Tolerations: customers can set a preference for the nodes (or machines) that should run the actual deployment. This becomes powerful when a model is large and/or requires ultra-low latency: customers can select specific machines that are faster (e.g. GPUs) for those models and deployments.
- Resource Requests & Limits: customers can specify the minimum (request) and maximum (limit) amounts of memory and CPU allocated to a deployment. This ensures that a deployment receives at least a certain allocation and does not exceed a set cap.
- Replicas: customers can select up to 5 nodes to replicate their deployment on. Traffic is load balanced automatically across the nodes, enabling high availability. The resource limits and node affinity configuration are replicated on each of the nodes.
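For readers who want to see where these three settings live in Kubernetes itself, the sketch below builds the relevant fragment of a Deployment spec as a Python dictionary. The field names are standard Kubernetes ones; the label key, toleration key, container name, image, and resource figures are hypothetical placeholders, and this is not the H2O MLOps API, which may expose these settings differently.

```python
import json

MAX_REPLICAS = 5  # upper bound stated in this post


def deployment_fragment(replicas: int) -> dict:
    """Return a Kubernetes Deployment spec fragment (illustrative values only)."""
    if not 1 <= replicas <= MAX_REPLICAS:
        raise ValueError(f"replicas must be between 1 and {MAX_REPLICAS}")
    return {
        "spec": {
            "replicas": replicas,
            "template": {
                "spec": {
                    # Prefer (but do not require) nodes carrying a given label.
                    "affinity": {
                        "nodeAffinity": {
                            "preferredDuringSchedulingIgnoredDuringExecution": [
                                {
                                    "weight": 100,
                                    "preference": {
                                        "matchExpressions": [
                                            {
                                                "key": "example.com/accelerator",  # hypothetical label
                                                "operator": "In",
                                                "values": ["gpu"],
                                            }
                                        ]
                                    },
                                }
                            ]
                        }
                    },
                    # Allow scheduling onto nodes tainted for dedicated use.
                    "tolerations": [
                        {
                            "key": "example.com/dedicated",  # hypothetical taint key
                            "operator": "Equal",
                            "value": "scoring",
                            "effect": "NoSchedule",
                        }
                    ],
                    "containers": [
                        {
                            "name": "scorer",                      # hypothetical name
                            "image": "example.com/scorer:latest",  # hypothetical image
                            "resources": {
                                "requests": {"cpu": "500m", "memory": "1Gi"},  # guaranteed minimum
                                "limits": {"cpu": "2", "memory": "4Gi"},       # hard cap
                            },
                        }
                    ],
                }
            },
        }
    }


fragment = deployment_fragment(replicas=3)
print(json.dumps(fragment, indent=2))
```

Because the affinity, tolerations, and resources sit in the pod template, every replica created from it inherits the same settings.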
Enhanced support for 3rd party model frameworks
Our vision is to build the most open and interoperable machine learning platform. Hence, we have supported 3rd party model frameworks (e.g. PyTorch, TensorFlow, scikit-learn, XGBoost, etc.) for quite some time now. Until now, this required customers to package their models using MLflow and then import them into H2O MLOps. That was an extra procedural and technological step for our customers, and we wanted to make it seamless. Now, customers are able to import their Python Pickle files directly into H2O MLOps, without depending on packaging from any other tool.
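As a reminder of what a Python Pickle file contains, the following sketch serializes a trained model object and restores it using only the standard library. `MeanRegressor` is a made-up stand-in for a real estimator such as a scikit-learn model, and the import step into H2O MLOps is deliberately not shown because its API is not described in this post.

```python
import pickle


class MeanRegressor:
    """Hypothetical stand-in for a third-party model: predicts the training mean."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


# Train the toy model on made-up data.
model = MeanRegressor().fit([[1], [2], [3]], [10.0, 20.0, 30.0])

# Serialize to bytes; pickle.dump(model, open_file) would write the same
# content to a .pkl file on disk.
payload = pickle.dumps(model)

# Restoring requires the model's class to be importable in the loading
# environment, which is why the serving side needs matching libraries.
restored = pickle.loads(payload)
print(restored.predict([[4], [5]]))
```

Unpickling can execute arbitrary code, so pickle files should only be loaded from trusted sources.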
Enhanced model management capabilities
A critical part of building an enterprise-grade MLOps tool is to provide a robust place to manage, register, and version machine learning models. We are now introducing this capability natively to our customers, available through a UI and through an API. Customers can use MLOps Experiments as their central repository to store, manage, and collaborate on their experiments. They can then use MLOps Model Registry to register the experiments they intend to deploy as models, and MLOps Model Versioning to group new versions of a model together.
Simpler and sleeker user interface
We package all of these new features in a brand new user interface that makes navigation much simpler and allows our customers to accomplish their goals much more quickly. This interface also sets the foundation for a number of other features currently in development that simplify your end-to-end MLOps experience.