May 14th, 2019

H2O.ai Automatic Machine Learning on Red Hat OpenShift Container Platform Delivers Data Science Ease and Flexibility at Scale

Category: Cloud, Data Science, Demos, H2O Driverless AI

Last week at Red Hat Summit in Boston, Sri Ambati, CEO and Founder, demonstrated how to use our award-winning automatic machine learning platform, H2O Driverless AI, on Red Hat OpenShift Container Platform.  You can watch the replay here.

What we showed not only helps data scientists achieve results but also lets them scale their machine learning efforts and easily deploy their models in the enterprise. Sri walked through the five easy steps for automatic machine learning with Driverless AI on Red Hat OpenShift.

  1. Drag and Drop Data: Bring in your data, whether it’s on-premises or in the cloud, by dragging and dropping it from a variety of sources. H2O Driverless AI has over 10 connectors, including Amazon S3, Google BigQuery, Snowflake, HDFS, and more.
  2. Automatic Visualization: Next, run the data through automatic visualization, which performs a number of statistical checks to find some of the most interesting patterns and helps you fix data quality issues as well.
  3. Automatic Machine Learning: After that, run the data through our automatic machine learning engine, which does automatic feature engineering, modeling, and ensembling for you. If you are an expert data scientist, you can select your own algorithms or tweak the parameters; otherwise, it does it all for you.
  4. Automatic Scoring Pipelines: Driverless AI automatically generates a scoring pipeline that can easily be deployed as a Java or Python object.
  5. Interpret the Results: Using Driverless AI machine learning interpretability, it is easy to see the reason codes that explain why a model was selected or a prediction was made. Driverless AI also automatically creates a document that walks through the entire workflow and records every step of the process.

Now let’s see how all of these come together on OpenShift. Using the H2O.ai OpenShift templates, we can train a model with one template and deploy it with the other.

The demonstration focused on sentiment analysis, which we ran on a tweet at the end of the presentation. We started with a sentiment data set that we had pre-loaded.

We can visualize the data to get a snapshot of how the data set looks. Auto Visualization, part of Driverless AI, lets us look at the data in many ways, from outlier detection to correlations, heat maps, and more.

We want to find out whether the sentiment is positive or negative. The only thing we need to do at this point is set the accuracy, time, and interpretability settings, the “knobs and dials” that control how complex the model can be, how long training may take, and how interpretable the model will be. Driverless AI automatically detected that this is an NLP problem and applied one of our NLP recipes to it.

We saw about 93% accuracy, which is pretty good for a problem that was trained on a really small dataset with only 12,000 rows.
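For reference, accuracy here means the share of rows whose predicted label matches the true label. A minimal Python sketch of that calculation follows; the labels in it are made up for illustration and are not results from the demo.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Illustrative labels only; these are not taken from the demo data set.
y_true = ["positive", "negative", "positive", "positive", "negative"]
y_pred = ["positive", "negative", "negative", "positive", "negative"]
print(accuracy(y_true, y_pred))  # 4 of 5 match -> 0.8
```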

You can look at different charts, such as an ROC curve or lift and gains charts, and also review the summary quickly. You can see we created about 352 features out of the single feature we supplied: we only gave the text column to start. Based on the text column, we created word embeddings, and those were the features used by the model.
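To make the idea of turning one text column into many numeric features concrete, here is a minimal Python sketch. It is not the Driverless AI NLP recipe: the tiny embedding table, its three dimensions, and the averaging step are assumptions chosen for illustration, whereas real embeddings are learned from data and are much wider.

```python
# Hypothetical 3-dimensional word embeddings, invented for this example.
TOY_EMBEDDINGS = {
    "beautiful": [1.0, 0.5, 0.0],
    "awesome": [0.5, 0.5, 1.0],
    "terrible": [-1.0, 0.5, 0.0],
}
DIM = 3


def text_to_features(text):
    """Average the embeddings of known words into one fixed-length feature row."""
    tokens = text.lower().split()
    vectors = [TOY_EMBEDDINGS[t] for t in tokens if t in TOY_EMBEDDINGS]
    if not vectors:
        # No known words: fall back to an all-zero row of the same width.
        return [0.0] * DIM
    return [sum(column) / len(vectors) for column in zip(*vectors)]


features = text_to_features("The keynote was beautiful & awesome")
print(features)  # [0.75, 0.5, 0.5]
```

Each row of text becomes a row of numbers of the same length, which is what lets a standard model train on a column that started out as free text.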

Once the models are built, you can download the scoring pipeline or interpret the model. Here, we’ll show how to deploy it.

In the OpenShift console, we now have a template to deploy this Mojo (our format for deploying models), which is optimized for low latency.

We had a small web app running and typed in the following to see the sentiment based on the model:

“The Red Hat keynote was beautiful & awesome”

The result was positive sentiment. This was a fairly simple demonstration of Driverless AI on OpenShift, but it showed how easy and seamless it is to start an instance, build and interpret a model, and finally publish that model to score live data.
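The round trip a small web app like this makes can be sketched in Python. This is a sketch under stated assumptions, not the interface of the deployed scorer: the request and response shapes (`fields`, `rows`, `score`), the column name, and the label names are all hypothetical, and no network call is made.

```python
import json

# Hypothetical column name and labels; the post does not show the demo's schema.
TEXT_COLUMN = "text"
LABELS = ["negative", "positive"]


def build_score_request(text):
    """Serialize one row of text into a JSON request body (hypothetical schema)."""
    return json.dumps({"fields": [TEXT_COLUMN], "rows": [[text]]})


def parse_score_response(body):
    """Return the most probable label for the first row of a hypothetical response."""
    probabilities = json.loads(body)["score"][0]
    best = max(range(len(LABELS)), key=lambda i: probabilities[i])
    return LABELS[best]


request_body = build_score_request("The Red Hat keynote was beautiful & awesome")

# A made-up response for illustration; real probabilities would come from the model.
example_response = json.dumps({"score": [[0.1, 0.9]]})
print(parse_score_response(example_response))  # positive
```

In a real deployment, `request_body` would be sent to the scoring service and the response parsed the same way; the endpoint and schema depend on how the model was deployed.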

It was great to participate at the Red Hat Summit. We enjoyed demonstrating how H2O.ai and Red Hat are working together to democratize AI for the enterprise.

About the Author

Vinod Iyengar, VP of Products

Vinod is VP of Products at H2O.ai. He leads all product marketing efforts, new product development and integrations with partners. Vinod comes with over 10 years of Marketing & Data Science experience in multiple startups. He was the founding employee for his previous startup, Activehours (Earnin), where he helped build the product and bootstrap the user acquisition with growth hacking. He has worked to grow the user base for his companies from almost nothing to millions of customers. He’s built models to score leads, reduce churn, increase conversion, prevent fraud and many more use cases. He brings a strong analytical side and a metrics driven approach to marketing. When he is not busy hacking, Vinod loves painting and reading. He is a huge foodie and will eat anything that doesn’t crawl, swim or move.
