Make with Recap: Validation Scheme Best Practices
by Blair Averett August 23, 2022 Data Science Kaggle Machine Learning Make with

Data Scientist and Kaggle Grandmaster Dmitry Gordeev presented at the Make with session on validation scheme best practices, our second accuracy masterclass. The session covered key concepts, different validation methods, data leaks, practical examples, and validation and ensembling. Key Concepts: While the validation topics covered are applicable to most models, the session focused on […]

Data Science with An Introduction to Machine Learning and Predictive Modeling
by h2oai March 16, 2022 Data Science H2O H2O AI Cloud Machine Learning

Our own Jonathan Farland recently recorded a talk about machine learning and predictive modeling. In his talk, Jon also gave an overview of open source H2O and H2O AI Cloud. This video is a great resource for getting up to speed with the latest technology from H2O in half an hour. Some of you may […]

Shapley Values – A Gentle Introduction
by h2oai January 11, 2022 Data Science Shapley Technical Posts

“If you can’t explain it to a six-year-old, you don’t understand it yourself.” – Albert Einstein. One fear caused by machine learning (ML) models is that they are black boxes that cannot be explained. Some are so complex that no one, not even domain experts, can understand why they make certain decisions. This is of particular […]

1st Place Winner’s Blog – Kaggle 2021 Data Science and Machine Learning Survey
by h2oai January 4, 2022 Data Journalism Data Science Kaggle

Kaggle, the largest global community of data scientists, conducted the 5th annual industry-wide survey that presented a truly comprehensive view of the state of data science and machine learning. A total of 25,973 responses were collected from participants from over 60 countries. Kaggle also launched the Data Science Survey Challenge in which the goal was […]

Amazon Redshift Integration for Model Scoring
by Mary Beth Moore November 22, 2021 Data Science H2O AI Cloud

Here at H2O.ai, we consistently work with our partners on innovative ways to use models in production, and we are excited to demonstrate our AWS Redshift integration for model scoring. Amazon Redshift is a very popular data warehouse on AWS. We wanted to expand on the existing capabilities of using data from Redshift to train […]

What does it take to win a Kaggle competition? Let’s hear it from the winner himself.
by h2oai June 14, 2021 Data Science Kaggle Makers

In this series of interviews, I present the stories of established Data Scientists and Kaggle Grandmasters at H2O.ai, who share their journey, inspirations, and accomplishments. These interviews are intended to motivate and encourage others who want to understand what it takes to be a Kaggle Grandmaster. In this interview, I shall be sharing my interaction […]

What it takes to become a World No 1 on Kaggle
by h2oai May 3, 2021 Data Science Kaggle Machine Learning Makers

In conversation with Guanshuo Xu: a Data Scientist, Kaggle Competitions Grandmaster, and Ph.D. in Electrical Engineering. In this series of interviews, I present the stories of established Data Scientists and Kaggle Grandmasters at H2O.ai, who share their journey, inspirations, and accomplishments. The intention behind these interviews is to motivate and encourage others who want […]

Safer Sailing with AI
by h2oai April 1, 2021 Customer Data Science H2O Machine Learning Interpretability Open Source Wave

In the last week, the world watched as responders tried to free a cargo ship that had gone aground in the Suez Canal. This incident blocked traffic through a waterway that is critical for commerce. While the location was an unusual one, ship collisions, allisions, and groundings are not uncommon. With all the technology that […]

H2O AI Cloud: Democratizing AI for Every Person and Every Organization
by h2oai March 24, 2021 AutoML Data Science H2O AI Cloud H2O Driverless AI ModelOps Wave

Harnessing AI’s true potential by enabling every employee, customer, and citizen with sophisticated AI technology and easy-to-use AI applications. Democratization is an essential step in the development of AI, and AutoML technologies lie at the heart of it. AutoML tools have played a pivotal role in transforming the way we consume and understand data. Given […]

Using Python’s datatable library seamlessly on Kaggle
by h2oai February 3, 2021 Data Munging Data Science datatable

Managing large datasets on Kaggle without fear of out-of-memory errors

