H2O.ai Blog
Explaining models built in H2O-3 — Part 1
Machine Learning explainability refers to understanding and interpreting the decisions and predictions made by a machine learning model. Explainability is crucial for ensuring the trustworthiness and transparency of machine learning models, particularly in high-stakes situations where the consequences of incorrect predictions can be signi...
Introducing DatatableTon - Python Datatable Tutorials & Exercises
Datatable is a Python library for manipulating tabular data. It supports out-of-memory datasets and multi-threaded data processing, and has a flexible API. If this reminds you of R’s data.table, you are spot on, because Python’s datatable package is closely related to and inspired by the R library. The release of v1.0.0 was done on 1st July,...
Introducing H2O Wave
For almost a decade, H2O.ai has worked to build open source and commercial products that are on the leading edge of innovation in machine learning, from AutoML to Explainable AI. We are thrilled to announce the release of what we believe to be the future of AI Applications: H2O Wave. Wave is an open source, lightweight Python developmen...
Summary of a Responsible Machine Learning Workflow
A paper resulting from a collaboration between H2O.ai and BLDS, LLC was recently published in a special “Machine Learning with Python” issue of the journal Information (https://www.mdpi.com/2078-2489/11/3/137). In “A Responsible Machine Learning Workflow with Focus on Interpretable Models, Post-hoc Explanation, and Discrimination Testing...
Blink: Data to AI/ML Production Pipeline Code in Just a Few Clicks
You have the data and now want to build a really, really good AI/ML model and deliver it to production. There are three options available today: Write the code yourself in a Jupyter notebook, RStudio, etc., for training/validation and dev-ops model handoff. You decided to do the feature engineering also. Build your own features like above,...
Parallel Grid Search in H2O
H2O-3 is, at its core, a platform for distributed, in-memory computing. The machine learning algorithms are implemented on top of this distributed computation platform. At H2O.ai, we design every operation, be it data transformation, training of machine learning models, or even parsing, to utilize the distributed computation model. In ord...
An Overview of Python’s Datatable package
This blog originally appeared on Towardsdatascience.com. “There were 5 Exabytes of information created between the dawn of civilization through 2003, but that much information is now created every 2 days”: Eric Schmidt. If you are an R user, chances are that you have already been using the data.ta...
H2O New Year releases
Two releases came out in quick succession. First, on December 21st, there was a minor (fix) release, 3.22.0.3. It was immediately followed by a more major release (still on the 3.22 branch), codenamed Xu after mathematician Jinchao Xu, whose work is focused on deep neural networks, besides many other fields of research. Of course, th...
How This AI Tool Breathes New Life Into Data Science
Ask any data scientist in your workplace. Any Data Science Supervised Learning ML/AI project will go through many steps and iterations before it can be put in production. Starting with the question of “Are we solving for a regression or classification problem?” Data Collection & Curation Are there Outliers? What is the Distribu...
Stacked Ensembles and Word2Vec now available in H2O!
Prepared by: Erin LeDell and Navdeep Gill. Stacked Ensembles. R: ensemble <- h2o.stackedEnsemble(x = x, y = y, training_frame = train, base_models = my_models) Python: ensemble = H2OStackedEnsembleEstimator(base_models=my_models) ensemble.train(x=x, y=y, training...
Creating a Binary Classifier to Sort Trump vs. Clinton Tweets Using NLP
The problem: Can we determine if a tweet came from the Donald Trump Twitter account (@realDonaldTrump) or the Hillary Clinton Twitter account (@HillaryClinton) using text analysis and Natural Language Processing (NLP) alone? The solution: Yes! We’ll divide this tutorial into three parts, the first on how to gather the necessary data, t...
A Newbie's Guide to H2O in Python - Guest Post
This blog was originally posted here. I created this guide to help fellow newbies get their feet wet with H2O, an open-source predictive analytics platform that is fast, powerful, and easy to use. Using a combination of extraordinary math and high-performance parallel processing, H2O allows you to quickly create models for big data. The st...