Introducing DatatableTon – Python Datatable Tutorials & Exercises
September 20, 2021 datatable Open Source Python Tutorials
Datatable is a Python library for manipulating tabular data. It supports out-of-memory datasets and multi-threaded data processing, and it has a flexible API. If this reminds you of R’s data.table, you are spot on, because Python’s datatable package is closely related to and inspired by the R library. Version 1.0.0 was released on July 1, 2021, and it’s probably […]
Introducing H2O Wave
December 15, 2020 Open Source Product Updates Python Wave
For almost a decade, H2O.ai has worked to build open source and commercial products that are on the leading edge of innovation in machine learning, from AutoML to Explainable AI. We are thrilled to announce the release of what we believe to be the future of AI Applications: H2O Wave. Wave is an open source, […]
Summary of a Responsible Machine Learning Workflow
March 20, 2020 Data Science Deep Learning Machine Learning Machine Learning Interpretability Neural Networks Python Responsible AI
A paper resulting from a collaboration between H2O.ai and BLDS, LLC was recently published in a special “Machine Learning with Python” issue of the journal Information (https://www.mdpi.com/2078-2489/11/3/137). In “A Responsible Machine Learning Workflow with Focus on Interpretable Models, Post-hoc Explanation, and Discrimination Testing,” coauthors Navdeep Gill, Patrick Hall, Kim Montgomery, and Nicholas Schmidt compare model accuracy […]
Blink: Data to AI/ML Production Pipeline Code in Just a Few Clicks
February 11, 2020 H2O Driverless AI Machine Learning Python Technical
You have the data and now want to build a really, really good AI/ML model and deliver it to production. There are three options available today: (1) Write the code yourself in a Jupyter notebook, RStudio, etc., for training/validation and DevOps model handoff, and do the feature engineering yourself as well. (2) Build your own features like above, […]
Parallel Grid Search in H2O
February 4, 2020 Data Science H2O Machine Learning Open Source Python R R-Bloggers Recommendations Technical Technical Posts
H2O-3 is, at its core, a platform for distributed, in-memory computing. On top of the distributed computation platform, the machine learning algorithms are implemented. At H2O.ai, we design every operation, be it data transformation, training of machine learning models, or even parsing, to utilize the distributed computation model. In order to work with big data […]
How H2O propels data scientists ahead of itself: enhancing Driverless AI models with advanced options, recipes and visualizations
January 6, 2020 Data Science H2O Driverless AI Python R Recipes
H2O.ai engineers continually innovate and introduce new techniques by adopting the latest research, working on cutting-edge use cases, and participating in and winning machine learning competitions like Kaggle. But thanks to the explosion of AI research and applications, even the most advanced automated machine learning platform, like H2O Driverless AI, cannot come with all the bells and whistles to satisfy every […]
An Overview of Python’s Datatable package
June 4, 2019 Data Science H2O H2O Driverless AI Python Technical Technical Posts
This blog originally appeared on Towardsdatascience.com.
“There were 5 exabytes of information created between the dawn of civilization through 2003, but that much information is now created every 2 days.” (Eric Schmidt)
If you are an R user, chances are that you have already been using the data.table package. data.table is an extension of R’s data.frame. It’s also […]
H2O New Year releases
January 18, 2019 H2O H2O Release Python R
There were two releases in quick succession. First, on December 21st, there was a minor (fix) release, 3.22.0.3. It was immediately followed by a more major release (still on the 3.22 branch), codenamed Xu after the mathematician Jinchao Xu, whose work is focused on deep neural networks, among many other fields of research. Of course, the […]
How This AI Tool Breathes New Life Into Data Science
October 16, 2018 Beginners Data Journalism Data Science Deep Learning Driverless Explainable AI GPU H2O Driverless AI Machine Learning NLP Python R Technical
Ask any data scientist in your workplace: any supervised learning ML/AI data science project will go through many steps and iterations before it can be put in production, starting with the question, “Are we solving a regression or a classification problem?” Data Collection & Curation: Are there outliers? What is the distribution? What do […]
Stacked Ensembles and Word2Vec now available in H2O!
February 8, 2017 Data Munging Ensembles H2O Release NLP Python R Technical
Prepared by: Erin LeDell and Navdeep Gill
Stacked Ensembles: H2O’s new Stacked Ensemble method is a supervised ensemble machine learning algorithm that finds the optimal combination of a collection of prediction algorithms using a process called stacking or “Super Learning.” This method currently supports regression and binary classification, and multiclass support is planned for a […]