H2O Quick Start with Python

Users can now use H2O Flow, a notebook-style graphical user interface with command-line computing, to access fast, scalable H2O algorithms. H2O is open source machine learning software, available from its GitHub repository.




Speaker: Amy Wang, Data Scientist / Sales Engineer


Installation of H2O Module in Python

Amy Wang:

This tutorial will walk you through the installation of the H2O module in Python. The prerequisites for this walkthrough are an installation of Java version 1.6 or newer, Python version 2.7 or newer, and an installation of pip.

To start, navigate to our website in your web browser. Click on the download link, which will bring you to the downloads page. Scroll down to the latest stable release of H2ODev. Hit the third tab on the menu for instructions on how to install in Python.

Open up a terminal, and then run pip install on the dependencies that the H2O Python module is built on top of. You might need to run sudo pip install rather than pip install, depending on your environment. Once you've installed the dependencies, do a pip uninstall of any previous H2O modules you have installed. Finally, copy and paste the link on the website that will allow you to pip install the latest H2O module.

Once you're finished with the installation, we can move on to the demonstration part of the tutorial. For this part, you need to have IPython Notebook installed. To run an example in IPython, pull the prostate demo from our GitHub repository, as well as the prostate dataset needed to run the notebook. To download the notebook, do a wget; to get the dataset, again do a wget, this time of the prostate dataset sitting in S3.
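The prerequisites and the order of the pip steps described above can be sketched in Python. This is a minimal sketch, not the official installer: the function names are hypothetical, the dependency list and the install link are placeholders that must be taken from the downloads page, and the script assumes Python 3.3 or newer (for `shutil.which`) even though the walkthrough itself accepts Python 2.7. It only checks that `java` is on the PATH and does not verify the Java version.

```python
import shutil
import subprocess
import sys


def check_prerequisites(min_python=(2, 7)):
    """Return a dict mapping each prerequisite name to True or False."""
    results = {}
    # Python 2.7 or newer, as stated in the walkthrough.
    results["python"] = sys.version_info[:2] >= min_python
    # Only checks that a java executable is on PATH; the version is not parsed.
    results["java"] = shutil.which("java") is not None
    # Ask the current interpreter for pip to avoid PATH mix-ups.
    try:
        exit_code = subprocess.call(
            [sys.executable, "-m", "pip", "--version"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        results["pip"] = exit_code == 0
    except OSError:
        results["pip"] = False
    return results


def install_commands(dependencies, h2o_link, use_sudo=False):
    """Build the three pip commands in the order the walkthrough gives.

    `dependencies` and `h2o_link` are placeholders: copy the real values
    from the Python tab of the downloads page.
    """
    pip = (["sudo"] if use_sudo else []) + ["pip"]
    return [
        pip + ["install"] + list(dependencies),  # 1. dependencies first
        pip + ["uninstall", "h2o"],              # 2. remove any older module
        pip + ["install", h2o_link],             # 3. install the latest module
    ]


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print("%-6s %s" % (name, "found" if found else "MISSING"))
```

Calling `install_commands(["<dependency>"], "<link from downloads page>", use_sudo=True)` returns the same three commands prefixed with `sudo`, which matches the note about environments that need elevated permissions. The commands are returned rather than executed so they can be reviewed first.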

Run IPython Notebook, and then open up the prostate_GLM IPython Notebook you just downloaded. From here, the only thing you'll need to change is the path to the dataset that we just downloaded. Once that is done, you can execute each of the cells using Ctrl+Enter. Once you have the model output, you have finished running H2O in Python.
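For readers who want to see roughly what such a notebook does, here is a hedged sketch of fitting a GLM on the prostate data. It is not the contents of the demo notebook: it uses the estimator-style API of more recent H2O releases, which differs from the release shown in this walkthrough, and the choice of predictor columns, the `run_prostate_glm` name, and the example path are assumptions made for illustration.

```python
# Column names of the prostate dataset; which ones the demo notebook
# actually uses as predictors is an assumption.
RESPONSE = "CAPSULE"
PREDICTORS = ["AGE", "RACE", "PSA", "VOL", "GLEASON"]


def run_prostate_glm(csv_path):
    """Fit a binomial GLM on the prostate data and return the model."""
    # Imported lazily so this file can be loaded without h2o installed.
    import h2o
    from h2o.estimators.glm import H2OGeneralizedLinearEstimator

    # Start or connect to a local H2O instance (this is where Java is needed).
    h2o.init()

    # The dataset path is the one thing to change, as in the notebook.
    prostate = h2o.import_file(csv_path)

    # CAPSULE is a 0/1 outcome; make it categorical for a binomial model.
    prostate[RESPONSE] = prostate[RESPONSE].asfactor()

    model = H2OGeneralizedLinearEstimator(family="binomial")
    model.train(x=PREDICTORS, y=RESPONSE, training_frame=prostate)
    return model
```

A call such as `model = run_prostate_glm("/path/to/prostate.csv")` (a hypothetical path) followed by `print(model)` would show the model output that the walkthrough treats as the sign of a working installation.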