
H2O Quick Start on Hadoop 

 

H2O 3.0 is for enterprise and open-source use, with easy installations for Spark, Python, R, YARN, Hadoop 1, Amazon EC2, Maven, laptops, and standalone clusters. Users can now use H2O Flow, a notebook-style graphical user interface with command-line computing, to access fast, scalable H2O algorithms.

 

Speakers:

Video Narrator


 

 

Narrator:

 

This tutorial walks you through the installation of H2O on your Hadoop server and shows you how to launch a multi-node H2O cluster using MapReduce or YARN. The prerequisite for this walkthrough is an installation of Java version 1.6 or newer. To start, navigate to our website at h2o.ai in your web browser. Click on the download button, which will land you on the downloads page. Scroll down to the latest H2O dev release. Hit the fourth tab on the top menu for instructions on installing on Hadoop. Depending on your distribution of Hadoop, choose the right installation link. If you are unsure about the version of Hadoop you are using, go to your Hadoop server and run the hadoop version command. In my case, I am running HDP 2.1, so I would choose the HDP 2.1 zip file. Going back to the box where you run your Hadoop commands, do a wget of the release. This release comes with an H2O driver that allows you to launch on Hadoop. Unzip the installation file and cd into the folder. Going back to our website with the instructions, copy and paste the hadoop jar command that is necessary to launch an H2O instance.
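
For reference, here is a hedged sketch of the steps just described, assuming HDP 2.1; the actual download URL and filename come from the h2o.ai downloads page, and the placeholders below stand in for them:

```bash
# Check which Hadoop distribution and version this server is running
hadoop version

# Download the H2O release built for your distribution
# (copy the real URL from the h2o.ai downloads page; this one is a placeholder)
wget https://h2o.ai/download/<h2o-release-for-hdp2.1>.zip

# Unpack the release and move into the folder it creates
unzip <h2o-release-for-hdp2.1>.zip
cd <h2o-release-folder>
```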

 
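The downloads page shows the exact hadoop jar command for your release. As a rough illustration only (the driver jar name, node count, memory size, and HDFS output directory below are placeholders, not the definitive command), launching a small cluster looks roughly like this:

```bash
# Launch an H2O cluster on Hadoop: one node with 1 GB of memory per node,
# writing driver output to an HDFS directory that does not already exist
hadoop jar h2odriver.jar -nodes 1 -mapperXmx 1g -output hdfsOutputDir
```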

From the command line, you'll be able to vary the number of nodes that you launch as well as the memory size of each node. In our particular case, we're going to launch a cluster with one node and one gig of memory on the node. It is important to point out that you will need to change the output HDFS directory each time you launch H2O. Once the cluster is up, choose any one of the nodes that you've launched and navigate to its address in your web browser to access Flow. To gain access to the cluster from R or Python, simply pass the IP address and port of any one of the instances in the cluster to the h2o.init function. You have just finished launching H2O on top of Hadoop.
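
As a minimal Python sketch of that last step (the IP address below is a placeholder for one of your launched nodes, and 54321 is H2O's default port; the R call is analogous):

```python
import h2o

# Connect to the running H2O cluster by pointing h2o.init at any one
# of the launched nodes (placeholder IP; 54321 is H2O's default port)
h2o.init(ip="10.0.0.1", port=54321)
```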