October 28th, 2014

Running Your First Droplet on H2O

A number of us were at Strata in New York City this October, and one of the major benefits of these events is getting lots of in-person time with people who use your product.

Michal and Amy spent some time with a developer who was trying to build on top of the h2o-dev repo, and we realized that we didn’t have a really basic example yet of using an h2o-dev artifact as a dependency in a brand new project.

So Michal put one together for everyone to share, and I’ll walk you through a quick introduction in this post.

1. Cloning the examples repository

The h2o-droplets repository on GitHub contains some very simple starter projects for different languages. Let’s get started by cloning the h2o-droplets repository and changing to that directory.

$ git clone https://github.com/0xdata/h2o-droplets.git

Cloning into 'h2o-droplets'...
remote: Counting objects: 53, done.
remote: Compressing objects: 100% (33/33), done.
remote: Total 53 (delta 10), reused 39 (delta 0)
Unpacking objects: 100% (53/53), done.
Checking connectivity... done.

$ cd h2o-droplets

2. A quick look at the repo contents

As of this writing, the repo contains a java example and a scala example. Each of these is an independent starter project.

$ ls -al

total 8
drwxr-xr-x   6 tomk  staff   204 Oct 28 08:40 .
drwxr-xr-x  35 tomk  staff  1190 Oct 28 08:40 ..
drwxr-xr-x  13 tomk  staff   442 Oct 28 08:40 .git
-rw-r--r--   1 tomk  staff   322 Oct 28 08:40 README.md
drwxr-xr-x  11 tomk  staff   374 Oct 28 08:40 h2o-java-droplet
drwxr-xr-x  11 tomk  staff   374 Oct 28 08:40 h2o-scala-droplet

Let’s take a closer look at the java example:

$ find h2o-java-droplet -type f

h2o-java-droplet/.gitignore
h2o-java-droplet/build.gradle
h2o-java-droplet/gradle/wrapper/gradle-wrapper.jar
h2o-java-droplet/gradle/wrapper/gradle-wrapper.properties
h2o-java-droplet/gradle.properties
h2o-java-droplet/gradlew
h2o-java-droplet/gradlew.bat
h2o-java-droplet/README.md
h2o-java-droplet/settings.gradle
h2o-java-droplet/src/main/java/water/droplets/H2OJavaDroplet.java
h2o-java-droplet/src/test/java/water/droplets/H2OJavaDropletTest.java

As you can see, the java example consists of a build.gradle file, a java source file, and a java test file.

Look at the build.gradle file and you will see the following sections, which link the java droplet sample project to a version of h2o-dev published in Maven Central:

repositories {
    mavenCentral()
}
ext {
  h2oVersion = '0.1.8'
}
dependencies {
    // Define dependency on core of H2O
    compile "ai.h2o:h2o-core:${h2oVersion}"
    // Define dependency on H2O algorithm
    compile "ai.h2o:h2o-algos:${h2oVersion}"
    // Demands web support
    compile "ai.h2o:h2o-web:${h2oVersion}"
    // H2O uses JUnit for testing
    testCompile 'junit:junit:4.11'
}

This is all very standard gradle stuff. In particular, note that this example depends on three different H2O artifacts, all of which are built in the h2o-dev repository.

  • h2o-core contains base platform capabilities like H2O’s in-memory distributed key/value store and mapreduce frameworks (the “water” package).
  • h2o-algos contains math algorithms like GLM and Random Forest (the “hex” package).
  • h2o-web contains the browser web UI (lots of javascript).

3. Preparing the example for use in your IDE

Let’s walk through an example using IntelliJ IDEA. The first step is to use gradle to build your IntelliJ project file.

$ cd h2o-java-droplet
$ ./gradlew idea

:ideaModule
Download http://repo1.maven.org/maven2/ai/h2o/h2o-core/0.1.8/h2o-core-0.1.8.pom
Download http://repo1.maven.org/maven2/ai/h2o/h2o-algos/0.1.8/h2o-algos-0.1.8.pom
Download http://repo1.maven.org/maven2/ai/h2o/h2o-web/0.1.8/h2o-web-0.1.8.pom
[... many more one-time downloads not shown ...]
:ideaProject
:ideaWorkspace
:idea
BUILD SUCCESSFUL
Total time: 51.429 secs
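
The Download lines in that log follow the standard Maven repository layout: the dots in the group id become path separators, followed by the artifact name, the version, and a file named artifact-version.pom. Here is a minimal sketch of that mapping, in case it helps when you need to check whether a given H2O version has actually been published. The class name MavenCoordinates and the method pomUrl are made up for this illustration and are not part of H2O or gradle; the base URL is simply copied from the log above.

```java
public class MavenCoordinates {
    // Repository base URL, copied from the gradle download log (an assumption
    // for illustration; your build may resolve from a different mirror).
    static final String REPO = "http://repo1.maven.org/maven2";

    /** Builds the POM URL for a "group:artifact:version" coordinate string. */
    static String pomUrl(String coordinate) {
        String[] parts = coordinate.split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                "expected group:artifact:version, got " + coordinate);
        }
        // Dots in the group id become directory separators.
        String groupPath = parts[0].replace('.', '/');
        String artifact = parts[1];
        String version = parts[2];
        return REPO + "/" + groupPath + "/" + artifact + "/" + version
            + "/" + artifact + "-" + version + ".pom";
    }

    public static void main(String[] args) {
        // Same version and artifacts as the build.gradle shown earlier.
        String h2oVersion = "0.1.8";
        for (String artifact : new String[] {"h2o-core", "h2o-algos", "h2o-web"}) {
            System.out.println(pomUrl("ai.h2o:" + artifact + ":" + h2oVersion));
        }
    }
}
```

Running it prints the same three .pom URLs that appear in the log, one per H2O artifact.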

You will see three new files created with IDEA extensions (.iml, .ipr, and .iws). The .ipr file is the project file.

$ ls -al

total 168
drwxr-xr-x  15 tomk  staff    510 Oct 28 10:03 .
drwxr-xr-x   6 tomk  staff    204 Oct 28 08:40 ..
-rw-r--r--   1 tomk  staff    273 Oct 28 08:40 .gitignore
drwxr-xr-x   3 tomk  staff    102 Oct 28 10:03 .gradle
-rw-r--r--   1 tomk  staff   1292 Oct 28 08:40 README.md
-rw-r--r--   1 tomk  staff   1409 Oct 28 08:40 build.gradle
drwxr-xr-x   3 tomk  staff    102 Oct 28 08:40 gradle
-rw-r--r--   1 tomk  staff     23 Oct 28 08:40 gradle.properties
-rwxr-xr-x   1 tomk  staff   5080 Oct 28 08:40 gradlew
-rw-r--r--   1 tomk  staff   2404 Oct 28 08:40 gradlew.bat
-rw-r--r--   1 tomk  staff  33316 Oct 28 10:03 h2o-java-droplet.iml
-rw-r--r--   1 tomk  staff   3716 Oct 28 10:03 h2o-java-droplet.ipr
-rw-r--r--   1 tomk  staff   9299 Oct 28 10:03 h2o-java-droplet.iws
-rw-r--r--   1 tomk  staff     39 Oct 28 08:40 settings.gradle
drwxr-xr-x   4 tomk  staff    136 Oct 28 08:40 src

4. Opening the project

Since we have already created the project file, start up IDEA and choose Open Project.

[Screenshot: Open Project]

Choose the h2o-java-droplet.ipr project file that we just created with gradle.

[Screenshot: Java Droplet]

5. Running the test inside the project

Rebuild the project.

[Screenshot: Rebuild Project]

Run the test by right-clicking on the test name.

[Screenshot: Run Test]

Watch the test pass!

[Screenshot: Test Passed]

6. Summary

This small example demonstrated how to create a new project with H2O as a dependency. Thanks to Michal for putting this example together! If you are working on a good example you’d like to share with the community, please send us a note or make a pull request to the h2o-droplets repository.

Tell us about this or other topics that interest you by writing to h2ostream@googlegroups.com.
