H2O vs R - Winning KDDCup98 in 10 minutes with H2O
by Team | December 17, 2014

H2O is a scalable and open-source math and machine learning platform for big data. It can handle much bigger datasets and run a lot faster than R/SAS even on a single machine. How does the modeling experience with H2O differ from the experience using traditional tools such as R/SAS? This blog answers exactly this question. In particular, ...

H2O WORLD 2014 Machine Learning IS Fun.
by Team | December 03, 2014

Earlier this year I found myself sitting among 100 or so data scientists at a meetup, eating a taco and listening to how a former particle physicist found the Higgs Boson particle over a weekend using commodity hardware and open source software. Even more impressive was his ability to answer the unrelenting questions from the audience ...

What if the S language had been copyrighted?
by Team | December 01, 2014

At H2O World 2014, we were fortunate to have Josh Bloch give a reprise of his A Brief, Opinionated History of the API talk that he first delivered at SPLASH 2014. (For those with the time, you can watch a 47 minute 21 second recording of this talk on the YouTube channel.) This is one of those subjects that I wish I could say m...

Key Takeaways from the World's Top Kagglers
by Team | November 25, 2014

Ever wondered why data science is so competitive? After a highly successful H2O World event last week, we’re shining some light on what we’ve learned from some of the world’s best data scientists and how they go about winning data science challenges such as those on Kaggle. In case you missed it, we held a Competitive Data Science Panel ...

Predictive Modeling at Scale: Cisco Modernizes Predictive Model Production with H2O (joint work with Lou Carvalheira)
by Team | November 21, 2014

Cisco’s Challenges: Cisco is the global leader in networking. It is a company that has long embraced the power of predictive analytics. In a typical quarter, Cisco’s Strategic Marketing Organization builds and deploys around 60,000 predictive models to treat each of 160M+ companies it maintains in its database. These models generate predict...

Introducing Flow!
by Team | November 19, 2014

After several weeks of active development, we’re proud to unveil H2O Flow, our brand new, open-source user interface for H2O! We used it live during our H2O World keynote today, and this blog post is a brief introduction to some of the core ideas behind H2O Flow. H2O Flow is a web-based interactive computational environment where you can ...

Competitive Data Science, Kaggle, KDD and other Sports
by Team | November 16, 2014

This panel promises to be just brilliant and full of sparks! Panelists: Guocong Song, Jose Guerrero, Mark Landry, Arno Candel

Hacking Algorithms in H2O With Cliff
by Team | November 16, 2014

Interested in Hacking Algorithms with me? I’ll be at H2O World all day Tuesday looking to join you in doing some fun hacking. Here are 3 sample starter hacks to help you get over the H2O learning curve: Hacking KMeans, Hacking Quantiles, and Hacking Grep. All 3 take you step-by-step through the process of building a new algorithm into H2O’...

Hacking Algorithms into H2O: Grep
by Team | November 11, 2014

This is a presentation of hacking a simple algorithm into the new dev-friendly branch of H2O, h2o-dev. This is one of three “Hacking Algorithms into H2O” blogs. All of these blogs start out the same: getting the h2o-dev code and building it. They are the same until the section titled Building Our Algorithm: Copying from the Example, and ...

Hacking Algorithms into H2O: Quantiles
by Team | November 10, 2014

This is a presentation of hacking a simple algorithm into the new dev-friendly branch of H2O, H2O 3.0. This is one of three “Hacking Algorithms into H2O” blogs. All three blogs start out the same: getting the h2o-3 code and building it. They are the same until the section titled Building Our Algorithm: Copying from the Example, and then ...

Hacking Algorithms into H2O: KMeans
by Team | November 08, 2014

This is a presentation of hacking a simple algorithm into the new dev-friendly branch of H2O, h2o-dev. This is one of three “Hacking Algorithms into H2O” blogs. All blogs start out the same: getting the h2o-dev code and building it. They are the same until the section titled Building Our Algorithm: Copying from the Example, and then the ...

Sparkling Water on YARN Example
by Team | November 01, 2014

Follow these easy steps to get your first Sparkling Water example to run on a YARN cluster. This example uses Hortonworks HDP 2.1. 1. Assumptions. Installed: Java 1.7+, YARN cluster. Note: In the current version of Sparkling Water running on YARN, the cluster formation requires multicast to work for the H2O nodes to find each oth...

Running Your First Droplet on H2O
by Team | October 28, 2014

A number of us were at Strata in New York City this October, and one of the major benefits of these events is getting lots of in-person time with people who use your product. Michal and Amy spent some time with a developer who was trying to build on top of the h2o-dev repo, and we realized that we didn’t have a really basic example yet of ...

Sparkling Water Tutorials
by Team | September 29, 2014

Please follow the updated version of the tutorials here. H2O is hosting a meetup tomorrow at our office where attendees are encouraged to hack away with us as we run Deep Learning on Sparkling Water. If you haven’t already read all about H2O’s integration into Spark then get started with How Sparkling Water Brings H2O to Spark and Sparkling W...

How to use R, H2O, and Domino for a Kaggle competition
by Team | September 23, 2014

Guest post by Jo-Fai Chow. The sample project (code and data) described below is available on Domino. If you’re in a hurry, feel free to skip to: Tutorial 1: Using Domino; Tutorial 2: Using H2O to Predict Soil Properties; Tutorial 3: Scaling up your analysis. Introduction: This blog post is the sequel to TTTAR1 a.k.a. An Introduction t...

How Sparkling Water Brings H2O to Spark
by Team | September 22, 2014

This post provides a high-level introduction to the current integration plan between H2O and Spark. This is an ongoing engineering effort involving collaboration between the open source teams, and describes what is currently underway. 1. Overall Approach: The first question one might ask is “Why”? What does one, as a user, gain from trying ...

Sparkling Water!
by Team | September 05, 2014

H2O & Scala & Spark: Spark is an up-and-coming new big data technology; it’s a whole lot faster and easier than existing Hadoop-based solutions. H2O does state-of-the-art Machine Learning algorithms over Big Data – and does them Fast. We are happy to announce that H2O now has a basic integration with Spark – Sparkling Water! This is...

Introducing H2O Lagrange to R
by Team | August 26, 2014

From my perspective the most important event that happened at useR! 2014 was that I got to meet the 0xdata team and now, long story short, here I am introducing the latest version of H2O, labeled Lagrange, to the R and greater data science communities. Before joining 0xdata, I was working at a competitor on a rival project and w...

useR! 2014
by Team | July 15, 2014

Two weeks ago we attended the useR! conference hosted on the UCLA campus. I landed in Los Angeles at 8:30 P.M. on Sunday, June 29, and met up with Amy, another math hacker at 0xdata. After a harrowing cab ride we arrived on the UCLA campus at Sunset Village where we would be lodging for the next 3 evenings. Having just got the h2o R packag...

Learn to manage, munge, and model big data with H2O on the Hortonworks Sandbox
by Team | June 26, 2014

Working with big data might seem like a daunting task if, like me, you’ve spent the majority of your college years doing pencil and paper proofs. Big data for me was anything that took longer than 30 minutes to ingest into single-threaded R. For mathematicians and statisticians looking to understand widely used data platforms like Hadoop f...

H2O - The Killer-App on Spark
by Team | June 25, 2014

object AirlinesDemo extends Demo {
  override def run(conf: DemoConf): Unit = {
    // Prepare data
    // Dataset
    val dataset = "data/allyears2k_headers.csv"
    // Row parser
    val rowParser = AirlinesParser
    // Table name for SQL
    val tableName = "airlines_table"
    // Select all flights with destination == SFO
    val query = """SELECT * FROM airlin...

A K/V Store For In-Memory Analytics, Part 2
by Team | May 23, 2014

This is a continuation of a prior blog on the H2O K/V Store, Part 1. A quick review on key bits going into this next blog: H2O supports a very high performance in-memory Distributed K/V store. The store honors the full Java Memory Model with exact consistency by default. Keys can be cached locally for both reads & writes. A typi...

SJSU Tutorial on H2O and Random Forest
by Team | April 25, 2014

Our friends over at SJSU added this post to their course website after the H2O team stopped by earlier this semester to talk about H2O. We’ve reposted it here, but you can find the original at: Oxdata (H2O) Tutorial. Posted on April 24, 2014 by bigsjsu. Oxdata (H2O) Tutori...

Tableau: Math Hacker Amy Talks Big Data Visualization TONIGHT
by Team | April 17, 2014

Anqi and I are back from NY, and we brought Amy with us – she's incredible, and she's giving a presentation at our meetup tonight, where she will talk about Big Data, visualization, and presenting interpretable graphics. So we're looking forward to seeing you tonight – the details are here: ...

MLConf NY - Friday, April 11: Demo of Workflow and Collective Use Case
by Team | April 07, 2014

This Friday H2O will be at MLconf to give a live demo, introduce a customer use case, and talk about the implications of model specification in production. If you don’t get a chance to stop by our booth, or come see our demo, you can find the presentation slides on the MLconf website (they will be posted on Friday, Apr...

Google-scale Machine Learning & Deep Learning gets principal platform in Apache Mahout with Spark and H2O
by Team | March 27, 2014

H2O’s vision is direct and simple: scaling machine learning for powering intelligent applications. Our focus is distributed machine learning and a fully-featured set of industrial grade algorithms. Apache Mahout is where people learn their chops in Machine Learning. Like R, it’s the “hello world” first place many new users get exposed to ...

Hang out with us tomorrow- Mar 26: H2O Math Hackers Present: Model Specification
by Team | March 26, 2014

Anqi and Irene present a hack-along preview of their upcoming talk at MLConf. Come join us as we talk about the implications of model specification, and walk through how to frame models when asking different questions of the same data. ...

Meetup TONIGHT - Arno Presents: Deep Learning: Theory and Practice!
by Team | March 26, 2014

If you were unable to join us on Thursday 3/21 because of the high volume of interest, we are offering the same meeting again! In this talk, Arno Candel, Physicist & Hacker, will break down the basics of deep learning in theory & present implementation, early results from using MLP with Adaptive learning as implemented in...

In-memory Big Data: Spark + H2O
by Team | March 25, 2014

Big Data has moved in-memory. Customers using SQL in their Join & Munging efforts via SHARK and Apache Spark need to use Regressions and Deep Learning. To make their experiences great & seamlessly weave SQL workflows with Data Science and Machine Learning, we are architecting a simple RDD data import-export in H2O. This brings c...

Data Munging in H2O+R
by Team | March 24, 2014

Over the weekend we fielded a question from one of our users about the basics of data munging in H2O through R – and it was a good question, so I wanted to share the response with a wider audience – namely you guys. There are a few quick things about data munging in H2O+R: – It often looks and feels like you are manipulating data in R; we...

H2O Architecture
by Team | March 20, 2014

This is a top-level overview of the H2O architecture. H2O does in-memory analytics on clusters with distributed parallelized state-of-the-art Machine Learning algorithms. However, the platform is very generic, and very, very fast. We’re building Machine Learning tools with it, because we think they’re cool and interesting, but the plat...

H2O at Code Mesh - API for in-memory Analytics - Cliff
by Team | February 25, 2014

Video link here: API for in-Memory Analytics – CodeMesh ...

Hanging out at ShareThis
by Team | February 24, 2014

We spent some time with the engineers and data scientists at ShareThis last week, and had a great time learning about their use cases, and getting H2O running on their data. It's nice to know that the ShareThis team had ...

Generate A Mandelbrot Set In H2O
by Team | February 15, 2014

Roses are red, Violets are ~ Blue, H2O is sweet, And fractals are too! $$z_n = z_{n-1}^P + c$$ where c is a “candidate” complex number. (Typically you’ll see $$P = 2$$; that’s what we’ll do too.) We set the size of the sequence to the number of iterations we want, and measure convergence by looking at the modulus of $$z_n$$ ...
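The recurrence in this excerpt can be sketched in a few lines of plain Python. This is only an illustration, not the code from the post: the function names, the starting value $$z_0 = 0$$, the iteration cap of 50, and the escape bound of 2 (the usual choice for $$P = 2$$) are all assumptions.

```python
from typing import Optional


def escape_step(c: complex, power: int = 2, max_iter: int = 50,
                bound: float = 2.0) -> Optional[int]:
    """Iterate z_n = z_{n-1}**power + c starting from z_0 = 0.

    Returns the first n at which |z_n| exceeds `bound`, or None if the
    modulus stays within `bound` for all `max_iter` iterations.
    The defaults (max_iter=50, bound=2.0) are illustrative assumptions.
    """
    z = 0j
    for n in range(1, max_iter + 1):
        z = z ** power + c
        if abs(z) > bound:
            return n
    return None


def stays_bounded(c: complex, power: int = 2, max_iter: int = 50,
                  bound: float = 2.0) -> bool:
    """Treat a candidate c as in the set if the sequence never escapes."""
    return escape_step(c, power, max_iter, bound) is None
```

For example, `stays_bounded(0j)` and `stays_bounded(-1 + 0j)` both hold, since those sequences stay at 0 or cycle between -1 and 0, while `c = 1` gives 1, 2, 5 and exceeds the bound at the third step.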

And you know, we're on each other's team - Lorde
by Team | February 15, 2014

Walking past giant anti-burner consumerist Strata booths, I was struck by Lorde's recent masterpiece. The Big Data Palace needs a release. No hype, it needs product. Product is the release. The emperor has no clothes and no one seems to dare. You see the propaganda machine. Working lock-step to Strata / stage setters. Darling startups tha...

A K/V Store For In-Memory Analytics: Part 1
by Team | February 06, 2014

We are building in-memory analytics (no surprise). What may be a surprise, though, is that there’s a full-fledged high-performance Key/Value store built into H2O and that is central to both our data management and our control logic. We use the K/V store in two main ways: All the Big Data is stored striped acros...

I'll let you be in my model, if I can be in yours.
by Team | February 05, 2014

Bob Dylan* said that. User-centric modeling is here to stay. Rich insights are available when we combine knowledge of the world with knowledge of your customer. Yes, one at a time. However, users tangle in a network of events and overlap & become part of each other's models. Sensor data can avoid granularity mismatch by building models...

Come visit H2O at Strata Booth 919
by Team | February 03, 2014

Greetings H2O friends and fans! Let’s do the data dance at Strata Santa Clara, Feb. 11-13 and check out our latest H2O Prediction Engine demo. We will be exhibiting at booth 919 and offering a 20% discount off registration. The show is slated to sell out, so be sure to register today and get your 20% discount with our code: 0XDATA20,...

Hack data with our resident data scientist, Earl
by Team | January 30, 2014

On the last Thursday of every month: hack data with Earl Hathaway, our resident data scientist. ...
