How to Build a Machine Learning App Using Sparkling Water and Apache Spark
October 3, 2015 | Uncategorized | The Sparkling Water project is nearing its one-year anniversary, which means Michal Malohlava, our main contributor, has been very busy for the better part of this past year. The Sparkling Water project combines H2O machine-learning algorithms with the execution power of Apache Spark. This means that the project is heavily dependent on two of the […]
How I used H2O to crunch through a bank's customer data
September 20, 2015 | Uncategorized | This entry was originally posted here. Six months back I gingerly started exploring a few data science courses. After successfully completing some of the courses, I was restless. I wanted to try my data hacking skills on some real data (read: Kaggle). I find that competing in hackathons helps you to benchmark yourself against your […]
Fast, Scalable Machine Learning- Now with New and Improved Python API
September 4, 2015 | Uncategorized | H2O now has a new Python API, based on valuable feedback provided by our community. Newest features include pandas-like dataframes for large, distributed computing; scikit-learn integration; and a machine learning pipeline API. Check out the tutorial below:
The Definitive Performance Tuning Guide for H2O Deep Learning (Ported scripts to H2O-3, results are taken from February’s blog)
August 28, 2015 | Uncategorized | Introduction: This document gives guidelines for performance tuning of H2O Deep Learning, both in terms of speed and accuracy. It is intended for existing users of H2O Deep Learning (which is easy to change if you’re not), as it assumes some familiarity with the parameters and use cases. Motivation: This effort was in part motivated […]
An Introduction to Data Science: Meetup Summary Guest Post by Zen Kishimoto
August 28, 2015 | Uncategorized | Originally posted on Tek-Tips forums by Zen here. I went to two meetups at H2O, which provides an open source predictive analytics platform. The second meetup was full of participants because its theme was an introduction to data science. Data science is a new buzzword, and I feel like everyone claims to be a data […]
Lending Club : Predict Bad Loans to Minimize Loss to Defaulted Accounts
August 3, 2015 | Uncategorized | As a sales engineer on the H2O.ai team I get asked a lot about the value add of H2O. How do you put a price tag on something that is open source? This typically revolves around the use cases; if a use case pertains to improving user experience or making apps that can improve internal […]
Introduction to Data Science using H2O – Chicago
August 3, 2015 | Uncategorized | Thank you to Chicago for the great meetup on 29 July 2015. Slides have been posted on GitHub. The links to the sample scripts and data are contained in the slides. If you have any further questions about H2O, please join our Google Group or chat with us on Gitter. The slides are also available […]
useR! Aalborg 2015 conference
July 16, 2015 | Uncategorized | The H2O team spent most of the useR! Aalborg 2015 conference at the booth giving demos and discussing H2O. Amy had a 16 node EC2 cluster running with 8 cores per node, making a total of 128 CPUs. The demo consisted of loading large files in parallel and then running our distributed machine learning algos […]
KFold Cross Validation With H2O-3 and R
July 9, 2015 | Uncategorized | This blog post also explains the solution to a Google Stream question we received. Note: KFold Cross Validation will be added to H2O-3 as an argument soon. This is a terse guide to building KFold cross-validated models with H2O using the R interface. There's not very much R code needed to get up and running, […]