From my perspective, the most important event that happened at useR! 2014 was that I got to meet the 0xdata team and now, long story short, here I am introducing the latest version of H2O, labeled Lagrange (188.8.131.52), to the R and greater data science communities. Before joining 0xdata, I was working at a competitor on a rival project and was repeatedly asked why my generalized linear model analytic didn’t run as fast as H2O’s GLM. The answer then, as now, is the same: H2O has a cutting-edge distributed in-memory parallel computing architecture. The difference is that I no longer receive an electric shock every time I say so.
For those hearing about H2O for the first time, it is an open-source distributed in-memory data analysis tool designed for extremely large data sets, and the H2O Lagrange (184.108.40.206) release provides scalable solutions for the following analysis techniques:
In my first blog post at 0xdata, I wanted to keep it simple and make sure R users know how to get the h2o package, which is cross-referenced on the High-Performance and Parallel Computing and Machine and Statistical Learning CRAN Task Views, up and running on their computers. To do so, open an R console of your choice and type:
# Download, install, and initialize the H2O package
install.packages("h2o", repos = c("http://h2o-release.s3.amazonaws.com/h2o/rel-lagrange/11/R", getOption("repos")))
library(h2o)
localH2O <- h2o.init()
# List and run some demos to see H2O at work
demo(package = "h2o")
After you are done experimenting with the demos in R, you can open up a web browser to http://localhost:54321/ to give the H2O web interface a once-over and then hop over to 0xdata’s YouTube channel for some in-depth talks.
Over the coming weeks, we at 0xdata will continue to blog about how to use H2O through R and other interfaces. If there is a particular use case you would like to see addressed, join our h2ostream Google Groups conversation or e-mail us at email@example.com. Until then, happy analyzing.