Gradient Boosting Machine in III Acts: Trevor Hastie, Netflix & 0xdata
September 25, 2013 | Uncategorized
Gradient Boosting Machine in III Acts: Dr. Trevor Hastie, Netflix & 0xdata. Triple Header on Boosting & GBM: Act I: Trevor Hastie of Stanford Mathematical Sciences, the mathematician behind Lasso & GBM, speaks on the nuances of the algorithm. Act II: Cliff Click, CTO of 0xdata, the implementor of parallel and distributed GBM. Act III: […]
Even More MNIST
September 11, 2013 | Uncategorized
Since we've been fooling around with the MNIST data set quite a bit lately (Spence is using it in benchmarking), I've been following the leaderboard and methods for the ongoing Kaggle competition around the same data. It's really amazing to see what people come up with. But of course, the purpose of H2O is entirely […]
Replay: Modeling MNIST With RF Hands-on Demo
September 5, 2013 | Uncategorized
Last week Spencer put together a great hands-on session for modeling data using H2O (http://www.meetup.com/H2Omeetup/). This post is a write-up of the workflow for generating an RF model on MNIST data, for those of you who want to walk through the demo again or who missed the live-action version. I’m running through one of […]
Hands on Workshop: Hack Data With Math
August 28, 2013 | Uncategorized
Thursday night (August 29) at 7, resident math hacker Spencer A. is leading a hands-on workshop on using H2O to analyze real-world data. For those of you who are new to the math side of H2O, we have notes below to help you get prepared. H2O is a distributed math platform featuring a set […]
Big Data Science in H2O with R
August 21, 2013 | Uncategorized
Slides: Big Data Science with H2O in R, from Anqi Fu. We had a great turnout at our Meetup last night! We took a look at the H2O/R API, then dove right into a hands-on demo, where we imported, cleaned, and ran GLM on the airlines data set in H2O using R commands. Here are […]
Public Data Sets
August 16, 2013 | Uncategorized
For your data analysis pleasure, I give you a giant list of super cool publicly available data. If you’re looking at the data sets and wondering “now what?”, you can find this list AND tutorials on how to use H2O for analysis at the H2O docs page (here: http://docs.0xdata.com). You can also get a detailed […]
TCP Is Not Reliable
August 16, 2013 | Uncategorized
Been too long between blogs… “TCP Is Not Reliable”: what's THAT mean? Means: I can cause TCP to reliably fail in under 5 mins, on at least 2 different modern Linux variants and on modern hardware, both in our datacenter (no hypervisor) and on EC2. What does “fail” mean? Means the client will open […]