
By: H2O.ai
Since we've been fooling around with the MNIST data set quite a bit lately (Spence is using it in benchmarking), I've been following the leaderboard and methods for the ongoing Kaggle competition built around the same data. It's really amazing to see what people come up with. But of course, the whole point of H2O is that one need not devote days on end to finding clever tricks with limited generalizability to other predictive problems. So while Spence benchmarks speed, I made short work of a submission that let H2O do all of the heavy lifting, to get a sense of how we're doing on predictive power for real. In the time it took me to grab take-out and coffee, H2O built a model of 500 trees and generated a submission with less than 5% prediction error. That was on our server with 20 GB of memory; it would take a bit longer on my computer, which has less memory allocated. All in all, not a bad lunch break.