Earlier this year I found myself sitting among 100 or so data scientists at a meetup, eating a taco and listening to how a former particle physicist found the Higgs Boson particle over a weekend using commodity hardware and open source software. Even more impressive was his ability to answer the unrelenting questions from the audience that came before, during and after the talk. Unfazed, the H2O team handled each question honestly and accurately without skipping a beat. Fast forward a few months and add a few hundred people, and I found myself in a very similar situation at H2O World at the Computer History Museum. This time there was not one speaker and one topic but the entire team, offering a full-day crash course in machine learning to over 300 data scientists ready to power the next era of machine learning for smarter business applications.
“How did you manage to get the venue to use your company colors?” was a question I was frequently asked, and in response I simply stated that all of this was meant to happen. The truth is much more interesting, and I’ll leave that for another blog. As you move through the museum you quickly realize that what is on display is not the groundbreaking technology itself, but the tools that enable people to share and leverage information in many different applications. By holding the first H2O World conference in the Computer History Museum, we put the full weight of the information management history that has preceded us on display. It was a clear reminder of how far we have really come, from the age when computers took up entire buildings to one where they live in the cloud or in the comfort of your laptop. In the era of machine learning, open source, and distributed computing, those who are working together to solve hard problems out in the open, and not in isolation, are leading the way.
“It’s not what you know, it’s what you can prove,” is a mantra from Denzel Washington in the movie Training Day. In the same spirit, data science and the tools that power it mean nothing until you prove what they can do. Machine learning and data science are hard. For this reason, we dedicated an entire day to easing people of all backgrounds into the field and getting them training models by the end of the day. We created a comprehensive program with an easy-to-follow manual you can see at learn.h2o.ai. In it, we covered how to create your own sandbox, setting up big data environments, R with H2O, Supervised Learning, Unsupervised Learning, and a number of advanced topics. Not only was H2O Training available to our attendees, we also live streamed the event and recorded it for future reference on our H2O World video channel. Over 3000 people have taken the training since launch, and more continue to do so at a surprising rate.
“Having sat through many, many presentations in my career, I have never experienced anything like this,” was whispered to me as Nachum Shacham described how H2O came to the rescue at PayPal.
We actually saw two presentations from PayPal, one from Nachum Shacham and the other from Venkatesh Ramanathan. Both talks highlighted the challenges of conducting big data science with distributed systems and complex machine learning algorithms, and how they were able to overcome them with H2O. This was a common theme, as we also heard Lou Carvalheira of Cisco describe their “Model Factory” and Hassan Namarvar of ShareThis describe their Ad Platform, along with many of our other customers. In addition to customer stories, we heard from a number of community members, ranging from notable researchers and distinguished professors to independent contributors and strategic investors. The complete presentations are now available to view.
Showcasing an all-star cast of machine learning and artificial intelligence experts and renowned data scientists, H2O World brings together its broad Open Source community to explore how the world’s most innovative companies are building their own digital brains. From building predictive models to running machine learning algorithms in Big Data environments, the event provides core insights into practical data science applications, including use models in adtech, transportation, marketing research, fraud detection, behavioral analytics, and investment trends in AI.