GLM Bells and Whistles Part 2: Analysis and Results from Million Songs Data
July 15, 2013 Uncategorized [EN] Using the Million Songs Data, we want to characterize a subset of the songs. To do this, we’re going to run a binomial regression in H2O’s GLM. The approach used to characterize songs from the ’90s is the same one you can apply to your own data to characterize your customers relative to some larger group. In […]
GLM and K means to find Social Response Bias – Dating and Fibbers
July 12, 2013 Uncategorized [EN] In any field where data collection depends on what your clients, customers, the public, or whoever else tell you, there’s the risk that people are big fat fibbers. This often happens because people respond the way they think they SHOULD rather than with their own personal truths. Social sciences and marketing people call this phenomenon social […]
The MillionSongs Data Part 1: Bells and Whistles of GLM in H2O
July 9, 2013 Uncategorized [EN] Using the Million Songs Data Set, I want to go from beginning to end through H2O's GLM tool. Note that the original data are large, so downloading and fiddling with the full data set can be quite painful if you just do it from your desktop. That said, you can find it here. It’s a good […]
Running analysis on the right data!
July 9, 2013 Uncategorized [EN] All in the day: Anqi Fu, our wickedly smart Math & Data Science hacker-intern from Stanford this summer, was characterizing GLMNet in R on sparse data and comparing it with other tools. We were using a data set predicting two-bedroom median rent based on neighborhoods from huduser.org. DATA: http://www.huduser.org/portal/datasets/fmr/CensusRentData/index.html She found the analysis brisk and […]
Building A TB-Scale Math Platform @ Uberconf 2013, Denver
July 9, 2013 Uncategorized [EN] Building A TB-Scale Math Platform. Datasets have gotten to PB-scale, but the modeling you can do has been limited to a single node (e.g. R, SAS), is stuck inside the database, or takes hours on Hadoop-like technologies. We have built a simple clustering package, and are using it to do distributed analytics on the sum of […]
Hands-on Data Science with H2O at GlobalBigDataConference
July 9, 2013 Uncategorized [EN] Experience a hands-on data hack session using H2O & R at BigDataBootCamp by GlobalBigDataConference. Every few months, Sridhar puts together a content-rich conference filled with a highly engaged audience. This weekend GlobalBigDataConference is running a BigDataBootCamp; tickets are on sale. Sri brings H2O & R to this audience, munging a couple of datasets for insight. […]
Data Science is NOT Rocket Science – H2O at Big Data Cloud
July 9, 2013 Uncategorized [EN] DJ Das brings Sri to talk about H2O by 0xdata to the Big Data Cloud Meetup on July 10, 2013. Venue: 3200 Coronado Drive, Santa Clara
Age of the Intelligent Apps Ahead
July 3, 2013 Uncategorized [EN] The Age of the Intelligent Apps is here. Let's gear up! Businesses are continuously collecting data from... yes, applications & sensors. Applications are the key to data creation. The future of applications is to analyze data in motion: learn the rules of the game at creation, backed by a super-intelligent model built from historical data! A […]