July 9th, 2013

Building A TB-Scale Math Platform @ Uberconf 2013, Denver

Building A TB-Scale Math Platform
Datasets have grown to PB scale, but the modeling you can do on them has been limited to a single node (e.g. R, SAS), stuck inside the database, or slowed to hours on Hadoop-like technologies. We have built a simple clustering package and are using it to do distributed analytics across the sum of all RAM in a cluster.

This talk focuses on how the clustering technology, plus a Java-based vector math API, is being used to build full algorithms like GLM/GLMNET, Random Forest and K-means. These algorithms are complex multi-pass programs, and traditional distributed programming models expose the distributed boundaries, making the algorithms hard to reason about. We have a basic JDK for doing at-scale math: we can run most Plain Olde Java in (distributed) inner loops and communicate via a K/V store with exact Java Memory Model consistency (not lazy consistency). Adding more CPUs makes these algorithms run faster, and adding more RAM allows larger datasets. We are bringing back Moore’s Law!
Cliff will be presenting.
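
To give a rough feel for what "plain Java in a distributed inner loop" might look like, here is a minimal sketch of one K-means pass written as a map over data chunks followed by a reduce of the per-chunk partial results. It is not the platform's actual API: the class `ChunkedKMeans`, the `Partial` type and the methods `mapChunk` and `step` are hypothetical names invented for this illustration, the data is one-dimensional to keep it short, and a single-JVM parallel stream stands in for the cluster.

```java
import java.util.Arrays;

/** Hypothetical sketch: one K-means (Lloyd) pass as map-over-chunks plus reduce. */
final class ChunkedKMeans {

    /** Per-chunk partial result: value sums and point counts for each cluster. */
    static final class Partial {
        final double[] sums;
        final long[] counts;

        Partial(int k) {
            sums = new double[k];
            counts = new long[k];
        }

        /** Reduce step: combine two partials into a new one (inputs are not mutated). */
        Partial merge(Partial other) {
            Partial out = new Partial(sums.length);
            for (int c = 0; c < sums.length; c++) {
                out.sums[c] = sums[c] + other.sums[c];
                out.counts[c] = counts[c] + other.counts[c];
            }
            return out;
        }
    }

    /** Map step: an ordinary inner loop over one chunk of 1-D points. */
    static Partial mapChunk(double[] chunk, double[] centers) {
        Partial p = new Partial(centers.length);
        for (double x : chunk) {
            int best = 0;
            for (int c = 1; c < centers.length; c++) {
                if (Math.abs(x - centers[c]) < Math.abs(x - centers[best])) {
                    best = c;
                }
            }
            p.sums[best] += x;
            p.counts[best]++;
        }
        return p;
    }

    /** One full pass: map every chunk in parallel, reduce, then recompute the centers. */
    static double[] step(double[][] chunks, double[] centers) {
        Partial total = Arrays.stream(chunks)
                .parallel()
                .map(chunk -> mapChunk(chunk, centers))
                .reduce(new Partial(centers.length), Partial::merge);
        double[] next = new double[centers.length];
        for (int c = 0; c < next.length; c++) {
            // An empty cluster keeps its previous center.
            next[c] = total.counts[c] == 0 ? centers[c] : total.sums[c] / total.counts[c];
        }
        return next;
    }

    public static void main(String[] args) {
        // Each inner array plays the role of a chunk that would live on a different node.
        double[][] chunks = {{1, 2}, {3, 10}, {11, 12}};
        double[] centers = {0, 9};
        for (int pass = 0; pass < 3; pass++) {   // the multi-pass structure of the algorithm
            centers = step(chunks, centers);
        }
        System.out.println(Arrays.toString(centers));   // prints [2.0, 11.0]
    }
}
```

In this sketch the author of the algorithm writes only the inner loop and the merge, and where the chunks live is left to the runtime. That is one way to read the claim above that the distributed boundaries need not be exposed.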
