June 7th, 2013

Chocolate Cake


Chocolate Cake (Wednesday, June 5, 2013) 

You know how sometimes you have one bite of really good chocolate cake, or a really amazing peach, and totally assume that you could eat another 30 lbs of it without regard for good manners or physical limitations? Yeah. Diminishing marginal returns dictate that the last bite almost never tastes as good as the first one: having a little and having a lot are different.

Similarly, ingesting 1,000 bytes of data and ingesting 1 byte are pretty different, and when you’re used to little bytes and start fooling around with the big ones, the differences might not be immediately obvious or intuitive (maybe they are, and if that’s the case, awesome! Now go eat your cake).

When I’m trying to make sense of a problem, I like to start with a small example and work through it to get a feel for the mechanics. With Big Data this gets a little weird, both because we’re almost always mining, so we don’t always know what to look for or expect, and because we need some intuition for how to get from the small to the big. To help with that, I am trying to build some intuitive explanations. You can find them by topic under the posts whose headers begin with “Big vs. Little…”

Sometimes using H2O to look at small data sets really makes no sense, for whatever reason. In those cases we’ll talk about why, and I’ll use R for comparison. I’ll also provide the relevant output for each (so that you can see how to get from one to the other). If you’re not familiar with R, go here. Additionally, it’s worth mentioning that I’m tackling one set of assumptions at a time, so in general I’ll work as though we are doing ad-hoc analysis rather than post-hoc analysis. There are some super cool differences between mucking and mining, but I want to talk about those separately.
