H2O Sparkling Water

The Best Machine Learning on Spark

Overview

The Best of Both Worlds with H2O and Spark

Sparkling Water lets users combine the fast, scalable machine learning algorithms of H2O with the capabilities of Spark. Spark is a powerful, general-purpose, open-source, in-memory data processing platform with tremendous momentum. H2O is an in-memory platform for machine learning that is reshaping how people apply math and predictive analytics to their business problems. Integrating these two open-source environments gives a seamless experience to users who want to run a query with Spark SQL, feed the results into H2O to build a model and make predictions, and then use those predictions again in Spark. Each step of the workflow can then use the tool best suited to it.

Key Features of Sparkling Water

  • Access to H2O Algorithms: Use H2O algorithms developed from the ground up for distributed computing, covering both supervised and unsupervised approaches, including Random Forest, GLM, GBM, XGBoost, GLRM, Word2Vec and many more.
  • Drive Computation from Scala, R, or Python: Drive computation from Scala, R, or Python and use the H2O Flow UI, providing an ideal machine learning platform for application developers.
  • Simple Deployment: Export models as POJOs or MOJOs for fast, accurate scoring in any environment, even when the models are very large.

How it Works

Distributed, In-Memory Machine Learning

Sparkling Water is designed to be executed as a regular Spark application. It provides a way to initialize H2O services on Spark and access data stored in data structures of Spark and H2O.
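As a rough illustration of what initializing H2O services on Spark and sharing data between the two can look like, here is a minimal PySparkling sketch. It assumes a recent Sparkling Water release, where `H2OContext.getOrCreate()` takes no arguments and the conversion methods are named `asH2OFrame` and `asSparkFrame` (older releases used different names and signatures). The helper names `round_trip` and `main`, the application name, and the sample rows are hypothetical and not part of Sparkling Water.

```python
def round_trip(hc, spark_df):
    """Copy a Spark DataFrame into H2O and back again.

    `hc` is assumed to be a running H2OContext; `spark_df` is a Spark DataFrame.
    """
    h2o_frame = hc.asH2OFrame(spark_df)       # Spark DataFrame -> H2OFrame
    spark_again = hc.asSparkFrame(h2o_frame)  # H2OFrame -> Spark DataFrame
    return h2o_frame, spark_again


def main():
    # Imported here so the helper above can be loaded without Spark installed.
    from pyspark.sql import SparkSession
    from pysparkling import H2OContext

    # Sparkling Water runs as a regular Spark application, so it starts
    # from an ordinary SparkSession.
    spark = SparkSession.builder.appName("sparkling-water-sketch").getOrCreate()

    # Start H2O services on top of the Spark application, or reuse them
    # if they are already running.
    hc = H2OContext.getOrCreate()

    # Hypothetical sample data, only for illustration.
    df = spark.createDataFrame([(1.0, "a"), (2.0, "b")], ["x", "label"])
    h2o_frame, spark_again = round_trip(hc, df)
    spark_again.show()
```

Calling `main()` requires a Spark installation with the PySparkling package available to the application; `round_trip` itself only relies on the two conversion methods of the context it is given.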

Advanced Machine Learning for Spark

Use H2O's algorithms for distributed in-memory computing with your existing Spark implementation.

Deploy Results in Spark

Results from H2O can easily be deployed using H2O low-latency pipelines or within Spark for scoring.
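The path from a Spark SQL query to an H2O model and back to Spark for scoring might be sketched as follows. This is an illustration under stated assumptions, not the product's reference workflow: `hc` is assumed to be a running `H2OContext` from a recent PySparkling release, `model` is assumed to be any untrained H2O estimator (for example `H2OGradientBoostingEstimator()` from `h2o.estimators`), and the function name `train_and_score_in_spark` along with its parameters is hypothetical.

```python
def train_and_score_in_spark(spark, hc, sql, label_col, model):
    """Query with Spark SQL, train in H2O, and return predictions to Spark.

    spark     -- an active SparkSession
    hc        -- a running H2OContext (assumed: recent PySparkling API)
    sql       -- a Spark SQL query that selects the features and the label
    label_col -- name of the column to predict
    model     -- an untrained H2O estimator, supplied by the caller
    """
    # 1. Select the training data with Spark SQL.
    rows = spark.sql(sql)

    # 2. Hand the result to H2O.
    frame = hc.asH2OFrame(rows)

    # 3. Every column except the label is treated as a feature.
    #    For classification, the label column usually also needs to be
    #    converted to a categorical type in H2O before training.
    features = [c for c in frame.columns if c != label_col]
    model.train(x=features, y=label_col, training_frame=frame)

    # 4. Score in H2O, then return the predictions as a Spark DataFrame
    #    so the rest of the Spark job can use them.
    predictions = model.predict(frame)
    return hc.asSparkFrame(predictions)
```

The sketch scores the same rows it trained on only to stay short; a real job would typically score a separate data set.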

[Figure: Sparkling Water architecture]

Enterprise Support

When AI becomes mission critical for enterprise success, H2O.ai is there to help. H2O Enterprise Support provides the services you need to optimize your investments in people and technology to deliver on your AI vision. H2O Enterprise Support includes training, a dedicated account manager, 24/7 support, accelerated issue resolution and direct enhancement requests. Enterprise support also gives you access to H2O experts in data science, the H2O platform and DevOps/production deployment to accelerate and expand your adoption of AI.

Related Case Study

Ben Teeuwen,
Senior Data Scientist, Booking.com

"We use H2O and Spark for our bigger models. It is easy for us to deploy models with H2O."
