Empowering Snowflake Users with AI using SQL

By Vinod Iyengar | October 12, 2020

At H2O.ai we work with many enterprise customers, from Fortune 500 giants to small startups. What we hear from these customers as they embark on their data science and machine learning journey is the need to capture and manage more data cost-effectively, and to share that data across their organization to make better business decisions. The cloud offers many benefits for building a data platform, but the danger of vendor lock-in always lurks around the corner. That’s why many customers are looking to Snowflake as their data platform, so they can use their choice of cloud provider for their data strategy. The same is true when customers select an automatic machine learning technology. Being able to choose the cloud infrastructure on which to run data science workloads lets customers use best-of-breed, cloud-neutral solutions that give them a competitive edge.

Making AI Accessible to Snowflake SQL Users

The challenge for many companies is how to extract more value from the data they capture and store in the Snowflake Data Cloud. Data science and machine learning are a great way to derive predictive insights from data and make better business decisions. Companies depend heavily on data scientists to extract those insights, and implementing the entire process tends to be difficult and tedious, and requires a range of different skills. The data scientist is not the only key player in that process: data engineers and analysts, who are very familiar with SQL for querying data, are involved as well. Making AI and ML available to these users in their familiar SQL environment opens up a range of new possibilities to accelerate the adoption of AI. This is why H2O.ai worked closely with Snowflake to put the power of Driverless AI at the fingertips of Snowflake users.

Figure 1: Using SQL in Snowflake for machine learning

Removing Barriers to Deploying Models in Production

Organizations depend on data ops people, as well as data engineers, to extract business value from the models that data scientists build. The whole idea behind the integration of H2O Driverless AI with Snowflake is to streamline that end-to-end machine learning process, from developing machine learning models all the way to putting those models into production and scoring new data as it is captured about customers.

Figure 2: Streamlining the ML pipeline process

The question is how much of the model development process we can automate within the ML platform. Driverless AI is all about automating the data science and machine learning tasks that speed up the creation of highly accurate models. Once a model is built, it needs to go into production, where it actually generates business value. The path from model development to model deployment introduces complicated tasks that involve other roles in addition to data scientists. Data engineers or data ops people are responsible for taking those models and ensuring they can be operationalized in a production environment.

Using Driverless AI from Within Snowflake

Let’s first talk about the common process of model building and deployment with data in a Snowflake environment. The data scientist would use the Driverless AI GUI to train a model with data imported using the Snowflake connector. That model was then deployed in a scoring engine for production use. To make predictions on new data, you had to export that data into a .csv file (or another file format) and push it into the scoring engine. The predictions made in the scoring engine then had to be written back into the Snowflake environment. Although this might seem simple and straightforward, it is a tedious and cumbersome process to set up and manage. In addition, this batch process does not lend itself to real-time scoring on fresh data for AI-enabled applications that need in-the-moment predictions.

With Snowflake introducing external functions earlier this year, H2O got an opportunity to make this whole process much more efficient. By using external functions we can make Driverless AI available as a remote service to users from within Snowflake. Driverless AI can be invoked from within Snowflake to train or retrain a model, automatically deploy it as a REST server, and make it available to score new data. All of this is executed with familiar SQL statements and commands from within Snowflake. With external functions, there is no longer any need to export data from Snowflake in order to score it. By calling the function in SQL from the Snowflake user interface, it is now possible to update tables with predictions directly in Snowflake.
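
To make the external-function mechanism more concrete, here is a minimal Python sketch of what the remote scoring side might look like. It relies on the JSON format Snowflake uses for external functions: a batch of rows arrives as {"data": [[row_number, arg1, arg2, ...], ...]} and the service replies with {"data": [[row_number, result], ...]}. Everything else is an assumption made for illustration: the names handle_request and predict_one are hypothetical, and predict_one is a placeholder that averages its inputs rather than a real Driverless AI model.

```python
import json


def predict_one(features):
    """Hypothetical stand-in for a deployed model.

    Assumption: a real service would call the model's scoring pipeline here.
    This placeholder simply averages the numeric inputs.
    """
    numeric = [float(x) for x in features if isinstance(x, (int, float))]
    return round(sum(numeric) / len(numeric), 4) if numeric else 0.0


def handle_request(body: str) -> str:
    """Score one batch sent by a Snowflake external function call.

    Each incoming row starts with a row number followed by the function
    arguments; each outgoing row echoes the row number with its result.
    """
    payload = json.loads(body)
    results = []
    for row in payload["data"]:
        row_number, *features = row
        results.append([row_number, predict_one(features)])
    return json.dumps({"data": results})


if __name__ == "__main__":
    # Two example rows, as Snowflake would batch them in a single request.
    request = json.dumps({"data": [[0, 1.0, 3.0], [1, 10, 20]]})
    print(handle_request(request))
```

On the Snowflake side, a service like this would be registered with CREATE EXTERNAL FUNCTION through an API integration, after which it can be called in a SELECT or UPDATE statement like any other SQL function. Echoing the row number back is what lets Snowflake match each prediction to the row that produced it.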

Figure 3: Using external functions to make predictions in Snowflake

The integration of H2O Driverless AI with Snowflake using external functions puts automatic machine learning at the fingertips of every Snowflake user, including data engineers and data analysts. They no longer need to learn a new technology platform to use the full power of ML to extract meaningful insights from their data. The result is a more efficient, flexible and cost-effective machine learning process that will accelerate the adoption of AI.

To learn more, visit our Snowflake page at: https://www.h2o.ai/partner/snowflake/

Vinod Iyengar, VP of Products

Vinod Iyengar is the Vice President of Product at H2O.ai. He leads a team charged with product management and product development across the H2O.ai platform. Vinod has worked for H2O.ai since 2015. In his time with the company, he has worked as the VP of marketing & technical alliances, and VP of customer success & product. Vinod received his bachelor’s degree in engineering from the University of Mumbai and his master’s degree in quantitative analysis from the University of Cincinnati College of Business.

Yves Laurent

Yves has over 20 years of experience in building partner and channel go-to-market strategies for leading technology companies. He started his career at Cisco Systems, where he held various sales and marketing leadership positions across EMEA, APAC and the US. Before joining H2O he led partner marketing at Denodo and Hortonworks, where his focus was on ensuring partner success through partner programs that align with business objectives. In his spare time he enjoys the outdoors with his family and friends.