Overview of H2O.ai
The H2O AI Cloud is a state-of-the-art artificial intelligence (AI) cloud platform that enables data scientists, analysts, and developers to easily make, operate and innovate with AI in order to solve business problems. In this webinar, we'll cover each of our cloud products and why data scientists and analysts are using our AI Cloud to speed up experimentation, tackle deep learning programs, and remove operational constraints that take time away from making AI.
2:35 Who is H2O.ai
5:44 Why AI Clouds
11:46 What is the H2O AI Cloud
13:06 H2O-3 (Open Source Distributed Machine Learning)
15:10 H2O Driverless AI (Award-Winning AutoML)
18:08 H2O Hydrogen Torch (No-Code Deep Learning)
21:01 H2O Document AI (Intelligent Document Models)
24:46 H2O Wave (Low-Code AI AppDev Framework)
25:41 H2O Wave Demo
29:22 H2O AI Feature Store
32:34 H2O MLOps (Model Registry, Deployment, Scoring and Monitoring)
35:17 H2O AI AppStore
36:40 Flexible Infrastructure, Fully Managed Cloud and Hybrid Cloud
39:09 Snowflake Integration
42:41 H2O AI Cloud Demo
Welcome, everybody, to the Make with H2O.ai webcast program. This is the first one we're rolling out, and we're covering the H2O AI Cloud today. I'm Read Maloney, I run marketing for H2O. Vinod, my illustrious colleague, is going to introduce himself.
Hey, this is Vinod, Vice President of Products at H2O. Excited to be here and to kick off this new series.
Absolutely, we've got a ton for you. Today we're going to dive in. If you know us really well, from the open source or from using our AI Cloud, we have a lot of new innovation to share. We're going to run through everything that's available in the H2O AI Cloud as part of the overview today. And Vinod's got a couple of demos planned that we'll be hitting along the way.
We also want to get a lot of feedback, not just from this session, but from the program in general, from our broader community that's out there. So we have a few polls that we ask you to participate in. It'd be great for us to keep getting that information, because we want to provide the information that you need, and we want to provide the platform, the functionality and the features that you need to be really successful in your day to day. So that's another big part of the goal of this overall program: not just education, but listening and getting that feedback back.
As part of that, there is both a chat and a Q&A window. If you can put your questions in the Q&A, it helps us track whether we've answered them. And we do plan on leaving some time at the end of the webcast today to answer those questions.
All right, so with that we're going to start with the first poll, which should pop up on your screen in just a moment. There you go. You can just quickly answer which products are your favorites, to give us a sense of what you know out there and what we should dive into. We also have some sessions following this one that will go deeper into some of the specific products and some overall technology topics. I'll give everyone just another minute here. All right.
We definitely have a mix going on, with Driverless AI in there. H2O-3 is sort of the top. There's Hydrogen Torch in there too, awesome. Good to see that usage. Let's move on. Okay, great.
So H2O is really a community-powered company. We have over 200,000 community members in companies using H2O, and that's led to really strong enterprise adoption, where we have over 200 of the Fortune 500 using H2O on a day-to-day basis. A lot of our growth has been driven by us interacting with the community, listening to data scientists, hearing what they need, and then layering in some of the top data science talent to help us on the features or products that we're developing. We have some of the world's top Kaggle grandmasters, and we use that expertise within the products that we build: the recipes, the applications, some of the things we're going to be talking about today that go into having really optimized solutions that can be used to accelerate a whole set of use cases.
And so we have use cases across multiple industries. We've worked on hundreds and thousands of use cases across financial services, healthcare and life sciences, telecom, marketing, and retail, with customers like AT&T and, if you were watching the videos at the start, CBA and Wells Fargo in financial services. We've done things from preventative maintenance, all the way to fraud, and then all the way back on the marketing side to how you do spend optimization and personalization as you bring all that together.
And so, regardless of what the industry or use case is, across the set of products that we have available with our AI Cloud, we've been able to jump in with our customers and have the technology to help them very rapidly solve these really important business problems to drive value. When we think about value within those use cases, we have customers that made $100 million a year or more just from increased trading profits in how they're doing bond trading. In insurance underwriting, $20 million in annual savings. iPhone fraud, nearly a billion dollars of savings. Collusion fraud, over a billion dollars of savings. These are huge numbers. In preventative maintenance, they've taken the regularly scheduled maintenance that they would have needed to do and dropped that by 30% in overall costs. But one of the main elements is also service to their customers: they're able to increase that service by being able to predict maintenance needs ahead of time.
Targeted marketing campaigns are really about going after the right audience at the right time with the right campaign. When this is multiplied by some of the largest budgets in the world, and some of the largest digital media groups, it comes together and has just huge impact. Medical referrals is another one you might have heard UCSF talk about. We'll talk about that a little bit more today. But really, across these industries and use cases, what we're focused on is not just AI for the sake of AI but AI to drive business value. If you're a government or nonprofit organization, it's really to drive mission accomplishment.
Right. So why AI Cloud? Why is there this investment? Why are we talking about AI Cloud right now?
There are a few main reasons. One of them is that the age of AI is now. 85% of executives say they won't achieve their growth objectives without scaling AI. 71% in the States, and 75% globally, say they think they risk going out of business without being able to scale AI in the next five years. And when we look at AI clouds, what we're finding is they're helping customers move past the sticking point where they get stuck in the pilot phase. We have our own AI maturity model that you're looking at here, and the end phase is where we're helping our customers get to. We have many successful customers who are really deep into their overall phases of AI maturity. It makes them an AI company, where they're actually creating new revenue streams and lines of business off of AI because they've gone so far through that journey. But most of the market is still stuck in the pilot phase. And this happens for a whole variety of reasons.
One of the ones that has impacted me, or that I've been a part of the most, obviously on the business side, is when I get a project and they say, hey, here's the score, here's something that's predicting X. And I say, okay, well, how does it work? And then we can't have that interchange. I don't understand it, we can't explain it, and that drives a lack of business adoption. And so AI clouds are helping to solve these problems, bridging that communication divide between the technical makers and the business users who are consuming AI. We'll talk a bit today about some of the products that we built that help to bridge that divide. And then our partnership, coming in and helping customers, really takes them through this overall transformation and helps them be successful with it.
So one of the things we hear when we come to customers is, hey, we just spent nine months, and it might have been a small team of data scientists working for a few months, on an AI project that didn't end up getting used or adopted. What we see when that happens is that a data scientist starts working on a project, they're not exactly sure what the business requirements were, and they're just sort of trying to figure it out. They end up having to toil on a notebook for a long time, trying to get that model perfected over and over again. When they're trying to get that model into production, they may run into additional hurdles: maybe they have some homebuilt deployment system, and there's nothing that just easily puts it in production. And then when they take those predictions, which is what I was describing before in terms of my own interactions, and they say, hey, here's a score for this, they have trouble explaining that to the business user.
And so then it's not necessarily adopted. If that happens, you take all of these hours of work, sometimes many years of work, and they don't end up actually driving business value. What we've seen with our AI Cloud customers is different. I'm not saying every project takes two weeks; that's not true of everything. But we actually have a lot of customers who have gone from concept to adopted model, in the form of an AI application, in under two weeks. And that's been repeatable. We have numerous companies doing that.
And where it really starts now is that it can be a data scientist, but it could also be an analyst, a developer, or a data engineer, who is working with the business user from the start on a problem that's already been defined. We then have a whole set of no-code and AutoML services, and we'll go through them today, that build really highly accurate models in days, or even hours. So, as an example, they can build that model in hours, bring in the business user, show the important features that are coming out, have a discussion about that, iterate on it, and get a model tuned up very quickly that the business understands. So then when it comes to, hey, we're going to put this in production, they just one-click deploy it.
And then with a low-code application development framework that we have, H2O Wave, which we'll talk about, they can wrap that model in an application for the business consumer right off the bat. And so they're going all the way from conception to usage and consumption in a very short period of time. This is time to value, and it's one of the main reasons we see AI cloud adoption happening in all types of organizations.
Next is once you hit that scale phase of maturity. You saw that maturity curve: you've proven value in the organization and the organization starts to grow, you're adding more layers of leadership, you have executives in the data science group, and you start having models everywhere, right? So I might have some models running in Snowflake, and we'll talk a little bit more about how we support that today. I've got some development models; some of those are with H2O, and some might be outside of H2O. I've got some production models, specifically in applications or in MLflow, and I have some in H2O MLOps. They're all over, and I need a single system of record to manage that. I need to know if those models are still accurate. I need to know if there's bias that we're starting to see in models when they're in production. And then I always want to be continuously getting better, so there are these A/B and champion/challenger models where we're constantly testing: do we have a better model? And so this is another reason we see AI clouds being adopted across the organization. Time to value is the first one, and then this is management at scale, and being able to understand the entire lineage across the model lifecycle.
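As a rough illustration of the champion/challenger idea mentioned above, here is a toy Python sketch. It is not how H2O MLOps implements it. Two models score the same labeled records, and the challenger is promoted only if it beats the current champion by a margin. The function names, the use of plain accuracy as the metric, the `min_gain` margin, and the sample data are all assumptions made for illustration.

```python
from typing import Callable, Sequence

# Hypothetical model type: a function mapping a record to a 0/1 prediction.
Model = Callable[[dict], int]


def accuracy(model: Model, records: Sequence[dict], labels: Sequence[int]) -> float:
    """Fraction of records the model labels correctly."""
    correct = sum(1 for r, y in zip(records, labels) if model(r) == y)
    return correct / len(labels)


def pick_champion(champion: Model, challenger: Model,
                  records: Sequence[dict], labels: Sequence[int],
                  min_gain: float = 0.02):
    """Compare both models on the same data; promote only on a clear gain."""
    champ_acc = accuracy(champion, records, labels)
    chall_acc = accuracy(challenger, records, labels)
    winner = "challenger" if chall_acc - champ_acc >= min_gain else "champion"
    return winner, champ_acc, chall_acc


# Made-up transactions: label 1 means fraud.
records = [{"amount": a} for a in (10, 50, 120, 300, 700, 900, 1500, 40)]
labels = [0, 0, 0, 0, 1, 1, 1, 0]


def champion(record: dict) -> int:
    return int(record["amount"] > 1000)  # current production rule


def challenger(record: dict) -> int:
    return int(record["amount"] > 500)  # candidate replacement


winner, champ_acc, chall_acc = pick_champion(champion, challenger, records, labels)
print(winner, champ_acc, chall_acc)
```

In practice the comparison would run on fresh production data and on more than one metric, including the bias checks mentioned above; the margin keeps a model from being swapped in on noise alone.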
And so what this might look like in a financial services example is, hey, we're making the models, and we then make these applications. Making those applications is great, but then I put them in an app store, and I can see, per department or per line of business, the set of applications and use cases that are specific to them. So in financial services, for credit cards that might be fraud detection, and in consumer banking we might be looking at personalized recommendations. That enables this interchange again, and a closer connection between the technical makers and the business users who are consuming AI within the organization.
Okay, so what is the H2O AI Cloud? That was an overview of why we see adoption, why there is so much hype about AI in general, and why there's a push towards a better workbench, a better toolkit, across organizations. So what does ours look like, and what's in it? We organize everything we do around making AI, operating that AI, and then innovating with it, to have it consumed by the business and add value. All of that together helps us to democratize and accelerate AI within an organization while maintaining trust and confidence, so that you're not making a mistake or taking undue risk while you're making AI in a more democratized manner and at higher scale. So we really want to balance making great AI with operating it and managing it correctly, to ensure business risk is minimized.
So we're going to run through everything in here. We've got H2O-3, H2O Driverless AI, H2O Hydrogen Torch, H2O Document AI, H2O Wave, the H2O AI Feature Store, H2O MLOps, the H2O AI AppStore, and then a bunch of pre-built industry applications that we've built. We'll also talk about the flexible architecture we have, so you know where to run the H2O AI Cloud and how it can work within your existing environment.
Okay, H2O-3. This is our open source distributed machine learning. This is the product that made H2O famous. What it does an amazing job of is making distributed machine learning simple. We support a really broad range of algorithms, supervised and unsupervised, and even some text analytics, and there are native clients for both Python and R. So you can work with the familiar languages and IDEs that you have, and you can use the most popular machine learning algorithms that are out there. And then it's very easy to move that into a distributed in-memory environment. You load the data into a distributed memory pool, and then you're working on that data within that memory pool. So it's very fast, and you can scale it across multiple nodes. We've done the work to make sure that's really easy to do. What that allows you to do is work with really big data sets: there's no sampling, you're not making approximations, and you don't have to write any code to go from a single node to that cluster.
And then when you're done, the output is something we call a MOJO, a model object, optimized, that you can really run anywhere. We'll talk a little bit more about the places you can run it; we showed that slide earlier about all the places a model can live that you have to manage. But that MOJO is a deployment-ready artifact coming straight out of H2O-3. And then we also have a package for automatic machine learning within H2O-3. This is one of the reasons why it's so popular, and it's really the standard for any type of distributed machine learning that's going on across an organization. When we talk to our customers and we say, hey, are you guys using H2O, the answer is almost always yes. And we see H2O-3 used for all these large data use cases.
So Driverless AI, this is our award-winning AutoML product.
It really focuses on time, talent and trust across an organization, so that models can be built much more quickly. They're built in a way where they're already utilizing the best practices that we've learned through years and years of working with customers and optimizing models for a whole range of use cases. And then you're able to dive really deep into those models to ensure that we have trust in them when we're putting them into production, and that we can also explain them to business users.
So how does it work? Similarly, you can take data from really any source, and you load it or drag and drop it into H2O Driverless AI. It has a range of capabilities to help you understand and dive into the data, and Vinod will be showing some of that later in the demo. We do automatic visualization and many other things: you can do some EDA, it'll handle missing values, it'll show you the data shape, and it'll help you understand outliers. And then we have a whole set of recipes you can run against that, for a set of transformations and so forth, to help get the data ready for machine learning. Then, with the automatic model optimization that we do, it will iterate across thousands of possible models. It will engineer features itself and then run those back through all of these different algorithms to find the best combination of features and algorithm. And then again, it will output a model that's deployment ready and can be run anywhere. And with its integration into a product we'll cover later, H2O MLOps, you can just one-click that into deployment.
So why is H2O Driverless AI the one getting chosen for AutoML? Well, first and foremost it comes down to accuracy for a broad range of use cases. The automated feature engineering combined with our genetic algorithm is just producing really outstanding models very quickly. On the flexibility side, it's really highly extensible and configurable.
Users can completely control and augment Driverless AI by writing their own workflows, models or scorers. And the H2O models are already deployment ready; as we already talked about, they can run anywhere. On the trust side, we have a suite of automated techniques that help to explain the models. So again, you can see how it's working, and you can understand what's driving the model in a whole range of different scenarios. We also simulate scenarios ourselves, which helps to prevent overfitting, and it also checks the models for robustness. So we're not just always asking, okay, is this the most accurate model, which we could get to by overfitting. We want it to be a really strong, robust model that's built well, leveraging all the best practices of our internal expertise, and then also still highly accurate and flexible to deploy into your environment.
H2O Hydrogen Torch, this is our no-code deep learning product that we launched a few quarters ago, and we're seeing great traction with it. Now, the whole problem is that there's a lot of unstructured data, whether it's images, video, audio or text, but the talent to be able to go in, tune all the parameters and build the models in all these cases isn't aligned to the amount of data that there is. And so this is our effort to democratize deep learning, and bring deep learning to all data scientists, and even non data scientists, who will build these types of models.
Okay, so what we've done, through the expertise on our team, is optimize around a certain set of problem types, and we keep expanding this. You basically just select, hey, I'm going to be doing a classification problem for images. Then you upload the set of images that you'd be using, and we'll talk about how that works. Then you train the model, and then you can call that model as an endpoint and start using it. From a text perspective, we support classification and regression. We also support token classification, span prediction, sequence to sequence, and metric learning. On the images and video side, again, classification and regression, and then we also do object detection, semantic segmentation, and metric learning. And what's new is audio. Right now we support classification and regression for audio, and we're continuing to add to this list.
So if you go into the chat, or right now the Q&A, and let us know, hey, here are the ones that we would like you to support, that'd be great feedback to get. Okay, that's number one. Two is, if you have projects upcoming, we'd like to know what your next project looks like. Which of these are the problems that you'd be working on?
And that just helps us to focus and ensure we're delivering for you as much as possible.
Okay, what's really different? Well, number one, the no-code framework, which we talked about. You can build these really highly accurate, robust deep learning models without writing a line of code. We support a broad range of problem types, and the best practices that our grandmasters have learned from working on these types of problems for many years are built directly into those problem types. And not only do we provide optimal model parameters, we also enable customers to go in and iterate. You can still control elements of the experiment to ensure you're getting to the results, and you're still able to tune the models that you're working with. And then again, it's very easy and flexible to deploy, which you'll see is common across all of our products.
Right. Another one of the engines we have to make AI is H2O Document AI. This is all around building intelligent document models. It's a similar solution space: there's a huge number of documents out there that we want to understand, process and manage. What the product does initially is provide a user experience where you can pull in the data and label it, and then train on the labeled data, which produces a set of models. Those models are then easily integrated via APIs into other workflows, data stores and applications. And then we have also built in a way for humans to be in the loop. We'll talk a little bit about UCSF Health and what they did. If you saw the video before the session, you got even more of that story, and we have that story on our website. You'll see that they did build a human in the loop in. If there were errors coming out of the process, a human would correct them, and that would go back into the training set. They'd retrain, and the model would get better and better and better, as it's not machine learned but machine learning. This has worked very, very well for them.
That was around the very large number of faxes that they receive every year. Traditionally, optical character recognition really works for extracting information from documents where the template doesn't change and the organization controls whether the template will change. There are some use cases like that, and Document AI does a great job on those. But where it's really different is that it can handle any use case where the document varies a lot, such as medical referrals, which we'll talk about in a moment, supply chain contracts, or purchase orders, where you're going to get these documents in a lot of different formats. We can help recognize what type of document it is, even though the format is different. And then, based on what type of document it is, we decide what actions to take in terms of what information to pull and where to put it, and then some workflow elements later. That's all built into H2O Document AI, making it an amazing choice for all these use cases where the traditional methods of doing this, either manually or with OCR, are either really expensive or producing really poor results.
So at UCSF, what they said is, look, we've got 1.4 million faxes coming in each year, and we're having to deal with these manually. They had more traditional solutions, but they were having to put a lot of manual effort on top of that. What they ended up doing instead is saying, look, maybe we can do this, let's see if we can do this with H2O. As Bob said in the quote on the left, some in the industry told them that it couldn't be done.
And so we were able to work with them using Document AI, and they were able to handle a very wide variety of templates across the 1.4 million faxes coming in, find the referrals, which also had multiple different templates, and then build that straight into their workflow. So now when a doctor says, hey, we need to have this person referred to a different doctor, and they're sending that via fax, that whole process is automated. And so individuals are getting referred much more quickly. So yes, when we look at the value, we're like, oh, they saved 25,000 hours of staff time, and that's great. But we also look at the value by saying, well, now we're actually able to connect patients who need care to the doctor who can provide that care much more quickly. Those are the ways where I think AI is really, really adding societal value, and we're able to help organizations do that through the products that we're making in our AI Cloud.
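The human-in-the-loop cycle described for UCSF (a reviewer corrects an error, the correction goes back into the training set, and the model is retrained) can be sketched in a few lines of Python. This is a deliberately tiny stand-in, not Document AI's actual pipeline: the "model" is just a per-label bag of words, and the function names, labels and sample texts are all invented for illustration.

```python
from collections import Counter, defaultdict


def train(examples):
    """Build a per-label bag-of-words 'model' from (text, label) pairs."""
    model = defaultdict(Counter)
    for text, label in examples:
        model[label].update(text.lower().split())
    return model


def predict(model, text):
    """Pick the label whose training vocabulary overlaps most with the text."""
    words = text.lower().split()
    scores = {label: sum(counts[w] for w in words) for label, counts in model.items()}
    return max(scores, key=scores.get)


def review_and_retrain(examples, model, text, human_label):
    """If the reviewer disagrees with the model, store the correction and retrain."""
    if predict(model, text) != human_label:
        examples.append((text, human_label))
        model = train(examples)
    return model


# Made-up starting training set with one example per document type.
examples = [
    ("referral for patient to cardiology clinic", "referral"),
    ("invoice for payment amount due", "invoice"),
]
model = train(examples)

fax = "please schedule specialist consult, payment due at visit"
before = predict(model, fax)                          # model's first guess
model = review_and_retrain(examples, model, fax, "referral")
after = predict(model, fax)                           # guess after the correction
print(before, "->", after)
```

The point of the sketch is only the loop: the misclassified fax becomes a new training example, so the same kind of document is handled correctly the next time.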
Thank you, Read. Let me share my screen over here. I'm going to show a couple of examples of what's possible with H2O Wave. For folks who don't know about Wave, it's basically a low-code development framework. For data scientists who might be familiar with, let's say, R Shiny, or Streamlit on the Python side, this will feel similar, but maybe like a much more powerful and flexible framework to build your data-rich applications.
So, a couple of examples. The first is a hospital occupancy simulator app that we built for some of our large healthcare providers. What this allows us to do, essentially, is to look at a particular metro or county and, for that, predict the COVID cases, occupancy or admissions in the next few days or weeks. So for example, here it's looking at Alameda County. I can go and change it, let's say pick a different county, Napa County for example, and I can see the real-time predictions being updated. I can see the change in the different occupancies, length of stay, ICUs, etc. I can also very easily simulate. For example, this is the baseline rate that we know from the data, but let's say this particular hospital wants to change that, realizing that, hey, we're seeing really high admissions and occupancies, can you run a simulation for us? You can do that very easily: change the model, and the model goes and updates. So a really simple example, but a very rich application that can be used directly by hospital providers.
One other example is a transaction fraud one. This is an application that can be used by, let's say, a bank or financial services firm where they want their agents or risk analysts to be able to look at transactions coming in, flag them as fraud, not fraud, or highly suspicious, and then have them inspected. So you can see transactions as they're coming in.
For the ones with a really high probability of fraud, I can go pick a transaction and see why the model thinks it's fraud. It's giving the reason codes, for example saying that this is fraud because of these particular risks. Let me make the screen a little bigger. So I can look at the most important factors, and I can look at what the distribution is compared to the common age group and other demographic patterns. I can also go and diagnose the model. For example, I can change the type of model I want to use, I can optimize it for a different metric, and I can change the fraud threshold, let's say. And as I do that, you can see all of these get updated. The nice thing about this whole app is that a lot of the components you see are widgets that are available natively. So as a data scientist you don't have to write front-end code to do this; you can generate these rich visualizations and interactive elements very easily from the Wave toolkit itself. So hopefully you'll take it for a spin, and we look forward to seeing what you do with it. Over to you, Read.
All right, thanks, Vinod. As you can see, you can build really powerful applications very quickly with H2O Wave, and do it all in Python. The data streams in real time, it can update in real time, and it runs very quickly. You can deploy them right out of the App Store, and they can run anywhere as well, so it's very flexible in terms of the architecture.
Right, moving on. So that was everything around make: we just covered H2O-3, Driverless AI, Hydrogen Torch, Document AI and H2O Wave. Moving over to operate, we're going to cover Feature Store and MLOps next.
So the H2O AI Feature Store. One of the reasons we built this product is we kept running into customers who would say, well, we've done all this work to build these features, done all the data prep and created the feature engineering pipeline, and we now use these features in a set of models, but they're siloed and other data scientists can't find them. With the H2O AI Feature Store, we pull all that together. We call this internally the water cooler for data scientists: they come around, and they can see the features that others have built to solve similar problems around the organization, and then use those to improve their model. This is a product that we co-created with AT&T. I talk to Mark Austin, their VP of data science, pretty frequently, and he just did it again in another meeting last week, where he was talking about data scientists who had been working on a problem for a couple of years. And he's like, oh, let's just go over here, let's go shopping. He calls it shopping in the feature store.
And they went over there and spent an hour or two searching through the feature store, looking for features and looking at all the metadata that's associated with them. And sure enough, every time he comes back, he's like, we increased the accuracy of that model by 17%, we increased the accuracy of this model by 35%, we increased the accuracy of this model by 45%. On a continual basis, they're going in and finding these additional features, reusing all of this incredible work that the data scientists have already done, and applying it to other business problems that are occurring in the group. So we're really excited about this innovation for the company.
It's a composable architecture, so you can bring in streaming or batch data. We already have pre-built integrations with Driverless AI, Databricks, Snowflake, Apache Spark, and a whole set of others, so that the feature engineering pipelines you've already built will work directly with the Feature Store. If you have one where we don't have a pre-built integration, we have APIs, and you can integrate with it quite easily. And then we have an online and an offline engine, and you just connect that into the applications that you're already running. You can also use it for real-time inference.
So why the H2O AI Feature Store? Number one, scale. We built this with AT&T to handle some of the largest data sets that exist out there. They do many petabytes a day over their network, and I think everyone on the call has a good sense of how big AT&T is. It's built to handle terabyte-scale data in the offline store. On performance, the online data store is engineered for sub-millisecond latency for real-time inferencing. Security is first and foremost in the design; it's a P0. So it honors all data and access rules for full-fledged permissioning.
It also does encryption of the data and masking for sensitive features, and that's all supported natively within the product. And again, it's really flexible in terms of the clients and the feature engineering pipelines that can integrate with it.
Next is H2O MLOps, which you use for model registry, deployment, scoring and monitoring. These are core problems when organizations get to scale. Deployment needs to be seamless and very quick. But once models are deployed, organizations need to monitor and understand not just the standard software metrics, but how the model is performing: Is it drifting? Is there bias coming into the model? So there are other types of monitoring we want to use on those models. Then there's the struggle with lifecycle management: is the model maintaining accuracy? And if we've updated the app, what's the lineage going back? That relates to governance: can we see the entire lineage of how the model was built? And if we've inserted a new model from a champion/challenger perspective, how do we keep lineage connecting each version, like v1, v2, v3 of those models, all the way back? So governance becomes a really important item.
And so this is how we built H2O MLOps. It supports a whole range of frameworks, not just H2O but also the likes of PyTorch and TensorFlow, so other types of models work with H2O MLOps. We have management and registry, and we have deployment and scoring. You can deploy as a single model, an A/B test, or champion/challenger, and you can deploy for synchronous, asynchronous or batch scoring. Then you have a whole range of environments, whether that's dev, prod or some other custom environment you've built. On the monitoring side, we monitor for feature drift, accuracy and bias, plus a whole set of other operational metrics, and we let you set custom thresholds on those operational metrics to make management even easier. And the APIs we have tie in with a full set of AI applications, alerts and messaging, and other analytics tools.
Our customers are choosing MLOps because, number one, it's open and interoperable. It's integrated with a broad set of AI tools, and you can score models anywhere. You can score the models directly in MLOps, but because of the flexibility of the MOJO that you were talking about earlier, which is our model output, you could also score directly in an environment like Snowflake (we'll talk a little more about how you do this) and still govern and monitor the models from H2O MLOps. We have the widest range of deployment options, which I was just talking about. And then we have enterprise governance: end-to-end model lineage, identity and access control, and model lifecycle management, which are required for governance at scale as organizations grow in their AI endeavors.
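To make the idea of feature drift monitoring with a custom threshold concrete, here is a minimal sketch in plain Python. It uses the population stability index (PSI), a common drift measure; the webinar does not say which metric H2O MLOps uses, so the metric, the function name, the sample data and the 0.2 threshold are all assumptions for illustration.

```python
import math
from collections import Counter


def population_stability_index(baseline, live, eps=1e-6):
    """PSI between two samples of one categorical feature.

    Illustrative drift metric only; not taken from H2O MLOps.
    """
    categories = set(baseline) | set(live)
    base_counts = Counter(baseline)
    live_counts = Counter(live)
    psi = 0.0
    for category in categories:
        # Share of each category, floored at eps to avoid log(0).
        p = max(base_counts[category] / len(baseline), eps)
        q = max(live_counts[category] / len(live), eps)
        psi += (q - p) * math.log(q / p)
    return psi


# Hypothetical custom threshold; 0.2 is a common rule of thumb for PSI.
DRIFT_THRESHOLD = 0.2

# Hypothetical contract-type values seen at training time vs. scoring time.
training_contracts = ["month-to-month"] * 50 + ["one-year"] * 30 + ["two-year"] * 20
scoring_contracts = ["month-to-month"] * 80 + ["one-year"] * 15 + ["two-year"] * 5

psi = population_stability_index(training_contracts, scoring_contracts)
drifted = psi > DRIFT_THRESHOLD
print(f"PSI = {psi:.3f}, drift alert: {drifted}")
```

In a monitoring setup of this kind, a check like this would run per feature on a schedule, and a value above the threshold would raise an alert.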
Moving over to the Innovate side, the App Store enables organizations to publish, share and collaborate on AI apps to drive business value. The makers create and publish H2O Wave apps (we've already talked about H2O Wave), and the users can go and find those apps, deploy them very simply, and run and use them. So the App Store is not just for finding apps; it also helps you host and run the applications, and it provides that bridge between the AI maker and the AI user. We've also built a huge range of AI applications. There are apps that help with data science best practices, around model validation and adversarial model robustness testing. On the financial side, there are things like home insurance, churn or credit card risk, and Vinod was demoing transaction fraud. On the healthcare side, we were looking at COVID-19 forecasts and things around chest anomaly detection. In all of these areas there's a range of applications for use cases that have already been worked on, so customers can come in with a huge head start and use that in their organization to drive business value even faster.
Now, one of the key things about H2O AI Cloud is how flexible the infrastructure is. From an API perspective, we have integrations and deep partnerships with some of the companies you see on the left. We also integrate with 200-plus data sources out of the box through pre-built integrations, and via APIs you can integrate very easily with other business applications. And we just talked about how to integrate with H2O Wave specifically. We're cloud agnostic: you can run in any cloud and any Kubernetes environment, so you can also set up Kubernetes environments on premise if you're running your own cloud, and scale elastically that way as well. There are two offerings.
So we have a fully managed offering. This is where we handle the infrastructure and all of the updates, so you're always on the latest version of our products and you're just focusing on making. We handle all of the undifferentiated heavy lifting, all of the muck, to allow you to focus. That's the fully managed cloud. Some organizations want full control over their environment, and that's the hybrid cloud. They have full control over how they want to set it up, they can meet whatever their compliance needs are within that environment, and they have the choice: if they want to run in a specific cloud or on prem, they absolutely can do that.
You saw this slide earlier in a slightly different context. In terms of flexible infrastructure, because of the object that we're creating, and because of some of the partnerships and other software we've written on top of that object, you can really run and score these models just about anywhere. So there are the models running directly in MLOps, but you can also have models running in multiple places, and we're scoring those and have insight into what's going on to help from a management and governance perspective. It also allows us to do AI on the edge. We can build models that run in any Java environment and support real-time scoring with sub-millisecond latency, and the models that get built are really lightweight, so they run on a whole range of different devices.
We'll look at Snowflake for a couple of slides here. With Snowflake, you can train the model directly within the Snowflake UI, take that trained model as part of the AI Cloud, and move the scoring artifact directly into Snowflake.
And you can score directly in Snowflake. So if you're a Snowflake user, we have deep integrations with them, and we have a whole page on our website if you want to learn more and go deeper. We also have a session in a couple of weeks that Eric on our team is going to run, where we'll dive deeper into our Snowflake integration. So stay tuned for that, or go back to the Make with H2O page and register for that session if you haven't already. An example of doing the training and scoring within Snowflake is Direct Mailers. By using H2O, they saw over 60% performance gains and a 40% improvement in the utilization of their resources. And most important to them was how much faster they were able to access the data they were processing with machine learning.
We saw something similar with Futura, where Gary Walter, their CEO, said that H2O reduced their AI inference costs on Snowflake by 55x, and that the partnership and support have made the integration that much more seamless and easier to leverage. So if you're using Snowflake, again, this is something you can go try out, and I think you'll see the same type of performance and efficiency gains by being able to train within the UI and score directly in Snowflake.
Okay, we're going to pause for a brief poll, and then we're going to switch over and jump into an end-to-end demo with Vinod. Because we're running these every week and we have a lot to cover, we want to understand what you want to learn about. So if you could just take a minute and answer what you want to learn about next, we're going to focus on those topics. As for what's already coming up: we have NLP next week. I'll cover a little bit more about an Innovation Day we're running July 14. The week after that, we're going to be covering Snowflake, and then we start our accuracy masterclass. So we have a range coming, and we're going to keep adjusting based on feedback from the community about what you want to learn about, and then we'll hold these sessions. We're also going to be experimenting with formats on these sessions. We do have a lot to cover in terms of the breadth of the platform in this session, but we'll also be breaking it up and trying more discussion groups and some other things in future versions that we're working on.
Okay, great. Do we share those out? It looks like NLP is up there as one of the highest. We've got two other ones that are pretty close behind it, which are unsupervised machine learning and feature transformation. So those are the top three. NLP is the next session.
If you're on here and you haven't registered for that, please attend. We'll take note of feature transformation, but we are going to cover that in our accuracy masterclass; it's the second part of that, so it should be up in about six weeks or so. And unsupervised we need to add to the list. So thanks for that feedback. All right, Vinod.
All right. Thank you, Read. Let's go ahead and jump to the demo real quick. Let me start with the App Store. When it comes to the AI Cloud, we already talked about all the pieces that Read mentioned, and this is what you'll get when you sign in. If you go to H2O.ai slash free, or free trial, sign up for an account and log in, you'll see something like this, which is basically the App Store. It has the App Store, the cloud console, and all the AI engines that were mentioned earlier. Obviously you can navigate into the App Store, take a look at all the different apps that are available, and take one of these and start working with it. Or you can launch into the machine learning workflows, which is what we're doing today. We're just going to go through a very, very quick workflow of what it means to start from a dataset and go all the way to modeling and then deploying the model into production, right?
So H2O Drive is your place to start off. I'll click this; I have a tab open to it. Basically, you come to H2O Drive and you can see all the datasets that you have available. It's kind of your pocket S3 on the AI Cloud. You can obviously import data from a whole bunch of different sources. You have 200-plus different data sources, like we mentioned earlier, and you can also configure one of these connectors. So if you want to import your data from, let me refresh this, let's say Snowflake, which was mentioned earlier, you can add your credentials, log in and save them, and the next time you come in they'll be available over here. Once the data is in here, you can obviously go look at the datasets that are available, and we have a bunch of templates available. What I'm going to do is jump into Driverless AI. The way you can launch it is to go and launch this quickstart over here.
That will basically launch a Driverless AI instance for you. You can also launch an H2O-3 instance very easily, but we'll go to our Driverless AI instance real quick. Once you are here, this will be familiar to folks who have seen it before. If you're new to this, you'll see a bunch of tabs at the top: projects, datasets, experiments, and so on. I'm going to start with the Projects tab. Typically you set up a project, add new datasets to it, and then start your experiments; in this case, we already have a project in place. Let me just mention how you would import data into Driverless AI. You have multiple sources. The easiest way, if you're on the AI Cloud, is to hit this H2O Drive link. The datasets that we saw earlier in Drive will all show up over here, the same datasets, right? And you can import them one at a time. You can also use one of these connectors directly to pull in data. I've already put a dataset in here, so I'm not going to run that again. Once the dataset is in here, you'll see it showing up, and you can do a couple of things. You can quickly look at the details: what the data looks like, the different columns in there, the data types, and min, max, median and other summary statistics. You can also look at the raw rows themselves to see whether that makes sense and is what you were hoping to pull. In this case, we have a telco churn dataset, which is very popular. It has a bunch of information about subscribers, and the goal is to predict whether someone is going to churn or not based on their subscription patterns. There are a couple of things you can do over here: you can start visualizing, and you can predict. But before that, you can also transform this dataset. We have a couple of options to do this.
First, you can upload a data recipe. We have a whole bunch of recipes, and you can write your own; recipes are basically just Python code that you can use to transform a dataset. You can also apply an existing recipe. For example, I have a data preview recipe that I can pick. It's a very, very simple recipe: all it does is pick the first four columns and show them to me, and I can just go and look at that particular preview, right? So that's applying a transformation. Let me go back over here. I can also apply live code. This is nice because I can write Python and pandas code to transform a dataset, and once I'm done, I can save that transformation and use it in the future as well. So it's a really nice, easy way to do data transformation.
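As a rough illustration of the kind of pandas code described here, the sketch below mimics the "first four columns" preview and adds one simple derived column. It is plain pandas, not the Driverless AI recipe interface itself; the function names, column names and sample rows are made up for the example.

```python
import pandas as pd


def preview_first_columns(df: pd.DataFrame, n_columns: int = 4) -> pd.DataFrame:
    """Keep only the first n_columns columns, like the preview recipe in the demo."""
    return df.iloc[:, :n_columns].copy()


def add_charge_per_month(df: pd.DataFrame) -> pd.DataFrame:
    """Add a derived feature: total charges divided by months of tenure."""
    out = df.copy()
    # Treat zero tenure as one month to avoid dividing by zero.
    out["charge_per_month"] = out["total_charges"] / out["tenure"].clip(lower=1)
    return out


# Hypothetical rows shaped like a telco churn dataset.
subscribers = pd.DataFrame(
    {
        "customer_id": ["A1", "B2", "C3"],
        "tenure": [1, 24, 0],
        "contract": ["month-to-month", "two-year", "month-to-month"],
        "monthly_charges": [29.85, 75.0, 20.0],
        "total_charges": [29.85, 1800.0, 0.0],
        "churn": ["yes", "no", "yes"],
    }
)

preview = preview_first_columns(subscribers)
transformed = add_charge_per_month(subscribers)
print(preview)
print(transformed[["customer_id", "charge_per_month"]])
```

Both functions return a new DataFrame rather than modifying the input, which keeps a transformation safe to re-apply to the original dataset.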
Let's close that out and actually build a model. I've created a train/test split over here, so I'm going to just click that and start to build the model. When I hit Predict, I get this option to pick supervised or unsupervised, to change which type of modeling I want to do. In this case we are doing supervised learning: we're going to predict whether someone's going to churn. The target column in this case is the one at the bottom, Churn. I'm going to use this in the classic AutoML mode and let the machine figure out everything. You can see that, based on the parameters, it says, "Okay, here's what I'm going to do, here's the kind of model I'm going to run, here's the kind of feature engineering I'm going to do." It's basically building a plan for how it's going to solve this problem, right? You can obviously tweak this; you can change everything. If you go to expert settings, which I won't do today, you have a whole host of settings to tweak if you're an expert. But if you just want to let the machine do it, Driverless AI will do all the work for you, right? It picks the right accuracy, time and data settings and the right scorer for you. You can change that, of course, and pick a different scorer if you want to, and then hit run. In this case, I'm going to let the machine do all the work and just hit run, and let it start building the model. It's going to do a few things. First, it's going to check the schema and check whether there are any issues in the data. It checks for leakage in the data, drift in the data and missing values, for example. It does all that preprocessing work for you to make sure that the data is actually good, and if there are issues, it gives you warnings or goes and fixes the problems for you.
And as I said, it starts building models; you can see the first models showing up. It'll go through a whole bunch of iterations. It'll do some feature engineering: try to create new features and transform existing features to make them more suitable for machine learning, and try to eke out as much as it can, right? The goal is, after a few iterations, to get the best possible accuracy. In the interest of time, I'm going to go into an experiment I ran a little while earlier on the same dataset, this telco webinar dataset. It's the same experiment. It ran about eight or nine iterations, figured that it was not really getting any more lift, and closed the experiment. And you can start seeing what the different parameters were, right? It seems that a person's contract is probably the most important feature that determines whether someone's going to churn or not. It also tells you that whether they contacted tech support is an important factor. That's interesting, and it helps you get a good sense of the data. Now, I want to see more about this model and what it's actually doing, right? So I want to go first to the interpretation part. I can run new interpretations, but I'm just going to show the interpretations that have already been run. What Driverless AI did here is basically run 14 different interpretation techniques. These include some surrogate models and a bunch of other techniques. It's a rich set of analyses to help you find out what the model is actually doing, right? I'm just going to pick one over here.
So these are the Shapley values on the original features, and they tell me what we saw earlier: contract, tenure, internet service and online security are the top parameters that tell me whether someone is going to churn or not. The nice thing is I can actually go and pick a single row. I can say, "Give me row number 100." If I know subscriber number 100, I can go search for that particular person and see how that person's values differ from the global model, right? In this case, this person has a different value for the two-year contract and different values for all these other things, so they're very different from the average user. That also tells me why they are interesting, and I can look at whether this person is unique or not, right? I can also run disparate impact analysis, which helps me check whether my model is biased against certain sensitive features. That's really nice to see. I can say, "Hey, I want to make sure that the model is not biased against gender," right? I can quickly see that, in this case, it's telling me that the model actually isn't. So it's saying that there is no bias and the model seems to be fair, but if it were not fair, that would show up over here, and then I could take some corrective action. So that's nice. I'm just going to jump back to the experiment real quick.
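For readers unfamiliar with disparate impact analysis, the core calculation can be sketched in a few lines of plain Python. This is a generic illustration, not how Driverless AI implements it: the function names and sample predictions are invented, and the 0.8 to 1.25 band is the widely used "four-fifths" rule of thumb rather than a setting taken from the product.

```python
def positive_rate(predictions):
    """Fraction of predictions that are the positive class (1)."""
    return sum(predictions) / len(predictions)


def disparate_impact_ratio(predictions, groups, protected, reference):
    """Positive-prediction rate of the protected group over the reference group."""
    protected_preds = [p for p, g in zip(predictions, groups) if g == protected]
    reference_preds = [p for p, g in zip(predictions, groups) if g == reference]
    return positive_rate(protected_preds) / positive_rate(reference_preds)


def looks_fair(ratio, low=0.8, high=1.25):
    """Four-fifths rule of thumb: a ratio outside [low, high] suggests bias."""
    return low <= ratio <= high


# Hypothetical churn predictions (1 = predicted to churn) with a gender column.
genders = ["female"] * 5 + ["male"] * 5
balanced_predictions = [1, 0, 0, 1, 0] + [1, 0, 1, 0, 0]
skewed_predictions = [1, 1, 1, 1, 0] + [1, 0, 0, 0, 0]

balanced_ratio = disparate_impact_ratio(balanced_predictions, genders, "female", "male")
skewed_ratio = disparate_impact_ratio(skewed_predictions, genders, "female", "male")
print(balanced_ratio, looks_fair(balanced_ratio))
print(skewed_ratio, looks_fair(skewed_ratio))
```

A ratio near 1 means both groups receive positive predictions at about the same rate, which matches the "no bias" result shown in the demo; the skewed example shows what a flagged result would look like.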
One other thing you can do, which is really interesting, is look at the individual recipe. If I click this, it'll take this particular experiment and, I'm going to just jump to the preview of that, create a recipe out of it. This is Python code that the model generated, and you can use this code to run that experiment again and again, in a notebook for example, and recreate the entire setup; all the settings for Driverless AI are here. So that's really nice if you want to put it into your own CI/CD pipeline, for example.
Now, let's go on to deploying the model, and this is the key part, right? To deploy a model, it first has to be in my project. You can see this is my project: the datasets are over here, and these are the two experiments I have linked. Now let me jump to my MLOps tab. I go to MLOps and see all the projects listed over here. The project that we are most interested in is the telco churn one, which we just saw earlier, so let me go into this project. Let me make the screen a little bigger so it's visible. I can see that there are two experiments in that project, like we saw earlier, and there's only one model. Now, to register a model, I go to an experiment and click on it, and I can see that this experiment has a model. I can now register this model into MLOps, and that's a really important step, right? This allows me to say I want to bring this model in and use it for model management. I will register this as a new model and call it telco webinar model.
I'll leave the description blank and register this. Now my model is successfully registered. If I go to my Models tab, I can see both models showing up, right? This is nice: I can see the model and look at when it was registered. The third step is being able to deploy that model. I already have one deployment; I'm going to create a new one. I'm going to call it telco webinar, the same name. I'm going to pick the mode; in this case it's a single model. There are other modes as well, but I'm going to go to Select Model, right? Select Model gives me the two models that I have available, and I'm going to pick the webinar one; that's the one we're deploying. I can pick the deployment type, real time or batch; we'll pick real time. We pick the environment; in this case, I'm going to choose prod. And then I can pick a bunch of parameters, right? I can pick the security level and the runtime artifact, which is critical. In this case, I want the MOJO to be deployed; the MOJO is basically H2O's artifact optimized for scoring. Then I'm going to pick the MOJO runtime. The nice thing you can see is there's a whole bunch of different runtimes that are supported, right? We can support third-party models very easily as well, and we can support the Python scoring pipeline. So any type of model you bring into MLOps, you can deploy. I'm going to leave the security at no security in this case. Everything else is optional, so I'm going to hit Create Deployment, and now it's going to start deploying the model. As you can see, it's still pending and will take a few minutes. What it does is start spinning up a pod and putting up the REST endpoint. In the interest of time, let's pick the one that we deployed earlier and see what it looks like. You can see this model has been deployed; it was deployed earlier.
Right now there are no alerts on this. This is the endpoint I can score against. I can also get a sample curl request and go run it: I hit this curl request and I can see predictions coming through, right? Very nice. Let me close this out and get back to my AI Cloud for a second. So that's basically what you saw. Here's what we did, right? We built a model end to end with Driverless AI, and then we were able to deploy it with MLOps. We can do something similar with Hydrogen Torch as well; future sessions will cover that. We have H2O-3 coming up, so you can do that as well. All these AI engines work similarly.
I'm just going to show one last thing before we end the session, which is notebooks. This is nice: I can have a Jupyter notebook running as an IDE in my cloud. I can open up one of these notebook instances and run an end-to-end workflow; all the things we did in the GUI can be driven through the notebook. So this one notebook runs the whole thing. It allows you to create an AI engine, which is basically launching a Driverless AI engine, import the same telco churn dataset, run an AutoML experiment just like we did in the UI, then connect to MLOps, deploy that model, and view the information about the deployment. So all the things we did from the GUI you can do from the Python notebook as well, right? What this means is that you can take this Jupyter notebook or pipeline and put it into CI/CD if you want to automate this whole process or programmatically control your modeling workflow. So we basically went from data to automated models. One thing we didn't cover is the Feature Store. You can also import from Feature Store, build models, deploy and integrate.
So that's, in a very, very quick nutshell, what AI Cloud allows you to do, right? It's a very rich platform, and it's a full IDE for building models, both with a code-first interface and with a no-code interface. Data scientists, business users, data engineers and developers can all come in and collaborate very easily. It's a collaborative workflow and sharing environment, and all these apps help you achieve different objectives in different parts of your AI lifecycle. With that, I'm going to hand it back to Read to close out the session. Thank you.
All right, thanks, Vinod. Everything Vinod just showed, you can go try yourself in our free trial. In terms of doing those end-to-end workflows, we have a tutorial in there that will take you through building a churn model, deploying it, and many of the other items that Vinod just showed. But you can make with it what you will; you have open access to build and test out all the capabilities within our cloud. And then just to leave you with this, we have a bunch of upcoming events. One of the big ones is July 14, when we'll be hosting our H2O Innovation Day. So go check that out and please register. We're going to be covering everything that we've launched in the last quarter and all the new innovation. We're also going to be sitting down and talking with our CEO and a set of customers about what they're seeing and doing with H2O. And then we have a whole other set that we talked about. We've got NLP coming up next week. We have a document processing meetup in Singapore and a London meetup, and then we're going to talk about Snowflake and choosing the right metric, from the accuracy masterclass, in a couple of weeks. The week after that, we've made the decision that getting started with H2O Document AI is the focus. So stay tuned for that, watch for emails coming through about the program, and sign up for that one. I just want to thank everyone for attending today. We really appreciate it. And again, if you have questions or something comes up, let us know and we'll get it all sorted out for you. Thank you.