H2O.ai Blog
Agents | Building your first Agent step-by-step with h2oGPTe & LLM Chains
Fine Tuning The H2O Danube2 LLM for The Singlish Language
Singlish is an informal version of English spoken in Singapore. The primary variations lie in the style and structure of the text and the inclusion of elements of Chinese and Malay. Though Singlish is the common tongue in Singapore, it isn’t well defined or formalized. We fine-tuned H2O.ai’s Danube-2 1.8B LLM on Singlish instruction data, wi...
Announcing H2O Danube 2: The next generation of Small Language Models from H2O.ai
A new series of Small Language Models from H2O.ai, released under Apache 2.0 and ready to be fine-tuned for your specific needs to run offline and with a smaller footprint. Why Small Language Models? Like most decisions in AI and tech, the decision of which Language Model to use for your production use cases comes down to trade-offs. ...
Boosting LLMs to New Heights with Retrieval Augmented Generation
Businesses today can make leaps and bounds to revolutionize the way things are done with the use of Large Language Models (LLMs). LLMs are widely used by businesses today to automate certain tasks and create internal or customer-facing chatbots that boost efficiency. Challenges with dynamic adaption of LLMs As with any new hyped-up thi...
A Look at the UniformRobust Method for Histogram Type
Tree-based algorithms, especially Gradient Boosting Machines (GBMs), are among the most popular algorithms in use. They often outperform linear models and neural networks on tabular data since they use a boosted approach where each tree built works to fix the error of the previous tree. As the model trains, it is continuously self-corr...
Testing Large Language Model (LLM) Vulnerabilities Using Adversarial Attacks
Adversarial analysis seeks to explain a machine learning model by understanding locally what changes need to be made to the input to change a model’s outcome. Depending on the context, adversarial results could be used as attacks, in which a change is made to trick a model into reaching a different outcome. Or they could be used as an exp...
H2O LLM EvalGPT: A Comprehensive Tool for Evaluating Large Language Models
In an era where Large Language Models (LLMs) are rapidly gaining traction for diverse applications, the need for comprehensive evaluation and comparison of these models has never been more critical. At H2O.ai, our commitment to democratizing AI is deeply ingrained in our ethos, and in this spirit, we are thrilled to introduce our innovati...
Reducing False Positives in Financial Transactions with AutoML
In an increasingly digital world, combating financial fraud is a high-stakes game. However, the systems we deploy to safeguard ourselves are raising too many false alarms, with over 90% of fraud alerts being false positives. These false positives, not only frustrating for consumers but also costly for financial institutions, can eclipse t...
Winner's Insight: Navigating the Parkinson's Disease Prediction Challenge with AI
Parkinson’s disease, a condition affecting movement, cognition, and sleep, is escalating rapidly. By 2037, it is projected that around 1.6 million U.S. residents will be confronting this disease, resulting in significant societal and economic challenges. Studies have hinted that disruptions in proteins or peptides could be instrumental in...
H2O.ai and Snowflake Enable Developers to Train, Deploy, and Score Containerized Software Without Compromising Data Security
H2O.ai today announced its participation as a launch partner for Snowflake’s Snowpark Container Services (available in private preview), which provides our joint customers with the flexibility to train, deploy, and score models all within their Snowflake account. This further expands the ease of use for data science teams to create machin...
H2O Releases 3.40.0.1 and 3.42.0.1
Our new major releases of H2O are packed with new features and fixes! Some of the major highlights of these releases are the new Decision Tree algorithm, the added ability to grid over Infogram, an upgrade to the version of XGBoost and an improvement to its speed, the completion of the maximum likelihood dispersion parameter and its expan...
Generating LLM Powered Apps using H2O LLM AppStudio – Part1: Sketch2App
sketch2app is an application that lets users instantly convert sketches to fully functional AI applications. This blog is Part 1 of the LLM AppStudio Blog Series and introduces sketch2app. The H2O.ai team is dedicated to democratizing AI and making it accessible to everyone. One of the focus areas of our team is to simplify the adoption of...
H2O LLM DataStudio: Streamlining Data Curation and Data Preparation for LLMs related tasks
A no-code application and toolkit to streamline data preparation tasks related to Large Language Models (LLMs). H2O LLM DataStudio is a no-code application designed to streamline data preparation tasks specifically for Large Language Models (LLMs). It offers a comprehensive range of preprocessing and preparation functions such as text cl...
Enhancing H2O Model Validation App with h2oGPT Integration
As machine learning practitioners, we’re always on the lookout for innovative ways to streamline and enhance our processes. What if we could integrate the power of language models into our workflows, especially in the critical phase of model validation? Imagine running validation procedures, interpreting results, or even troubleshooting i...
Democratization of LLMs
Every organization needs to own its GPT as simply as we need to own our data, algorithms and models. H2O LLM Studio democratizes LLMs for everyone allowing customers, communities and individuals to fine-tune large open source LLMs like h2oGPT and others on their own private data and on their servers. Every nation, state and city needs it...
Building the World's Best Open-Source Large Language Model: H2O.ai's Journey
At H2O.ai, we pride ourselves on developing world-class Machine Learning, Deep Learning, and AI platforms. We released H2O, the most widely used open-source distributed and scalable machine learning platform, before XGBoost, TensorFlow and PyTorch existed. H2O.ai is home to over 25 Kaggle grandmasters, including the current #1. In 2017, w...
Effortless Fine-Tuning of Large Language Models with Open-Source H2O LLM Studio
While the pace at which Large Language Models (LLMs) have been driving breakthroughs is remarkable, these pre-trained models may not always be tailored to specific domains. Fine-tuning — the process of adapting a pre-trained language model to a specific task or domain—plays a critical role in NLP applications. However, fine-tuning can be ...
How Horse Racing Predictions with H2O.ai Saved a Local Insurance Company $8M a Year
In this Technical Track session at H2O World Sydney 2022, SimplyAI’s Chief Data Scientist Matthew Foster explains his journey with machine learning and how applying the H2O framework resulted in significant success on and off the race track. Matthew Foster: I’m Matthew Foster, the Chief Data Scientist for SimplyAI. So, I’m going t...
Improving Search Query Accuracy: A Beginner's Guide to Text Regression with H2O Hydrogen Torch
Although search engines are vital to our daily lives, they need help understanding complex user queries. Search engines rely on natural language processing (NLP) to understand the intent behind a user’s query and return relevant results. By formulating a well-formed question, users can provide more precise and specific information about w...
Explaining models built in H2O-3 — Part 1
Machine Learning explainability refers to understanding and interpreting the decisions and predictions made by a machine learning model. Explainability is crucial for ensuring the trustworthiness and transparency of machine learning models, particularly in high-stakes situations where the consequences of incorrect predictions can be signi...
H2O.ai at NeurIPS 2022
H2O.ai is proud to participate in the 36th Conference on Neural Information Processing Systems (NeurIPS) 2022, one of the biggest and most prestigious international conferences in artificial intelligence. NeurIPS 2022 will be a Hybrid Conference from Monday, November 28th through Friday, December 9th, with an in-person event at the New Or...
A Brief Overview of AI Governance for Responsible Machine Learning Systems
Our paper “A Brief Overview of AI Governance for Responsible Machine Learning Systems” was recently accepted to the Trustworthy and Socially Responsible Machine Learning (TSRML) workshop at NeurIPS 2022 (New Orleans). In this paper, we discuss the framework and value of AI Governance for organizations of all sizes, across all industries a...
Three Keys to Ethical Artificial Intelligence in Your Organization
There’s certainly been no shortage of examples of AI gone bad over the past few years–enough to give everyone pause on how (and if) this technology can truly be used for good. If it’s not Facebook selling data of its users, it’s self-driving cars from Uber that can’t recognize pedestrians in time to slow down or stop. So while the uses ...
Make with H2O.ai Recap: Validation Scheme Best Practices
Data Scientist and Kaggle Grandmaster, Dmitry Gordeev, presented at the Make with H2O.ai session on validation scheme best practices, our second accuracy masterclass. The session covered key concepts, different validation methods, data leaks, practical examples, and validation and ensembling. Key Concepts While the validation topics cove...
Make with H2O.ai Recap: Getting Started with H2O Document AI
Product Owner, Data Scientist, and Kaggle Grandmaster, Mark Landry presented at the Make with H2O.ai session on getting started with H2O Document AI. The session covered an overview of H2O Document AI, a tool to extract insights and automate document processing. The session also included a product demo, looking at documents as data sets...
Improving Manufacturing Quality with H2O.ai and Snowflake
Manufacturers are rapidly expanding their machine learning use cases by leveraging the deep integration between Snowflake’s Data Cloud and the H2O AI Cloud. Many current manufacturing quality checks require that sensor data and image data be processed and analyzed separately. Standard tooling presents challenges in storing and referencin...
Data Science with H2O.ai: An Introduction to Machine Learning and Predictive Modeling
Our own Jonathan Farland recently recorded a talk about machine learning and predictive modeling. In his talk, Jon also gave an overview of open source H2O and H2O AI Cloud. This video is a great resource for getting up to speed with the latest technology from H2O in half an hour. Some of you may prefer to go through the slides while l...
Tackling Illegal, Unreported, and Unregulated (IUU) Fishing with AI
According to a report by the High-Level Panel for a Sustainable Ocean Economy, it is estimated that illegal, unreported, and unregulated (IUU) fishing accounts for 20 percent of the seafood and up to 50 percent in some areas. These activities not only affect the marine ecosystem but, in a way, are linked to climate change on the planet a...
Unsupervised Learning Metrics
That which is measured improves – Karl Pearson, Mathematician. Almost everyone has heard of accuracy, precision, and recall – the most common metrics for supervised learning. But not as many people know the metrics for unsupervised learning. So, in this article, we will take you through the most common methods and how to implement th...
A Quick Introduction to PyTorch: Using Deep Learning for Stock Price Prediction
Torch is a scalable and efficient deep learning framework. It offers flexibility and speed to build large scale applications. It also includes a wide range of libraries for developing speech, image, and video-based applications. The basic building block of Torch is called a tensor. All the operations defined in Torch use a tensor. Ok, l...
Introducing H2O Hydrogen Torch: A No-code Deep Learning Framework
Over and over again we heard from customers, “deep learning is cool, but it’s hard and time consuming.” They kept asking “could someone just make it easier?” In typical “Maker” fashion, you ask, we deliver, H2O Hydrogen Torch. H2O Hydrogen Torch is a new product that enables data scientists and developers to train and deploy state-of-t...
An Introduction to Unsupervised Machine Learning
There are three major branches of machine learning (ML): supervised, unsupervised, and reinforcement. Supervised learning makes up the bulk of the models businesses use, and reinforcement learning is behind front-page-news-AI such as AlphaGo. We believe unsupervised learning is the unsung hero of the three, and in this article, we brea...
MLB Player Digital Engagement Forecasting
Are you a baseball fan? If so, you may notice that things are heating up right now as the Major League Baseball (MLB) World Series between the Houston Astros and Atlanta Braves is tied at 1-1. (Figure: MLB Postseason 2021 Results as of October 28; source.) This also reminded me of the MLB Player Digital Engagement Forecasting competition in which my coll...
Improving NLP Model Performance with Context-Aware Feature Extraction
I would like to share with you a simple yet very effective trick to improve feature engineering for text analytics. After reading this article, you will be able to follow the exact steps and try it yourself using our H2O AI Cloud. First of all, let’s have a look at the off-the-shelf natural language processing (NLP) recipes in H2O Driver...
AI-Driven Predictive Maintenance with H2O AI Cloud
According to a study conducted by Wall Street Journal, unplanned downtime costs industrial manufacturers an estimated $50 billion annually. Forty-two percent of this unplanned downtime can be attributed to equipment failure alone. These downtimes can cause unnecessary delays and, as a result, affect the business. A better and superior al...
How Much is My Property Worth?
Note: this is a guest blog post by Jaafar Almusaad. How Much is My Property Worth? This is the million-dollar question – both figuratively and literally. Traditionally, qualified property valuers are tasked to answer this question. It’s a lengthy and costly process, but more critically, it’s inconsistent and largely subjective. Mind you, ...
What it takes to become a World No 1 on Kaggle
In conversation with Guanshuo Xu: A Data Scientist, Kaggle Competitions Grandmaster, and a Ph.D. in Electrical Engineering. In this series of interviews, I present the stories of established Data Scientists and Kaggle Grandmasters at H2O.ai, who share their journey, inspirations, and accomplishments. The intention behind these interviews...
Unwrap Deep Neural Networks Using H2O Wave and Aletheia for Interpretability and Diagnostics
The use cases and the impact of machine learning can be observed clearly in almost every industry and in applications such as drug discovery and patient data analysis, fraud detection, customer engagement, and workflow optimization. The impact of leveraging AI is clear and understood by the business; however, AI systems are also seen as b...
Shapley summary plots: the latest addition to H2O.ai’s Explainability arsenal
It is impossible to deploy successful AI models without taking into account or analyzing the risk element involved. Model overfitting, perpetuating historical human bias, and data drift are some of the concerns that need to be taken care of before putting the models into production. At H2O.ai, explainability is an integral part of our ML ...
Safer Sailing with AI
In the last week, the world watched as responders tried to free a cargo ship that had gone aground in the Suez Canal. This incident blocked traffic through a waterway that is critical for commerce. While the location was an unusual one, ship collisions, allisions, and groundings are not uncommon. With all the technology that mariners hav...
H2O AI Cloud: Democratizing AI for Every Person and Every Organization
Harnessing AI’s true potential by enabling every employee, customer, and citizen with sophisticated AI technology and easy-to-use AI applications. Democratization is an essential step in the development of AI, and AutoML technologies lie at the heart of it. AutoML tools have played a pivotal role in transforming the way we consume an...
New Improvements in H2O 3.32.0.2
There is a new minor release of H2O that introduces two useful improvements to our XGBoost integration: interaction constraints and feature interactions. Interaction Constraints: Feature interaction constraints allow users to decide which variables are allowed to interact and which are not. Potential benefits: Better predictive performance...
Grandmaster Series: The inspiring journey of the ‘Beluga’ of Kaggle World 🐋
In conversation with Gábor Fodor: A Data Scientist at H2O.ai and a Kaggle Competitions’ Grandmaster. In this series of interviews, I present the stories of established Data Scientists and Kaggle Grandmasters at H2O.ai, who share their journey, inspirations, and accomplishments. These interviews are intended to motivate and encourage othe...
Myths and Truths About AutoML
All the revolutions we have had so far, both technological and industrial, share one thing: they are tied to the way human beings deal with machines. Previously, processes were carried out very manually and, over time, they underwent a natural evolution toward automation. With machine lear...
Maximizing your Value from AI
Some organizations have already identified the benefits that can be gained from Artificial Intelligence and Data Science, bringing in talented resources to enable them to build AI models and solutions. But more often than not, the business doesn’t understand the capabilities and huge potential of AI well enough, nor the investments that a...
The Importance of Explainable AI
This blog post was written by Nick Patience, Co-Founder & Research Director, AI Applications & Platforms at 451 Research, a part of S&P Global Market Intelligence. From its inception in the mid-twentieth century, AI technology has come a long way. What was once purely the topic of science fiction and academic discussion is now...
Building an AI Aware Organization
Responsible AI is paramount when we think about models that impact humans, either directly or indirectly. All the models that are making decisions about people, be that about creditworthiness, insurance claims, HR functions, and even self-driving cars, have a huge impact on humans. We recently hosted James Orton, Parul Pandey, and Sudala...
Making AI a Reality
This blog post focuses on the content discussed in more depth in the free ebook “Practical Advice for Making AI Part of Your Company’s Future”. Do you want to make AI a part of your company? You can’t just mandate AI. But you can lead by example. All too often, especially in companies new to AI and machine learning, team leaders may be ta...
The Challenges and Benefits of AutoML
Machine Learning and Artificial Intelligence have revolutionized how organizations are utilizing their data. AutoML or Automatic Machine Learning automates and improves the end-to-end data science process. This includes everything from cleaning the data, engineering features, tuning the model, explaining the model, and deploying it into p...
Empowering Snowflake Users with AI using SQL
At H2O.ai we work with many enterprise customers, all the way from Fortune 500 giants to small startups. What we heard from all these customers as they embark on their data science and machine learning journey is the need to capture and manage more data cost-effectively, and the ability to share that data across their organization to mak...
3 Ways to Ensure Responsible AI Tools are Effective
Since we began our journey making tools for explainable AI (XAI) in late 2016, we’ve learned many lessons, and often the hard way. Through headlines, we’ve seen others grapple with the difficulties of deploying AI systems too. Whether it’s: a healthcare resource allocation system that likely discriminated against millions of black peop...
Accelerating AI Transformation in Healthcare
The healthcare industry is evolving rapidly with volumes of data and increasing challenges. Early adopters of AI and machine learning in the healthcare space have embraced new data-driven initiatives and are reaping the benefits not only in terms of patient care but also in their own operations. Hospitals, physicians, and laboratories can...
5 Key Considerations for Machine Learning in Fair Lending
This month, we hosted a virtual panel with industry leaders and explainable AI experts from Discover, BLDS, and H2O.ai to discuss the considerations in using machine learning to expand access to credit fairly and transparently and the challenges of governance and regulatory compliance. The event was moderated by Sri Ambati, Founder and CE...
The Benefits of Budget Allocation with AI-driven Marketing Mix Models
Excerpt of the white paper: “The Latest in AI Technologies Reinvent Media and Marketing Analytics @ Allergan”. Authors: Akhil Sood, Associate Director @ Marketing Sciences, Allergan; Dr. Michael Proksch, Senior Director @ H2O.ai; Vijay Raghavan, Associate Vice President @ Marketing Sciences, Allergan. Introduction: The call for accountability in...
NLP Models with BERT
H2O Driverless AI 1.9 has just been released, and I am offering a series of articles on the latest innovative features of this Automated Machine Learning solution, starting with the implementation of BERT for NLP tasks. BERT, or “Bidirectional Encoder Representations from Transformers”, is considered today to be the sta...
In a World Where… AI is an Everyday Part of Business
Imagine a dramatically deep voice-over saying “In a world where…” This phrase from old movie trailers conjures up all sorts of futuristic settings, from an alien “world where the sun burns cold”, a Mad Max “world without gas” to a cyborg “world of the not too distant future”. Often the epic science fiction or futuristic stories also have a...
From GLM to GBM – Part 2
How an Economics Nobel Prize could revolutionize insurance and lending. Part 2: The Business Value of a Better Model. Introduction: In Part 1, we proposed better revenue and managing regulatory requirements with machine learning (ML). We made the first part of the argument by showing how gradient boosting machines (GBM), a type of ML, can mat...
From GLM to GBM - Part 1
How an Economics Nobel Prize could revolutionize insurance and lending. Part 1: A New Solution to an Old Problem. Introduction: Insurance and credit lending are highly regulated industries that have relied heavily on mathematical modeling for decades. In order to provide explainable results for their models, data scientists and statisticians i...
Are All Your AI and ML Models Wrong?
We are living in unprecedented times. Our society and economy are experiencing shocks beyond anything we have seen in living history. Beyond the human cost, there is a data science and machine learning elephant in the room (hopefully 2 meters away): Are your predictive models still doing the job you expect them to do? The challenge here i...
Brief Perspective on Key Terms and Ideas in Responsible AI
Introduction: As fields like explainable AI and ethical AI have continued to develop in academia and industry, we have seen a litany of new methodologies that can be applied to improve our ability to trust and understand our machine learning and deep learning models. As a result of this, we’ve seen several buzzwords emerge. In this short po...
Three Ways Data and AI is Helping Against COVID19
We are in the midst of a global crisis that epidemiologists have warned us about. As of today, 180 countries and sovereign regions have confirmed cases of patients infected with COVID19 (from here). Putting aside evidence that indicates the virulence of the disease could be much worse, the fast spread of the virus and the presence of hi...
Deploying Models to Maximise the Impact of Machine Learning — Part 1
Introduction to the 4 key pillars of considerations for model deployment (1st part of a blog series). So you have built a machine learning (ML) model which delivers a high level of accuracy and does not overfit. What value does it have now? Well, at the moment, nothing, zero, diddly squat. There is no economic value in a machine learning mo...
COVID-19: Doing Good with Data + AI
During times of severe societal strain, individuals have historically shown an inclination to offer aid and assistance. Often these sacrifices have been at great cost to life or livelihood. In other cases, the efforts have been seemingly more mundane but nevertheless still essential. The efforts of the over 10,000 women code breakers of W...
Summary of a Responsible Machine Learning Workflow
A paper resulting from a collaboration between H2O.AI and BLDS, LLC was recently published in a special “Machine Learning with Python” issue of the journal, Information (https://www.mdpi.com/2078-2489/11/3/137). In “A Responsible Machine Learning Workflow with Focus on Interpretable Models, Post-hoc Explanation, and Discrimination Testing...
It is a privilege to serve the world in its hour of need – H2O.ai response to the COVID-19 pandemic
During the COVID-19 pandemic, our world, our nations, states, counties, cities and communities face an unprecedented challenge with an urgent need to help our citizens and ultimately our national and global economy. At highest risk are senior citizens, at-risk populations (individuals with immunodeficiency, hypertension, diabetes) and our...
Health Outcomes and the Miracle of Data
In 1846, a physician named Ignatz Semmelweis, located at the Allgemeine Krankenhaus in Vienna, faced a dire healthcare crisis. He observed that the maternity ward in his own hospital (as well as those in other area hospitals) had a maternal mortality rate of over 15%. That is, one out of every six mothers who came to his hospital to give ...
Insights From the New 2020 Gartner Magic Quadrant For Cloud AI Developer Services
We are excited to be named a Visionary in the new Gartner Magic Quadrant for Cloud AI Developer Services (Feb 2020), and have been recognized for both our completeness of vision and ability to execute in the emerging market for cloud-hosted artificial intelligence (AI) services for application developers. This is the second Gartner MQ tha...
AI & ML Platforms: My Fresh Look at H2O.ai Technology
2020: A new year, a new decade, and with that, I’m taking a new and deeper look at the technology H2O.ai offers for building AI and machine learning systems. I’ve been interested in H2O.ai since its early days as a company (it was 0xdata back then) in 2014. My involvement had been only peripheral, but now I’ve begun to work with this comp...
Interview with Patrick Hall | Machine Learning, H2O.ai & Machine Learning Interpretability
In this episode of Chai Time Data Science, Sanyam Bhutani interviews Patrick Hall, Sr. Director of Product at H2O.ai. Patrick has a background in Math and has completed an MS Course in Analytics. In this interview they talk all about Patrick’s journey into ML, ML Interpretability and his journey at H2O.ai, how his work has ev...
Key Takeaways from the 2020 Gartner Magic Quadrant for Data Science and Machine Learning
We are named a Visionary in the Gartner Magic Quadrant for Data Science and Machine Learning Platforms (Feb 2020). We have been positioned furthest to the right for completeness of vision among all the vendors evaluated in the quadrant. So let’s walk you through the key strengths of our machine learning platforms. Automatic Machine Learn...
Blink: Data to AI/ML Production Pipeline Code in Just a Few Clicks
You have the data and now want to build a really really good AI/ML model and deliver to production. There are three options available today: Write the code yourself in a Jupyter notebook/R Studio etc., for training/validation and dev-ops model handoff. You decided to do the feature engineering also. Build your own features like above,...
Parallel Grid Search in H2O
H2O-3 is, at its core, a platform for distributed, in-memory computing. On top of the distributed computation platform, the machine learning algorithms are implemented. At H2O.ai, we design every operation, be it data transformation, training of machine learning models or even parsing, to utilize the distributed computation model. In ord...
The Super Bowl and Data Science: Changing the NFL with the Power of Machine Learning
Super Bowl LIV came and went. The San Francisco 49ers vs the Kansas City Chiefs. Personally, being from The Bay, I was rooting for the 49ers, but you can’t always get what you want. Whoever came out on top, though, we were all looking forward to a great game full of fantastic plays and the kind of gridiron tenacity where players lay i...
Grandmaster Series: How a Passion for Numbers Turned This Mechanical Engineer into a Kaggle Grandmaster
In conversation with Sudalai Rajkumar: A Kaggle Double Grandmaster and a Data Scientist at H2O.ai. It is rightly said that one should never seek praise. Instead, let the effort speak for itself. One of the essential traits of successful people is to never brag about their success but instead keep learning along the way. In the data science ...
Why you should care about debugging machine learning models
This blog post was originally published here. Authors: Patrick Hall and Andrew Burt For all the excitement about machine learning (ML), there are serious impediments to its widespread adoption. Not least is the broadening realization that ML models can fail. And that’s why model debugging, the art and science of understanding and fixing p...
How to Effectively Employ an AI Strategy in your Business
Artificial Intelligence has evolved from being a buzzword to a reality today. Companies with expertise in machine learning systems are looking to graduate to Artificial Intelligence-based technologies. The enterprises that do not yet have a machine learning culture are trying to devise a strategy to put one in place. Amidst t...
Scalable AutoML in H2O
Note: I’m grateful to Dr. Erin LeDell for the suggestions and corrections with the writeup. All of the images used here are from the talk’s slides. Erin LeDell’s talk was aimed at AutoML: Automated Machine Learning, broadly speaking, followed by an overview of H2O’s Open Source Project and the library. H2O AutoML provides an easy-to-use ...
Climbing the AI and ML Maturity Model Curve
AI/ML Maturity Model Curve/Steps. AI/ML maturity models are published and updated periodically by a lot of vendors. The end goal is almost always about effecting transformation and automating processes in a short period and making AI the DNA/core of the business. One of the biggest challenges for businesses today is to clearly define what succ...
How to write a Transformer Recipe for Driverless AI
What is a transformer recipe? A transformer (or feature) recipe is a collection of programmatic steps, the same steps that a data scientist would write in code to build a column transformation. The recipe makes it possible to engineer the transformer in training and in production. The transformer recipe, and recipes in general, provide a...
Novel Ways To Use Driverless AI
I am biased when I write that Driverless AI is amazing, but what’s more amazing is how I see customers using it. As a Sales Engineer, my job has been to help our customers and prospects use our flagship product. In return, they give us valuable feedback and talk about how they used it. Feedback is gold to us. Driverless AI has evolved in...
Natural Language Processing in H2O’s Driverless AI
Note: I’d like to thank Grandmaster SRK for a lot of suggestions and corrections to the writeup. Note: All images used here are from the talk. Link to the slides Link to the video Note 2: All of the discussion here is related to NLP. DriverlessAI also supports other domains that are covered in other talks and posts (releasing soon). Driv...
Takeaways from the World’s largest Kaggle Grandmaster Panel
Disclaimer: We were made aware by Kaggle of adversarial actions by one of the members of this panel. This panelist is no longer a Kaggle Grandmaster and no longer affiliated with H2O.ai as of January 10th, 2020. Personally, I’m a firm believer and fan of Kaggle and definitely look at it as the home of Data Science. ...
A Full-Time ML Role, 1 Million Blog Views, 10k Podcast Downloads: A Community Taught ML Engineer
Content originally posted in HackerNoon and Towards Data Science. 15th of October, 2019 marks a special milestone, actually quite a few milestones. So I considered sharing it in the form of a blog post, on a publication that has been home to all of my posts. The online community has been too kind to me and these blog posts have been a method ...
Make your own AI — Add Your Game to Auto-ML Models
When Features and Algorithms compete, your Business Use Case(s) wins! H2O Driverless AI is an Automatic Feature Engineering/Machine Learning platform to build AI/ML models on tabular data. Driverless AI can build supervised learning models for Time Series forecasts, Regression, Classification, etc. It supports a myriad of built-i...
H2O Driverless AI Acceleration with Intel DAAL
This week at Strata NY 2019 we will be demoing a custom recipe that incorporates the Intel Data Analytics Acceleration Library (DAAL) algorithm into Driverless AI. This blog will provide an introduction to Intel DAAL and how the Make-Your-Own-Recipe capability extends H2O Driverless AI. If you are at Strata NY 2019, stop by the Intel bo...
From Academia to Kaggle and H2O.ai: How a Physicist found love in Data Science
Learning and taking inspiration from others is always helpful. It makes even more sense in the Data Science realm, which is continuously being bombarded with new courses, MOOCs, and recommendations with every passing day. So many choices can be not only overwhelming but also perplexing at times. With this thought in mind, we bring...
Predicting Failures from Sensor Data using AI/ML— Part 1
Last updated: 08/26/19 Whether it’s healthcare, manufacturing or anything else that we depend on, personally or in business, prevention of a problem is always known to be better than cure! Classic prevention techniques involve time-based checks to see how things are progressing, positively or negatively. Time-based chec...
New Innovations in Driverless AI
What’s new in Driverless AI: We’re super excited to announce the latest release of H2O Driverless AI. This is a major release with a ton of new features and functionality. Let’s quickly dig into all of that: Make Your Own AI with Recipes for Every Use Case: In the last year, Driverless AI introduced time-series and NLP recipes to meet the...
A Maker Data Scientist’s journey: from Sudoku to Kaggle
“If you put enough smart people together in one space, good things happen.” (Erik Hersman) One of the perks of being a part of H2O.ai is that you get to work with some of the brightest minds on the planet. Here you get to closely engage with people who have a great deal of experience, as well as expertise. One such set of specialists here ar...
Detecting Sarcasm is difficult, but AI may have an answer
Recently, while shopping for a laptop bag, I stumbled upon a pretty amusing customer review: “This is the best laptop bag ever. It is so good that within two months of use, it is worthy of being used as a grocery bag.” The innate sarcasm in the review is evident as the user isn’t happy with the quality of the bag. However, as the sentence...
Mitigating Bias in AI/ML Models with Disparate Impact Analysis
Everyone understands that the biggest plus of using AI/ML models is a better automation of day-to-day business decisions, personalized customer service, enhanced user experience, waste elimination, better ROI, etc. The common question that comes up often though is — How can we be sure that the AI/ML decisions are free from bias/discrimina...
Custom Machine Learning Recipes: The ingredients for success
Last updated: 07/23/19 Machine learning is akin to cooking in several ways. A perfect dish originates from a tried-and-tested recipe, has the right combination of ingredients, and is baked at just the right temperature. Successful AI solutions work on the same principle. One needs fresh and right quality ingredients in the form of data, ...
Toward AutoML for Regulated Industry with H2O Driverless AI
Predictive models in financial services must comply with a complex regime of regulations including the Equal Credit Opportunity Act (ECOA), the Fair Credit Reporting Act (FCRA), and the Federal Reserve’s S.R. 11-7 Guidance on Model Risk Management. Among many other requirements, these and other applicable regulations stipulate predictive ...
Underwrite.ai Transforms Credit Risk Decision-Making Using AI
Determining credit has been done by traditional techniques for decades. The challenge with traditional credit underwriting is that it doesn’t take into account all of the various aspects or features of an individual’s credit ability. Underwrite.ai, a new credit startup, saw this as an opportunity to apply machine learning and AI to impro...
Can Your Machine Learning Model Be Hacked?!
I recently published a longer piece on security vulnerabilities and potential defenses for machine learning models. Here’s a synopsis. Introduction: Today it seems like there are about five major varieties of attacks against machine learning (ML) models and some general concerns and solutions of which to be aware. I’ll address them one-by-o...
H2O World Explainable Machine Learning Discussions Recap
Earlier this year, in the lead up to and during H2O World, I was lucky enough to moderate discussions around applications of explainable machine learning (ML) with industry-leading practitioners and thinkers. This post contains links to these discussions, written answers and pertinent resources for some of the most common questions asked ...
AI/ML Model Scoring - What Good Looks Like in Production
One of the main reasons why we build AI/Machine Learning models is for them to be used in production to support expert decision making. Whether your business is deciding what creatives your customers should be getting on emails or determining a product recommendation for a web page, AI/Models provide relevance/context to customers to drive ...
How to explain a model with H2O Driverless AI
The ability to explain and trust the outcome of an AI-driven business decision is now a crucial aspect of the data science journey. There are many tools in the marketplace that claim to provide transparency and interpretability around machine learning models but how does one actually explain a model? H2O Driverless AI provides robust inte...
Boosting your ROI with AutoML & Automatic Feature Engineering
If your business has started using AI/ML tools or just started to think about it, this blog is for you. Whether you are a data scientist, VP of data science or a line-of-business owner, you are probably wondering how AI will impact your organization in various ways or why your current strategies are not working somehow. If you are not ...
What is Your AI Thinking? Part 3
In the past two posts we’ve learned a little about interpretable machine learning in general. In this post, we will focus on how to accomplish interpretable machine learning using H2O Driverless AI. To review, the past two posts discussed: Exploratory data analysis (EDA) Accurate and interpretable models Global explanations Local...
Key Takeaways from the Gartner Magic Quadrant For Data Science & Machine Learning
The Gartner Magic Quadrant for Data Science and Machine Learning Platforms (Jan 2019) is out and H2O.ai has been named a Visionary. The Gartner MQ evaluates platforms that enable expert data scientists, citizen data scientists and application developers to create, deploy and manage their own advanced analytic models. H2O.ai Key Highlights...
What is Your AI Thinking? Part 2
Explaining AI to the Business Person: Welcome to part 2 of our blog series: What is Your AI Thinking? We will explore some of the most promising testing methods for enhancing trust in AI and machine learning models and systems. We will also cover the best practice of model documentation from a business and regulatory standpoint. More Techniq...
What is Your AI Thinking? Part 1
Explaining AI to the Business Person: Explainable AI is in the news, and for good reason. Financial services companies have cited the ability to explain AI-based decisions as one of the critical roadblocks to further adoption of AI for their industry. Moreover, interpretability, fairness, and transparency of data-driven decision support sy...
Celebrating our community and wins!
The last year was an amazing year at H2O.ai. We organized two H2O Worlds, gathering thousands of attendees in person and online both in New York and London. Throughout the year, we garnered multiple industry awards and honors for AI and machine learning, but our customers received awards as well for the work they are doing with our techn...
How This AI Tool Breathes New Life Into Data Science
Ask any data scientist in your workplace. Any Data Science Supervised Learning ML/AI project will go through many steps and iterations before it can be put in production. Starting with the question of “Are we solving for a regression or classification problem?” Data Collection & Curation Are there Outliers? What is the Distribu...
What does NVIDIA’s Rapids platform mean for the Data Science community?
Today NVIDIA announced the launch of the RAPIDS suite of software libraries to enable GPU acceleration for data science workflows, and we’re excited to partner with NVIDIA to bring GPU-accelerated open source technology to the machine learning and AI community. “Machine learning is transforming businesses and NVIDIA GPUs are speeding...
Automatic Feature Engineering for Text Analytics - The Latest Addition to Our Kaggle Grandmasters' Recipes
According to Kaggle’s ‘The State of Machine Learning and Data Science’ survey, text data is the second most used data type at work for data scientists. There are a lot of interesting text analytics applications like sentiment prediction, product categorization, document classification and so on. In the latest version (1.3) of our Driver...
H2O for Inexperienced Users
Some background: I am a rising senior in high school, and in the summer of 2018, I interned at H2O.ai. With no ML experience beyond Andrew Ng’s Introduction to Machine Learning course on Coursera and a couple of his deep learning courses, I initially found myself slightly overwhelmed by the variety of new algorithms H2O has to offer in both ...
Interpretability: The missing link between machine learning, healthcare, and the FDA?
Recent advances enable practitioners to break open machine learning’s “black box”. From machine learning algorithms guiding analytical tests in drug manufacture, to predictive models recommending courses of treatment, to sophisticated software that can read images better than doctors, machine learning has promised a new world of healthcar...
H2O-3 on FfDL: Bringing deep learning and machine learning closer together
This post originally appeared in the IBM Developer blog here. This post is co-authored by Animesh Singh, Nicholas Png, Tommy Li, and Vinod Iyengar. Deep learning frameworks like TensorFlow, PyTorch, Caffe, MXNet, and Chainer have reduced the effort and skills needed to train and use deep learning models. But for AI developers and data ...
How to Frame Your Business Problem for Automatic Machine Learning
Over the last several years, machine learning has become an integral part of many organizations’ decision-making at various levels. With not enough data scientists to fill the increasing demand for data-driven business processes, H2O.ai has developed a product called Driverless AI that automates several time consuming aspects of a typica...
AI in Healthcare - Redefining Patient & Physician Experiences
Register for the Meetup Here Patients, physicians, nurses, health administrators and policymakers are beneficiaries of the rapid transformations in health and life sciences. These transformations are being driven by new discoveries (etiology, therapies, and drugs/implants), market reconfiguration and consolidation, a movement to value-bas...
Democratize care with AI — AI to do AI for Healthcare
Very excited to have Prashant Natarajan (@natarpr) join us along with Sanjay Joshi on our vision to change the world of healthcare with AI. Health is wealth. And one worth saving the most. They bring invaluable domain knowledge and context to our cause. As one of our customers would like to say, Healthcare should be optimized for health...
New features in H2O 3.18
Wolpert Release (H2O 3.18): There’s a new major release of H2O and it’s packed with new features and fixes! We named this release after David Wolpert, who is famous for inventing Stacking (aka Stacked Ensembles). Stacking is a central component in H2O AutoML, so we’re very grateful for his contributions to machine learning! He is also fa...
Developing and Operationalizing H2O.ai Models with Azure
This post originally appeared here. It was authored by Daisy Deng, Software Engineer, and Abhinav Mithal, Senior Engineering Manager, at Microsoft. The focus on machine learning and artificial intelligence has soared over the past few years, even as fast, scalable and reliable ML and AI solutions are increasingly viewed as being vital to...
Happy Holidays from H2O.ai
Dear Community, Your intelligence, support and love have been the strength behind an incredible year of growth, product innovation, partnerships, investments and customer wins for H2O and AI in 2017. Thank you for answering our rallying call to democratize AI with our maker culture. Our mission to make AI ubiquitous is still fresh as da...
H2O.ai Raises $40 Million to Democratize Artificial Intelligence for the Enterprise
November 30, 2017 | Data Science, Machine Learning | H2O.ai Raises $40 Million to Democratize Artificial Intelligence for the Enterprise
H2O.ai Releases H2O4GPU, the Fastest Collection of GPU Algorithms on the Market, to Expedite Machine Learning in Python
H2O4GPU is an open-source collection of GPU solvers created by H2O.ai. It builds on the easy-to-use scikit-learn Python API and its well-tested CPU-based algorithms. It can be used as a drop-in replacement for scikit-learn with support for GPUs on selected (and ever-growing) algorithms. H2O4GPU inherits all the existing scikit-learn algor...
XGBoost in the H2O Machine Learning Platform
The new H2O release 3.10.5.1 brings a shiny new feature – integration of the powerful XGBoost library algorithm into the H2O Machine Learning Platform! XGBoost is an optimized distributed gradient boosting library designed to be highly efficient, flexible, and portable. XGBoost provides parallel tree boosting (also known as GBDT, GBM) that ...
Stacked Ensembles and Word2Vec now available in H2O!
Prepared by: Erin LeDell and Navdeep Gill Stacked Ensembles R: ensemble <- h2o.stackedEnsemble(x = x, y = y, training_frame = train, base_models = my_models) Python: ensemble = H2OStackedEnsembleEstimator(base_models=my_models) ensemble.train(x=x, y=y, training...
Compressing Zip Codes with Generalized Low Rank Models
This tutorial introduces the Generalized Low Rank Model (GLRM) [1], a new machine learning approach for reconstructing missing values and identifying important features in heterogeneous data. It demonstrates how to build a GLRM in H2O that condenses categorical information into a numeric representation, which can then be used in other mo...