February 5th, 2021

Data to Production Ready Models to Business Apps in Just a Few Steps

Category: Use Cases, Wave

Building a Credit Scoring Model and Business App using H2O

A successful credit scoring implementation involves multiple stakeholders and personas at different steps: business inputs, dataset procurement, data analysis, predictive machine learning, data storytelling, and dashboarding. H2O.ai platforms such as Driverless AI and H2O Wave help automate many of these steps across the project lifecycle. In this article, we will look at how to use these platforms to develop high-quality machine learning models for credit scoring and a business-facing application to consume the results.

We will use a public dataset obtained from Kaggle for this article. It contains credit details of about 250,000 borrowers along with the target column, which indicates whether they managed to repay the loan. The goal is to use this dataset to train a machine learning model that banks can use to predict the risk score of their customers. The following table shows the column descriptions.

H2O Driverless AI is an automatic machine learning platform that can be used to build trustworthy, transparent, and production-ready machine learning models for a variety of data science problems such as forecasting, regression, classification, and NLP. It is pre-configured with 150+ recipes (algorithms and techniques) for models, transformations, and scoring. In addition, users can bring their own algorithms and feature engineering code to enhance the model building process.

To develop a new credit scoring model, we first add the dataset to Driverless AI via file upload (or any data connector) in the Datasets tab. We start a new Driverless AI experiment by clicking Predict on the dataset and setting the target column to IsBadCredit. A few other optional parameters can be changed; for example, the three knobs at the bottom control Accuracy, Time, and Interpretability. These knobs can be tuned iteratively to change the modelling and feature engineering strategies. Finally, we click the LAUNCH EXPERIMENT button to start the automatic machine learning experiment.

In about 6 minutes, Driverless AI had already created 20 models on 130 features, and one can also observe the AUC score on the validation data (it performs cross-validation internally). Most of the features are generated automatically by Driverless AI, using an evolutionary technique inspired by genetic algorithms.

After about 15 minutes, the experiment finished with an AUC of 0.8662, reached by evolving both the algorithm hyperparameters (tuning) and the engineered features.
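For readers less familiar with the metric, AUC can be read as the probability that a randomly chosen bad-credit borrower receives a higher score than a randomly chosen good-credit borrower. The sketch below illustrates that definition in plain Python. It is not how Driverless AI computes the metric, and `roc_auc` is a hypothetical helper written only for this illustration.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair. This pairwise version is
    O(n^2) and meant for illustration on small samples only.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy example: 1 = bad credit, 0 = good credit.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_auc)
```

In the toy data, three of the four positive/negative pairs are ordered correctly, so the value is 0.75. An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking.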

Once the experiment completes, Driverless AI provides several valuable artifacts such as – Experiment Auto Report Documentation, Machine Learning Explainability, and Model Scoring Pipeline.

One can fine-tune this experiment with more tweaks and customizations in an iterative manner. However, if the result is acceptable to the bank, the model can now be used to generate prediction scores for any new data. Many business users prefer an interface that presents the results with summary statistics and visualizations. For a use case like this, where multiple users and personas are involved, there is a need to create an app. Let’s now look at how we can leverage H2O Wave to build a business-ready application that displays the model predictions on a new dataset along with a few visualizations.

H2O Wave is an open-source Python development framework that makes it fast and easy for users to develop real-time interactive AI apps with sophisticated visualizations. H2O Wave accelerates development with a wide variety of user-interface components and charts, including dashboard templates, dialogs, themes, widgets, and many more.

Wave is a low-code framework that allows applications to be built with minimal Python code, without any need for HTML, CSS, or JavaScript. Let’s create the Credit Scoring App, which uses the same model we trained in Driverless AI. First, we create a folder named CreditScoreApp and a Python file named app.py.

Let’s now add different cards to make up the app layout. On the homepage, we will add a header, a sidebar, and a content card.

from h2o_wave import ui

def add_header_card(box):
    # Header card showing the app title and subtitle.
    return ui.header_card(box=box,
                          icon='UserFollowed',
                          icon_color='Yellow',
                          title="Credit Scoring App",
                          subtitle="Generate Credit Score Predictions using Driverless AI")

def add_sidebar_card(box, customer_ids):
    # Choice names and labels are strings, so cast each customer ID.
    id_choices = [ui.choice(name=str(cid), label=str(cid)) for cid in customer_ids]
    return ui.form_card(box=box,
                        items=[ui.text_xl(content='Select Customer Record'),
                               ui.dropdown(name='customer_id', label='ID',
                                           choices=id_choices),
                               ui.button(name='predict', label='Generate',
                                         primary=True)])

The updated code of show_homepage will look like this:

Notice the box parameter given as the first input of every function call. The box defines the size and location of a card, using the following format: Column Row Width Height. The code we have added so far generates the following interface with three cards.
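To make the box format concrete, the sketch below builds box strings for a three-card homepage. The helper `make_box`, the `HOMEPAGE_BOXES` dictionary, and the specific positions and sizes are assumptions chosen for illustration (a 12-column grid is assumed); they are not the values used in the original app.

```python
def make_box(column, row, width, height):
    """Build a box string in the 'column row width height' format."""
    return f"{int(column)} {int(row)} {int(width)} {int(height)}"


# Hypothetical layout: a full-width header on the first row,
# a narrow sidebar on the left, and the content area beside it.
HOMEPAGE_BOXES = {
    "header": make_box(1, 1, 12, 1),
    "sidebar": make_box(1, 2, 3, 9),
    "content": make_box(4, 2, 9, 9),
}

for card_name, box in HOMEPAGE_BOXES.items():
    print(f"{card_name}: '{box}'")
```

Each string can then be passed as the box argument of a card function such as add_header_card. Keeping the layout in one dictionary makes it easier to adjust positions without touching the card code.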

Now let’s start adding cards inside the content card to make a business dashboard. We will add multiple cards in different rows. The following shows the structure of the dashboard content. We add the following sub-cards to this dashboard:

H2O Wave provides native visualizations and many pre-built templates to display stat cards, gauge cards, plots, and so on.

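from h2o_wave import data, ui

def add_bar_chart(box, title, plot_type='interval'):
    # The data buffer declares two fields, xvalue and yvalue, which the mark references.
    return ui.plot_card(box=box, title=title, data=data('xvalue yvalue'),
                        plot=ui.plot([ui.mark(type=plot_type, x='=xvalue', y='=yvalue', color='=yvalue')]))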
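The plot card above expects rows with two values, one for xvalue and one for yvalue. As a minimal sketch of how such rows might be prepared from model output, the example below bins prediction scores into equal-width ranges. The helper `score_histogram` and the choice of ten bins are assumptions for illustration and do not come from the original app.

```python
def score_histogram(scores, bins=10):
    """Count scores in [0, 1] per equal-width bin.

    Returns a list of [label, count] rows, one row per bin,
    matching a two-field layout such as 'xvalue yvalue'.
    """
    counts = [0] * bins
    for score in scores:
        # Clamp the index so a score of exactly 1.0 lands in the last bin.
        index = min(int(score * bins), bins - 1)
        counts[index] += 1
    return [
        [f"{i / bins:.1f}-{(i + 1) / bins:.1f}", count]
        for i, count in enumerate(counts)
    ]


# Toy prediction scores standing in for model output.
rows = score_histogram([0.05, 0.12, 0.18, 0.95, 1.0])
for label, count in rows:
    print(label, count)
```

Rows in this shape could then be supplied to the card's data buffer, for example through the rows argument of data(), so that each bin becomes one bar.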

The result is a Credit Scoring App that different users can access to get the credit scores of different customers.

Integrating Driverless AI and H2O Wave

At the start of the app, we want to take our test dataset and make predictions using the trained Driverless AI experiment. For this task, we will use the Driverless AI Python client. The following snippet is what I used to integrate the two platforms and make the predictions.

import driverlessai
import pandas as pd

class DriverlessPredict:
    def __init__(self, config):
        # Connect once and keep both the client and the trained experiment.
        self.dai, self.exp = self.dai_connect(config)

    def dai_connect(self, config):
        # config holds the server address, credentials, and experiment key.
        dai = driverlessai.Client(address=config['address'],
                                  username=config['username'],
                                  password=config['password'])
        exp = dai.experiments.get(config['experiment_key'])
        return dai, exp

    def dai_predict(self, input_path):
        # Upload the input file, score it with the experiment,
        # then download the predictions and load them as a dataframe.
        dai_table = self.dai.datasets.create(input_path, force=True)
        pred_path = self.exp.predict(dai_table).download('datasets/',
                                                         overwrite=True)
        return pd.read_csv(pred_path)

To make predictions on the test dataset, we first connect to the Driverless AI instance by providing the username, password, and URL address. Then we pass the dataset to the dai_predict function, which returns a new dataframe with an extra column: predictions.
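The config dictionary passed to DriverlessPredict needs the keys address, username, password, and experiment_key. One possible way to supply them without hardcoding credentials is to read them from environment variables, as sketched below. The helper `load_dai_config` and the variable names (DAI_ADDRESS and so on) are assumptions for illustration, not part of the original app or of the Driverless AI client.

```python
import os

# Hypothetical mapping from config keys to environment variable names.
CONFIG_ENV_VARS = {
    "address": "DAI_ADDRESS",
    "username": "DAI_USERNAME",
    "password": "DAI_PASSWORD",
    "experiment_key": "DAI_EXPERIMENT_KEY",
}


def load_dai_config(env=None):
    """Build the config dict from environment variables.

    `env` defaults to os.environ; a plain dict can be passed for testing.
    Raises KeyError listing every variable that is missing.
    """
    env = os.environ if env is None else env
    missing = [var for var in CONFIG_ENV_VARS.values() if var not in env]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {key: env[var] for key, var in CONFIG_ENV_VARS.items()}


# Example with placeholder values instead of real credentials.
example_config = load_dai_config({
    "DAI_ADDRESS": "http://localhost:12345",
    "DAI_USERNAME": "user",
    "DAI_PASSWORD": "secret",
    "DAI_EXPERIMENT_KEY": "example-key",
})
print(sorted(example_config))
```

The resulting dictionary has the shape that dai_connect reads, so it could be passed directly as DriverlessPredict(load_dai_config()).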

Summary

In this article, we looked at how to create a machine learning model in Driverless AI. We then used the model to create a business-facing application using H2O Wave. The source code for this application is hosted here.

About the Author

Shivam Bansal

Shivam is a 3x Kaggle Grandmaster, a five-time winner of Kaggle’s Analytics / Data Science for Good Competition, and the winner of several other offline and online competitions. He holds a master's degree from the National University of Singapore, where he was valedictorian. He has extensive cross-industry, hands-on experience in building data science products and applications. He brings a strong blend of technical and business skills with a practical, solution-driven approach. He supports various functions within the company, including engineering, pre-sales, and customer success. His LinkedIn profile can be found here.
