September 27th, 2019

Predicting Failures from Sensor Data using AI/ML — Part 2

RSS icon RSS Category: H2O Driverless AI, Recipes, Technical

This is Part 2 of the blog post series and continuation of the original post, Predicting Failures from Sensor Data using AI/ML — Part 1.

Missing Values & Data Imbalance

One of the things to note is that the hard-disk data set has a lot of missing values across its columns. Check out the Missing Data Heat Map on the training data set — Derived from Auto-Viz in Driverless AI. From the picture below, one can tell that a majority of sensor data is missing or incomplete – the red color in the aggregated chart indicates missing data. Where it’s incomplete, one can easily guess, it might be that not all hard-disk vendors agree to generate sensor data for a S.M.A.R.T sensor variable.

When I tried to build a base AI/ML model in Driverless AI, I got a notification that it automatically dropped these 19 columns because of empty or constant values.

I also got this notification:

Driverless AI 1.7.x does sampling by default for imbalanced data for every iteration to achieve good overall accuracy.

IID Model or Time Series AI/ML Model?

We can build an IID (Independent and Identically Distributed) model and treat all the rows as independent. We can also treat the data as time series, as sensor data is available for each model/serial # every day. I will build a Time Series Classification model in this blog. Since the data is highly imbalanced and it’s a binary classification experiment, I’ll use LogLoss as the scorer to optimize the model. We can always check AUC, AUCPR, etc., once it’s done.

I set a high value of 8/8 for Accuracy (Model tuning effort) and Time (>> Iterations/Early stopping only after 20 iterations) and set Interpretability to 5 (Medium Feature Engineering). Driverless AI can automatically do the shift detection and drop overfitting columns automatically using some AUC threshold – but I disabled the shift detection in Expert Settings.

What is Shift Detection? Detecting distribution shifts of variables between training and test sets can avoid overfitting in models. Driverless AI by default does this to prevent overfitting. It’s an optional feature, and you can always turn it off, like I did here.

I set the Target Column to “failure”, the time column to “date”, and the forecast horizon to “7 days”. The assumption is that the data center can find the spare drive and get it ready to replace it within 7 days, before the failure happens. You can also change this for 1,15, 30 days etc ., – whatever days you want to forecast that suits the predictive maintenance task at hand.

The plan says that from 67 columns, it is going to build 4K features with 8 features being picked for the final pipeline after 208 iterations of model tuning/feature engineering. There is more on the screen that you can read.

BYOR Transformers

Driverless AI allows users to add custom feature engineering or models to the evolutionary model/feature finding process. It’s using a feature called “Bring Your Own Recipe” or BYOR. I uploaded the following feature recipes from this GitHub location, so the experiment can try using it and see if it adds value in feature transformations (besides the default ones):

Final Results with Time Series Model

After running for several hours on a single GPU box, Driverless AI built 2.4K models on 4K features and shows you the final result!

While the AUC on the training/validation is 0.9713, the AUCPR looks much more reasonable with a 0.26 score, given the 1:1000 imbalance in the training data set. This 0.26 value is for the micro-averaged value across cross-validation results on the training data set.

Clearly the BYOR recipes such as FRST[N]CHARCVTE come out on top at different positions. The initial characters of the hard-disk model name are a great feature that gives us good predictability, it would seem. The default feature engineering in Driverless AI, such as Cross ValidatedTarget Encoding, Numeric to Categorical Target Encoding, is useful in the prediction.

We can also see that the original columns, such as:

  • smrt_241_totl_lbas_read
  • smrt_193_rprtd_uncrrctble_sctr_cnt
  • smrt_187_rprtd_uncrrctble_errs
  • smrt_5_realloc_sector_cnt
  • smrt_12_power_cycl_cnt
  • smrt_7_seek_error_rate

etc., are either appearing on their own or getting feature engineered in interesting ways to create derived features to maximize the prediction score. It’s not surprising that the logical blocks read from hard-disk and the errors reported by the smart sensor are correlated to a hard-disk failure!

So How Accurate Was the Model When It Predicted the Test Data Set?

Even though the training AUCPR was pretty impressive, the test set confusion matrix is a bit more realistic on what you can expect in production deployment.

I didn’t spend a lot of time (but plan to in near future) here — I probably did some 10–15 models and arbitrarily split the training/test without giving it much thought. You can, however, see for the high data imbalance of 1:1000 failures in the training data set, we are predicting 8 out of a total of 31 hard-disk failures in the test data set, which is roughly 25% failures, with a ~ 50x false positives. The above # is based on a threshold value of prediction probability. Changing that can get us the desired tradeoff between false positives/false negatives, etc. It’s also interesting to note that we are predicting 99.99% correctly on rows that don’t indicate failures correctly, which is not surprising how a majority class generally dominates in predictability with imbalanced data.

What I Did Not Do Yet:

There are a lot of Expert Settings options in Driverless AI that give us control over Imbalanced Sampling, how to generate Hold Out Predictions, adding more algorithms, vendor-specific feature engineering recipes, etc. I forgot to remove the Date Transformers – the model might overfit on that, as the day of the week, etc., is showing up in feature importance currently. So the model can be definitely improved over time with additional tweaks  —  the goal being to reduce false negatives first with an additional goal of lowering false positives.

Citation: The Hard-disk failure data used in this blog post and the previous one is from


About the Author

Karthik Guruswamy

Karthik is a Principal Pre-sales Solutions Architect with H2O. In his role, Karthik works with customers to define, architect and deploy H2O’s AI solutions in production to bring AI/ML initiatives to fruition.

Karthik is a “business first” data scientist. His expertise and passion have always been around building game-changing solutions - by using an eclectic combination of algorithms, drawn from different domains. He has published 50+ blogs on “all things data science” in Linked-in, Forbes and Medium publishing platforms over the years for the business audience and speaks in vendor data science conferences. He also holds multiple patents around Desktop Virtualization, Ad networks and was a co-founding member of two startups in silicon valley.

Leave a Reply

H2O LLM DataStudio Part II: Convert Documents to QA Pairs for fine tuning of LLMs

Convert unstructured datasets to Question-answer pairs required for LLM fine-tuning and other downstream tasks with

September 22, 2023 - by Genevieve Richards, Tarique Hussain and Shivam Bansal
Building a Fraud Detection Model with H2O AI Cloud

In a previous article[1], we discussed how machine learning could be harnessed to mitigate fraud.

July 28, 2023 - by Asghar Ghorbani
A Look at the UniformRobust Method for Histogram Type

Tree-based algorithms, especially Gradient Boosting Machines (GBM's), are one of the most popular algorithms used.

July 25, 2023 - by Hannah Tillman and Megan Kurka
H2O LLM EvalGPT: A Comprehensive Tool for Evaluating Large Language Models

In an era where Large Language Models (LLMs) are rapidly gaining traction for diverse applications,

July 19, 2023 - by Srinivas Neppalli, Abhay Singhal and Michal Malohlava
Testing Large Language Model (LLM) Vulnerabilities Using Adversarial Attacks

Adversarial analysis seeks to explain a machine learning model by understanding locally what changes need

July 19, 2023 - by Kim Montgomery, Pramit Choudhary and Michal Malohlava
Reducing False Positives in Financial Transactions with AutoML

In an increasingly digital world, combating financial fraud is a high-stakes game. However, the systems

July 14, 2023 - by Asghar Ghorbani

Ready to see the platform in action?

Make data and AI deliver meaningful and significant value to your organization with our state-of-the-art AI platform.