October 7th, 2021

Feature Transformation with the H2O AI Cloud

Category: H2O AI Cloud

It is well known throughout the data science community that data preparation, pre-processing, and feature engineering are among the most cumbersome parts of the data science workload. So as we continue to innovate here at H2O.ai with our end-to-end automated machine learning (autoML) capabilities, we challenged ourselves to evolve the process of feature engineering into more robust feature transformation.

The H2O AI Cloud supports exploratory data analysis through automated visualizations and insights. This core functionality lets data scientists see feature information, interactions, charts, and plots so they can rapidly understand the dataset they are working with, which accelerates the discovery of signal-driving features and interactions hidden in their data. The H2O AI Cloud extracts these dynamic interactions and feature transformations automatically, while giving users the ability to turn any of them off or on as they choose, enabling full control of the evolutionary algorithm driving data intelligence.

Underneath the surface of H2O.ai’s autoML runs a proprietary evolutionary algorithm that quantitatively experiments with and tests hundreds of feature combinations and transformations to find signals in the noise. Some of the robust automatic feature engineering capabilities include:

Numeric Transformers: Interactions, Binning, Clustering, Target Encoding, Weight of Evidence, Truncated SVD, DBSCAN, t-SNE, UMAP

Time Series: Date & DateTime, Exponentially Weighted Moving Averages, Lags, Interactions, and Aggregations

Categorical: One Hot Encoding, Cross Validation Target Encoding & Numeric Encoding, Weight of Evidence

Text Transformers: BERT, BiGRU, Text CNN, CharCNN, TFIDF

Time Transformers: Dates (Days, Months, Years, Seconds, etc.), Holidays

Image Models and Transformers: Image AutoML, Image Vectorizer
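
To make a few of the listed transformations concrete, here is a minimal sketch in plain Python of three of them: lags, exponentially weighted moving averages, and cross-validation (out-of-fold) target encoding. This is an illustration of the general techniques only, not H2O.ai's implementation; the function names `lag`, `ewma`, and `cv_target_encode`, the fold assignment by row index, and the fallback to the global mean are all assumptions made for this example.

```python
from collections import defaultdict


def lag(values, k):
    """Shift a series back by k steps; the first k rows have no history."""
    return [values[i - k] if i >= k else None for i in range(len(values))]


def ewma(values, alpha):
    """Exponentially weighted moving average with smoothing factor alpha."""
    out, prev = [], None
    for v in values:
        # The first value seeds the average; later values are blended in.
        prev = v if prev is None else alpha * v + (1 - alpha) * prev
        out.append(prev)
    return out


def cv_target_encode(categories, targets, n_folds=3):
    """Out-of-fold target encoding (hypothetical helper for illustration).

    Each row is encoded with the mean target of its category computed
    only from rows in the other folds, so a row never sees its own label.
    Folds are assigned by row index modulo n_folds (an assumption).
    """
    n = len(categories)
    global_mean = sum(targets) / n
    encoded = [0.0] * n
    for fold in range(n_folds):
        sums, counts = defaultdict(float), defaultdict(int)
        # Collect per-category statistics from rows outside this fold.
        for i in range(n):
            if i % n_folds != fold:
                sums[categories[i]] += targets[i]
                counts[categories[i]] += 1
        # Encode the rows inside this fold using those statistics.
        for i in range(n):
            if i % n_folds == fold:
                c = categories[i]
                # Categories unseen outside the fold fall back to the global mean.
                encoded[i] = sums[c] / counts[c] if counts[c] else global_mean
    return encoded


if __name__ == "__main__":
    series = [1.0, 2.0, 3.0]
    print(lag(series, 1))
    print(ewma(series, 0.5))
    print(cv_target_encode(["a", "a", "b", "a", "b", "c"], [1, 0, 1, 1, 0, 1], n_folds=2))
```

Computing the encoding out of fold is what separates cross-validation target encoding from naive target encoding: because a row's own label never contributes to its encoded value, the feature is less prone to leaking the target into the model.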

The H2O AI Cloud rapidly accelerates the speed at which data scientists and data engineers can analyze and prepare a dataset for modeling, enabling them to make models with more accuracy, speed and transparency.

Learn more about the latest release of H2O AI Cloud 21.10 here.

About the Author

Benjamin Cox

Ben Cox is a Director of Product Marketing at H2O.ai, where he helps lead Responsible AI market research and thought leadership. Prior to H2O.ai, Ben held data science roles on high-profile teams at Ernst & Young, Nike, and NTT Data. Ben holds an MBA from the University of Chicago Booth School of Business with multiple analytics concentrations and a BS in Economics from the College of Charleston.
