
By: Erin LeDell
In September, H2O.ai released a new open source software project for GPU machine learning called H2O4GPU. The initial release (blog post here) included a Python module with a scikit-learn compatible API, which allows it to be used as a drop-in replacement for scikit-learn with support for GPUs on selected (and ever-growing) algorithms. We are proud to announce that the same collection of GPU algorithms is now available in R, and the h2o4gpu R package is available on CRAN.
The R package uses RStudio’s reticulate package to access Python libraries from R. Reticulate embeds a Python session within your R session, enabling seamless, high-performance interoperability. It was originally created by RStudio to bring the TensorFlow Python library to R.
This is exciting news for the R community, as h2o4gpu is the first machine learning package that brings together a diverse collection of supervised and unsupervised GPU-powered algorithms in a unified interface. The initial collection of algorithms includes:
- Random Forest, Gradient Boosting Machine (GBM), Generalized Linear Models (GLM) with Elastic Net regularization
- K-Means, Principal Component Analysis (PCA), Truncated SVD
h2o4gpu has a functional interface. This differs from many modeling packages in R (including the h2o package); however, functional interfaces are becoming increasingly popular in the R ecosystem.
Here’s an example of how to specify a Random Forest classification model with a non-default value for the max_depth parameter:
model <- h2o4gpu.random_forest_classifier(max_depth = 10L)
To train the model, you simply pipe the model object to a fit() function, which takes the training data as arguments. Once the model is trained, we pipe the model to the predict() function to generate predictions.
Here is a quick demo of how to train, predict and evaluate an H2O4GPU model using the Iris dataset.
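A minimal sketch of that workflow, assuming h2o4gpu and its Python backend are installed and a compatible GPU is available. The constructor, fit() and predict() are the functions described above; the data preparation, the explicit magrittr import and the base-R error calculation are illustrative assumptions rather than the package's prescribed method.

```r
# Sketch only: requires the h2o4gpu R package, its Python backend, and a GPU.
library(h2o4gpu)
library(magrittr)  # loaded explicitly so that %>% is available (assumption)

# Prepare data: four numeric features and an integer-coded response.
# Assumption: every column passed to fit(), including the response, is numeric.
x <- iris[1:4]
y <- as.integer(iris$Species)

# Initialize the classifier and train it by piping the model object to fit()
model <- h2o4gpu.random_forest_classifier(max_depth = 10L) %>% fit(x, y)

# Generate predictions by piping the trained model to predict()
pred <- model %>% predict(x)

# Classification error: fraction of predictions that differ from the labels
mean(pred != y)
```

Because the predictions are made on the same rows used for training, the value printed at the end is a training error and will be optimistic; hold out a test set for an honest estimate.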
Detailed installation instructions and a comprehensive tutorial are available in the package vignette, so we encourage you to visit the vignette to get started.
H2O4GPU is a new project under active development and we are looking for contributors! If you find a bug, please check that we have not already fixed the issue in the bleeding-edge version, and then check that we do not already have an issue opened for this topic. If not, then please file a new GitHub issue with a reproducible example.
- Here is the main GitHub repo. If you like the package, please star the repo on GitHub!
- If you’re looking to contribute, check out the CONTRIBUTING.md file.
- All open issues that are specific to the R package are here.
- All open issues are here.
Thanks for checking out our new package!
— Navdeep Gill, Erin LeDell, and Yuan Tang