tl;dr: H2O and LiblineaR have nearly identical predictive performance.
In this blog, we examine the single-node implementations of L2-regularized logistic regression (LR) by H2O and LiblineaR.
Both LiblineaR and H2O are driven from the R console on the same hardware and evaluated on the same datasets. We compare the regression coefficients and the predictive behavior (AUC, Precision, Recall, F1) on held-out data. Before diving into the performance comparison, let’s discuss some of the differences between the two packages.
Whooa… there shouldn’t be any modeling differences, right? Well, no, but there can be subtle implementation differences! Here we explain a few of the implementation details of H2O’s GLM and LiblineaR’s.
While we don’t focus on the distributed aspects of H2O here, it should be acknowledged that H2O’s GLM returns results as if the model were built on a single machine, retaining the higher-quality single-machine answer. H2O’s GLM uses Stephen Boyd’s ADMM solver, allows any combination of L1 and L2 regularization, performs automatic factor expansion (easily handling factors with thousands of levels), supports cross-validation, and can optionally perform a grid search over the parameters. H2O’s GLM also reports a variety of model evaluation metrics: AUC, AIC, error, by-class error, and deviances.
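For reference, H2O’s GLM minimizes the penalized objective in the usual elastic-net form (our notation, writing the per-observation negative log-likelihood as $$L$$; this is the standard formulation, so check H2O’s documentation for the exact scaling):

$$\min_{\beta}\; \frac{1}{n}\sum_{i=1}^{n} L\!\left(y_i,\, x_i^{\top}\beta\right) \;+\; \lambda\left(\alpha\,\lVert\beta\rVert_1 + \frac{1-\alpha}{2}\,\lVert\beta\rVert_2^2\right)$$

Setting $$\alpha = 0$$, as in every model below, leaves a pure L2 (ridge) penalty.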
How does H2O distribute GLM?
H2O builds the Gram matrix in a parallel and distributed way. The algorithm is essentially a two-step iterative process: build a Gram matrix, solve for the betas, and repeat until the betas converge. In a distributed setting with N nodes, each node computes a Gram matrix over its own data. The per-node Grams are then reduced (summed) together, and the result is bit-for-bit identical to computing everything locally. If you want more, here are some slides on what we implemented: http://www.slideshare.net/mobile/0xdata/glm-talk-tomas, and here is the implementation in our git: https://github.com/0xdata/h2o/tree/master/src/main/java/hex/glm
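To make the map-reduce structure concrete, here is a minimal single-process R sketch (plain R, not H2O code) that mimics the per-node Gram computation followed by the summation reduce:

```r
# Simulate a design matrix and split its rows into 4 chunks,
# standing in for the data local to each of 4 nodes.
set.seed(42)
X <- matrix(rnorm(1000 * 5), ncol = 5)
chunk_idx <- split(seq_len(nrow(X)), rep(1:4, each = 250))

# Map: each "node" builds the Gram matrix (X'X) over its own rows.
local_grams <- lapply(chunk_idx, function(i) crossprod(X[i, , drop = FALSE]))

# Reduce: matrix addition combines the per-node Grams.
gram <- Reduce(`+`, local_grams)

# Numerically equal to the Gram computed over all rows at once
# (H2O's claim is stronger: a fixed reduction order makes it bit-for-bit identical).
all.equal(gram, crossprod(X))  # TRUE
```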
LiblineaR is an R interface to LIBLINEAR, an open-source C/C++ library for large-scale linear classification. It is discussed extensively elsewhere [pdf], but we point out that it too has grid search capabilities and cross-validation.
In order to make fair comparisons, we match the input parameters between H2O and LiblineaR. Note that the cost parameter in LiblineaR is inversely proportional to the lambda used in H2O, scaled by the number of features in the model:
$$C = \cfrac{1}{(\ell \times \lambda)}$$
where $$C$$ is the cost parameter in LiblineaR, $$\ell$$ is the number of features, and $$\lambda$$ is the shrinkage parameter.
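As a quick sanity check, the conversion is a one-liner in R (a trivial helper of our own, not part of either package):

```r
# Cost in LiblineaR corresponding to a given H2O lambda,
# per the formula above: C = 1 / (ell * lambda).
lambda_to_cost <- function(ell, lambda) 1 / (ell * lambda)

lambda_to_cost(ell = 3, lambda = 1 / 300)  # 100, the cost used for the airlines models
```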
All comparisons were performed on a single machine with the following attributes (from /proc/cpuinfo):
processor : 31
vendor_id : GenuineIntel
cpu family : 6
model : 45
model name : Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
stepping : 7
microcode : 0x710
cpu MHz : 1200.000
cache size : 20480 KB
physical id : 1
siblings : 16
core id : 7
cpu cores : 8
apicid : 47
initial apicid : 47
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 5199.90
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
We used R version 3.0.2 “Frisbee Sailing” to interface with both LiblineaR (version 1.93) and H2O (build 1064).
Driving H2O from within R is easy! Check out this blog post: http://0xdata.com/blog/2013/08/run-h2o-from-within-r/, these slides from a recent meetup on the subject: http://0xdata.com/blog/2013/08/big-data-science-in-h2o-with-r/, and, of course, the documentation: http://docs.0xdata.com/Ruser/Rwrapper.html
We used 3 datasets: Prostate, Sample Airlines (years 1987 – 2008), and Full Airlines (years 1987 – 2013). These data are publicly available to download (links at the end of this post). The parameters and models built on these datasets are as follows:
| | Prostate | Sample Airlines (’87 – ’08) | Full Airlines (’87 – ’13) |
| --- | --- | --- | --- |
| Features in Model | 6 | 3 | 3 |
| Number of Training Instances | 306 | 24,442 | 128,654,471 |
| Number of Testing Instances | 76 | 2,692 | 14,290,947 |
Prostate parameters:

| H2O | LiblineaR |
| --- | --- |
| family = binomial | type = 0 |
| link = logit | .. |
| lambda = 1 / 700 | cost = 100 |
| alpha = 0.0 | .. |
| beta_epsilon = 1E-4 | epsilon = 1E-4 |
| nfolds = 1 | cross = 0 |
Sample Airlines parameters:

| H2O | LiblineaR |
| --- | --- |
| family = binomial | type = 0 |
| link = logit | .. |
| lambda = 0.0033333 | cost = 100 |
| alpha = 0.0 | .. |
| beta_epsilon = 1E-4 | epsilon = 1E-4 |
| nfolds = 1 | cross = 0 |
Full Airlines parameters:

| H2O | LiblineaR |
| --- | --- |
| family = binomial | type = 0 |
| link = logit | .. |
| lambda = 0.0033333 | cost = 100 |
| alpha = 0.0 | .. |
| beta_epsilon = 1E-4 | epsilon = 1E-4 |
| nfolds = 1 | cross = 0 |
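Put together, the matched calls look roughly like the sketch below. The LiblineaR call follows its documented interface (older releases name the response argument labels rather than target); the h2o.glm argument names have shifted across H2O versions, so treat them as an approximation and consult the documentation for your installed build. The file path is illustrative, and the column names are taken from the Prostate results below.

```r
library(LiblineaR)
library(h2o)

# --- LiblineaR: type = 0 is L2-regularized logistic regression ---
# x: numeric feature matrix; y: binary response vector (assumed already loaded).
ll_fit <- LiblineaR(data = x, target = y, type = 0,
                    cost = 100, epsilon = 1e-4, cross = 0)

# --- H2O: binomial GLM with a pure L2 penalty (alpha = 0) ---
h2o.init()
train_hex <- h2o.importFile(path = "prostate_train.csv")  # illustrative path
h2o_fit <- h2o.glm(x = c("AGE", "DPROS", "DCAPS", "PSA", "VOL", "GLEASON"),
                   y = "CAPSULE",             # response column in the prostate data
                   training_frame = train_hex,
                   family = "binomial", lambda = 1 / 700,
                   alpha = 0, beta_epsilon = 1e-4, nfolds = 1)
```

Predictions on the hold-out sets then come from predict(ll_fit, newx) and h2o.predict(h2o_fit, test_hex), respectively.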
Prostate results:

| Betas | AGE | DPROS | DCAPS | PSA | VOL | GLEASON | INTERCEPT |
| --- | --- | --- | --- | --- | --- | --- | --- |
| H2O | -0.06725409 | 0.5742158 | 0.1369673 | 0.4041241 | -0.2270453 | 1.170544 | -0.4930266 |
| LiblineaR | 0.06878511 | -0.582572 | -0.1335687 | -0.4056746 | 0.2309275 | -1.197098 | 0.4969579 |
Mean relative difference: 0.01601093 (note the flipped signs: LiblineaR evidently treated the opposite class as the positive one, since LIBLINEAR orders class labels by their first appearance in the training data, so its betas are the negation of H2O’s; the mean relative difference is computed after aligning signs).
| Test Evaluation | AUC | Precision | Recall | F1 Score |
| --- | --- | --- | --- | --- |
| H2O | 0.6907796 | 0.7608696 | 0.7608696 | 0.7608696 |
| LiblineaR | 0.6907796 | 0.7608696 | 0.7608696 | 0.7608696 |
Sample Airlines results:

| Betas | DepTime | ArrTime | Distance | Intercept |
| --- | --- | --- | --- | --- |
| H2O | 0.29061806 | -0.027987806 | 0.1360023 | 0.19251044 |
| LiblineaR | 0.29585398 | -0.032675851 | 0.1373844 | 0.19258853 |
Mean relative difference: 0.01759207
| Test Evaluation | AUC | Precision | Recall | F1 Score |
| --- | --- | --- | --- | --- |
| H2O | 0.57245362 | 0.48479869 | 0.54078827 | 0.51126516 |
| LiblineaR | 0.56406416 | 0.35743632 | 0.56274256 | 0.43718593 |
Full Airlines results:

| Betas | DepTime | ArrTime | Distance | Intercept |
| --- | --- | --- | --- | --- |
| H2O | 0.3736 | 0.0233 | 0.1317 | -0.3933 |
| LiblineaR | 0.377 | 0.0209 | 0.132 | -0.393 |
Mean relative difference: 0.006942185
| Test Evaluation | AUC | Precision | Recall | F1 Score |
| --- | --- | --- | --- | --- |
| H2O | 0.587 | 0.527 | 0.686 | 0.596 |
| LiblineaR | 0.552 | 0.841 | 0.625 | 0.717 |
We can see that H2O and LiblineaR do not vary much from one another: the coefficients have a small mean relative difference of roughly 1 – 2%. We would typically expect the objective functions being minimized to match exactly while allowing for small differences in the fitted coefficients (here the betas generally agree to within $$10^{-3}$$). What we emphasize is the similarity in predictive power: the AUCs above are all nearly identical.
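For reference, the threshold-based metrics in the tables above can be computed from predicted and true labels as follows (a generic sketch of the standard definitions; the actual evaluation code is in the R scripts linked below):

```r
# Precision, recall, and F1 for a binary classifier, given 0/1 vectors
# of true labels and predicted labels.
precision_recall_f1 <- function(truth, pred) {
  tp <- sum(pred == 1 & truth == 1)  # true positives
  fp <- sum(pred == 1 & truth == 0)  # false positives
  fn <- sum(pred == 0 & truth == 1)  # false negatives
  precision <- tp / (tp + fp)
  recall    <- tp / (tp + fn)
  f1        <- 2 * precision * recall / (precision + recall)
  c(precision = precision, recall = recall, F1 = f1)
}

precision_recall_f1(truth = c(1, 0, 1, 1, 0), pred = c(1, 0, 0, 1, 1))
```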
It would be informative to add a third reference implementation (e.g., glmnet) to bolster the comparisons. As a first stab at comparing H2O and LiblineaR, this post is by no means complete; we will continue to add suitable datasets to the comparison and report runtime benchmarks as well.
Additionally, we have skipped over a couple of obvious things: no categorical features were used, and the models aren’t very good. For this comparison we stripped things down to the bare minimum and studied non-categorical data only (expanding categoricals for LiblineaR is something we will tackle in the future). All modeling was done by simply fixing the cost parameter at 100 and proceeding from there (there is nothing magic about $$C = 100$$).
The data are here: https://s3.amazonaws.com/h2o-bench/blog-2013-10-10
And the R scripts are here: https://github.com/0xdata/h2o/tree/master/R/tests