# Stacked Ensembles and Word2Vec now available in H2O!

By H2O.ai Team | February 08, 2017

Prepared by: Erin LeDell and Navdeep Gill

## Stacked Ensembles

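R:

```r
ensemble <- h2o.stackedEnsemble(x = x, y = y, training_frame = train, base_models = my_models)
```

Python:

```python
from h2o.estimators.stackedensemble import H2OStackedEnsembleEstimator

# my_models holds the previously trained base models
ensemble = H2OStackedEnsembleEstimator(base_models=my_models)
ensemble.train(x=x, y=y, training_frame=train)
```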


Full R and Python code examples are available on the Stacked Ensembles docs page. Kagglers rejoice!
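The snippets above take `my_models` as given. As a rough sketch of how such a list might be prepared in Python: to our understanding of the H2O documentation, base models need to be cross-validated with identical fold settings and must keep their cross-validation predictions so the ensemble can learn from them. The helper names `shared_cv_params` and `train_stacked_ensemble`, and the choice of a GBM plus a random forest as base learners, are illustrative assumptions rather than part of the release.

```python
def shared_cv_params(nfolds=5, seed=1):
    """Cross-validation settings shared by every base model (hypothetical helper)."""
    return {
        "nfolds": nfolds,
        "fold_assignment": "Modulo",                # same folds for all base models
        "keep_cross_validation_predictions": True,  # kept for the ensemble to train on
        "seed": seed,
    }


def train_stacked_ensemble(train, x, y):
    """Train two base models and stack them; `train` is an H2OFrame."""
    # Imported inside the function so the helper above can be used
    # without H2O installed; calling this function needs a running cluster.
    from h2o.estimators.gbm import H2OGradientBoostingEstimator
    from h2o.estimators.random_forest import H2ORandomForestEstimator
    from h2o.estimators.stackedensemble import H2OStackedEnsembleEstimator

    cv = shared_cv_params()

    gbm = H2OGradientBoostingEstimator(**cv)
    gbm.train(x=x, y=y, training_frame=train)

    rf = H2ORandomForestEstimator(**cv)
    rf.train(x=x, y=y, training_frame=train)

    # Pass the base models by model ID
    my_models = [gbm.model_id, rf.model_id]
    ensemble = H2OStackedEnsembleEstimator(base_models=my_models)
    ensemble.train(x=x, y=y, training_frame=train)
    return ensemble
```

After `h2o.init()` and loading a frame, this would be called as `train_stacked_ensemble(train, x, y)`, with `x` the list of predictor column names and `y` the response column name.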

## Word2Vec

### Technical Details

H2O’s Word2Vec is based on the skip-gram model. The training objective of skip-gram is to learn word vector representations that are good at predicting a word’s context in the same sentence. Mathematically, given a sequence of training words $w_1, w_2, \dots, w_T$, the objective of the skip-gram model is to maximize the average log-likelihood
$$\frac{1}{T} \sum_{t = 1}^{T}\sum_{-k \le j \le k,\, j \neq 0} \log p(w_{t+j} | w_t)$$
where $k$ is the size of the training window.
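To make the objective concrete, here is a minimal sketch in plain Python that enumerates the (center word, context word) pairs inside a window of size $k$ and averages their log-probabilities over the $T$ words. It is illustrative only and not H2O's implementation: the function names are hypothetical, the sketch skips $j = 0$ (a word is not treated as its own context) and positions that fall outside the sentence, and `log_prob` stands in for whatever model supplies $\log p(w_{t+j} | w_t)$.

```python
import math


def skipgram_pairs(words, k):
    """Return (center, context) pairs for every position t and offset j in [-k, k], j != 0."""
    pairs = []
    for t, center in enumerate(words):
        for j in range(-k, k + 1):
            # Skip the word itself and offsets that leave the sentence
            if j == 0 or not 0 <= t + j < len(words):
                continue
            pairs.append((center, words[t + j]))
    return pairs


def average_log_likelihood(words, k, log_prob):
    """Average over T of the summed log p(context | center) terms."""
    total = sum(log_prob(context, center) for center, context in skipgram_pairs(words, k))
    return total / len(words)


sentence = ["the", "quick", "brown", "fox"]
pairs = skipgram_pairs(sentence, k=1)

# Placeholder model: every word in a 4-word vocabulary is equally likely
uniform_log_prob = lambda context, center: math.log(1.0 / 4)
objective = average_log_likelihood(sentence, 1, uniform_log_prob)
```

Training would adjust the word vectors so that this average increases; the uniform model here only serves to exercise the bookkeeping.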
In the skip-gram model, every word $w$ is associated with two vectors $u_w$ and $v_w$, which are the vector representations of $w$ as a word and as a context, respectively. The probability of correctly predicting word $w_i$ given word $w_j$ is determined by the softmax model, which is
$$p(w_i | w_j ) = \frac{\exp(u_{w_i}^{\top}v_{w_j})}{\sum_{l=1}^{V} \exp(u_l^{\top}v_{w_j})}$$
where $V$ is the vocabulary size.
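As a small illustration of the softmax formula, the sketch below evaluates $p(w_i | w_j)$ for a toy vocabulary of three words. The vectors are made-up numbers, the names `u_vecs` and `v_vecs` are hypothetical, and subtracting the maximum score before exponentiating is a standard numerical-stability step that leaves the result unchanged.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax_prob(i, j, u_vecs, v_vecs):
    """p(w_i | w_j) = exp(u_i . v_j) / sum_l exp(u_l . v_j)."""
    scores = [dot(u_l, v_vecs[j]) for u_l in u_vecs]  # one score per vocabulary word
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]        # shifted for numerical stability
    return exps[i] / sum(exps)


# Toy 2-dimensional vectors for a vocabulary of V = 3 words (illustrative values)
u_vecs = [[0.1, 0.3], [0.0, -0.2], [0.5, 0.1]]
v_vecs = [[0.2, 0.1], [-0.3, 0.4], [0.0, 0.6]]

probs = [softmax_prob(i, 1, u_vecs, v_vecs) for i in range(len(u_vecs))]
```

Note that every call scores all $V$ words in the denominator, so the work per probability grows linearly with the vocabulary size.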
The skip-gram model with softmax is expensive because the cost of computing $\log p(w_i | w_j)$ is proportional to $V$, which can easily be on the order of millions. To speed up training of Word2Vec, we used hierarchical softmax, which reduced the complexity of computing $\log p(w_i | w_j)$ to $O(\log(V))$.
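The idea behind hierarchical softmax can be sketched as follows: words sit at the leaves of a binary tree, and the probability of a word is the product of left/right decisions along the path from the root to its leaf, so only about $\log_2 V$ inner nodes are touched per word. This is a simplified illustration, not H2O's implementation. It assumes a balanced tree over a vocabulary whose size is a power of two (word2vec implementations typically build a Huffman tree from word frequencies), and all names and vector values are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def path_to_root(word, vocab_size):
    """(inner node, went_right) steps from a word's leaf up to the root.

    Heap layout: inner nodes are 1..V-1, the leaf for `word` is V + word.
    """
    node = vocab_size + word
    path = []
    while node > 1:
        parent = node // 2
        path.append((parent, node % 2))  # odd child index means right branch
        node = parent
    return path


def hs_prob(word, v_wj, inner_vecs, vocab_size):
    """p(word | w_j) as a product of binary decisions along the tree path."""
    p = 1.0
    for parent, went_right in path_to_root(word, vocab_size):
        score = sum(a * b for a, b in zip(inner_vecs[parent], v_wj))
        # sigmoid(score) + sigmoid(-score) == 1, so each node splits its mass in two
        p *= sigmoid(-score) if went_right else sigmoid(score)
    return p


random.seed(0)
V, dim = 8, 4
inner_vecs = {n: [random.uniform(-1, 1) for _ in range(dim)] for n in range(1, V)}
v_wj = [random.uniform(-1, 1) for _ in range(dim)]

probs = [hs_prob(w, v_wj, inner_vecs, V) for w in range(V)]
```

With $V = 8$, each probability multiplies three sigmoid terms instead of normalizing over all eight words, and the probabilities over the vocabulary still sum to one.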

## Tverberg Release (H2O 3.10.3.4)

Below is a detailed list of all the items that are part of the Tverberg release.

List of New Features:

- PUBDEV-2058 – Implement word2vec in h2o (To use this feature in R, please visit this demo)
- PUBDEV-3635 – Ability to Select Columns for PDP computation in Flow (With this enhancement, users will be able to select in Flow which features/columns to render Partial Dependence Plots for; R and Python already support this. Known issue PUBDEV-3782: when nbins < categorical levels, PDP won’t compute. Please also visit this post.)
- PUBDEV-3881 – Add PCA Estimator documentation to Python API Docs
- PUBDEV-3902 – Documentation: Add information about Azure support to H2O User Guide (Beta)
- PUBDEV-3739 – StackedEnsemble: put ensemble creation into the back end.

List of Improvements:

- PUBDEV-3989 – Decrease size of h2o.jar
- PUBDEV-3257 – Documentation: As a K-Means user, I want to be able to better understand the parameters
- PUBDEV-3741 – StackedEnsemble: add tests in R and Python to ensure that a StackedEnsemble performs at least as well as the base_models
- PUBDEV-3857 – Clean up the generated Python docs
- PUBDEV-3895 – Filter H2OFrame on pandas dates and time (python)
- PUBDEV-3912 – Provide way to specify context_path via Python/R h2o.init methods
- PUBDEV-3933 – Modify gen_R.py for Stacked Ensemble
- PUBDEV-3972 – Add Stacked Ensemble code examples to Python docstrings

List of Bugs:

- PUBDEV-2464 – Using asfactor() in Python client cannot allocate to a variable
- PUBDEV-3111 – R API’s h2o.interaction() does not use destination_frame argument
- PUBDEV-3694 – Errors with PCA on wide data for pca_method = GramSVD which is the default
- PUBDEV-3742 – StackedEnsemble should work for regression
- PUBDEV-3865 – h2o gbm: for an unseen categorical level, discrepancy in predictions when scoring using h2o vs pojo/mojo
- PUBDEV-3883 – Negative indexing for H2OFrame is buggy in R API
- PUBDEV-3894 – Relational operators don’t work properly with time columns.
- PUBDEV-3966 – java.lang.AssertionError when using h2o.makeGLMModel
- PUBDEV-3835 – Standard Errors in GLM: calculating and showing specifically when called
- PUBDEV-3965 – Importing data in python returns error – TypeError: expected string or bytes-like object

Hotfix: Remove StackedEnsemble from Flow UI. Training is only supported from the Python and R interfaces. Viewing is supported in the Flow UI.