Gradient Boosting Machine with H2O
May 2020: Seventh Edition
Contents
| Section | Title | Page |
|---|---|---|
| 1 | Introduction | 4 |
| 2 | What is H2O? | 4 |
| 3 | Installation | 5 |
| 3.1 | Installation in R | 5 |
| 3.2 | Installation in Python | 6 |
| 3.3 | Pointing to a Different H2O Cluster | 7 |
| 3.4 | Example Code | 7 |
| 3.5 | Citation | 7 |
| 4 | Overview | 8 |
| 4.1 | Summary of Features | 8 |
| 4.2 | Theory and Framework | 9 |
| 4.3 | Distributed Trees | 10 |
| 4.4 | Treatment of Factors | 11 |
| 4.5 | Key Parameters | 12 |
| 4.5.1 | Convergence-based Early Stopping | 13 |
| 4.5.2 | Time-based Early Stopping | 13 |
| 4.5.3 | Stochastic GBM | 13 |
| 4.5.4 | Distributions and Loss Functions | 14 |
| 5 | Use Case: Airline Data Classification | |
| 5.1 | Loading Data | 15 |
| 5.2 | Performing a Trial Run | 16 |
| 5.3 | Extracting and Handling the Results | 19 |
| 5.4 | Web Interface | 20 |
| 5.5 | Variable Importances | 20 |
| 5.6 | Supported Output | 20 |
| 5.7 | Java Models | 21 |
| 5.8 | Grid Search for Model Comparison | 21 |
| 5.8.1 | Cartesian Grid Search | 21 |
| 5.8.2 | Random Grid Search | 23 |
| 6 | Model Parameters | 24 |
| 7 | Acknowledgments | 28 |
| 8 | References | 29 |
| 9 | Authors | 30 |