
H2O LLM Studio

Create your own large language models, build enterprise-grade GenAI solutions

H2O LLM Studio was created by our top Kaggle Grandmasters and provides organizations with a no-code fine-tuning framework to make their own custom state-of-the-art LLMs for enterprise applications.

The model is trained in three stages with different data mixes. The first stage consists of 90.6% web data, which gradually decreases to 81.7% in the second stage and 51.6% in the third. The first two stages contain the majority of the tokens, 4.6T and 1.35T respectively, while the third stage comprises 0.05T tokens.
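The stage figures above imply about 6T tokens overall; a small sketch (using only the percentages and token counts stated in the text) shows the web-data volume per stage:

```python
# Token counts (in trillions) and web-data fractions per training stage,
# taken directly from the figures quoted above.
stages = [
    ("stage 1", 4.6, 0.906),
    ("stage 2", 1.35, 0.817),
    ("stage 3", 0.05, 0.516),
]

total = sum(tokens for _, tokens, _ in stages)
for name, tokens, web_frac in stages:
    print(f"{name}: {tokens}T tokens, {tokens * web_frac:.2f}T from web data")
print(f"total: {total}T tokens")
```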

Train SLM foundation models

Deepspeed distributed training on GPU clusters

SLMs are cheaper to train and operate than LLMs (fewer GPUs)

SLMs are faster than LLMs (more tokens/sec, lower latency)

SLMs are more customizable than LLMs (faster to fine-tune)

Fine-tune state-of-the-art large language models using LLM Studio, a no-code GUI framework

Fine-tune SLMs for NLP use cases

Distill LLMs into SLMs (with H2O LLM Data Studio)

Instruction/chat fine-tuned custom GPTs for mobile and offline applications

Causal classification and regression fine-tuned SLMs for conversational use cases

DPO/IPO/KTO optimization and alignment

Fine-tuned SLMs can be more accurate than LLMs for specific use cases

Lower TCO with fine-tuned SLMs compared to using large general-purpose LLMs
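The DPO alignment objective listed above can be sketched in a few lines. This is a minimal illustration of the standard DPO loss on a single preference pair, not H2O LLM Studio's internal implementation; the function name and scalar interface are our own for illustration:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Each argument is the summed log-probability of the chosen or
    rejected response under the trainable policy or the frozen
    reference model; beta controls deviation from the reference.
    """
    # Log-ratio margin: how much more the policy prefers the chosen
    # response over the rejected one, relative to the reference model.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(beta * margin)), written stably as log1p(exp(-x))
    return math.log1p(math.exp(-beta * margin))
```

When the policy matches the reference model the margin is zero and the loss equals log 2; training pushes the margin positive, driving the loss toward zero.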