Return to page

Research Papers

H2O-Danube-1.8B Technical Report
by Philipp Singer, Pascal Pfeiffer, Yauhen Babakhin, Maximilian Jeblick, Nischay Dhankhar, Gabor Fodor, Sri Satish Ambati, Team | Technical Report | January 30, 2024

We present H2O-Danube-1.8B, a 1.8B language model trained on 1T tokens following the core principles of LLama 2 and Mistral. We leverage and refine various techniques for pre-training large language models. Although our model is trained on significantly fewer total tokens compared to reference models of similar size, it exhibits highly competitive metrics across a multitude of benchmarks. We additionally release a chat model trained with supervised fine-tuning followed by direct preference optimization. We make H2O-Danube-1.8B openly available under Apache 2.0 license further democratizing LLMs to a wider audience economically.

Read more
H2O Open Ecosystem for State-of-the-art Large Language Models
by Arno Candel, Jon McKinney, Philipp Singer, Pascal Pfeiffer, Maximilian Jeblick, Chun Ming Lee, Marcos V. Conde, Team | EMNLP Conference | December 06, 2023

Large Language Models (LLMs) represent a revolution in AI. However, they also pose many significant risks, such as the presence of biased, private, copyrighted or harmful text. For this reason we need open, transparent and safe solutions. We introduce a complete open-source ecosystem for developing and testing LLMs. The goal of this project is to boost open alternatives to closed-source approaches. We release h2oGPT, a family of fine-tuned LLMs of diverse sizes. We also introduce H2O LLM Studio, a framework and no-code GUI designed for efficient fine-tuning, evaluation, and deployment of LLMs using the most recent state-of-the-art techniques. Our code and models are fully open-source. We believe this work helps to boost AI development and make it more accessible, efficient and trustworthy. The demo is available at:

Read more
Rapid Deforestation and Burned Area Detection using Deep Multimodal Learning on Satellite Imagery
by Gabor Fodor, Marcos V. Conde, Team | CVPR MultiEarth Workshop | July 10, 2023

Deforestation estimation and fire detection in the Amazon forest poses a significant challenge due to the vast size of the area and the limited accessibility. However, these are crucial problems that lead to severe environmental consequences, including climate change, global warming, and biodiversity loss. To effectively address this problem, multimodal satellite imagery and remote sensing offer a promising solution for estimating deforestation and detecting wildfire in the Amazonia region. This research paper introduces a new curated dataset and a deep learning-based approach to solve these problems using convolutional neural networks (CNNs) and comprehensive data processing techniques. Our dataset includes curated images and diverse channel bands from Sentinel, Landsat, VIIRS, and MODIS satellites. We design the dataset considering different spatial and temporal resolution requirements. Our method successfully achieves high-precision deforestation estimation and burned area detection on unseen images from the region.

Read more
h2oGPT: Democratizing Large Language Models
by Arno Candel, Jon McKinney, Philipp Singer, Pascal Pfeiffer, Maximilian Jeblick, Prithvi Prabhu, Jeff Gambera, Mark Landry, Shivam Bansal, Ryan Chesler, Chun Ming Lee, Marcos V. Conde, Pasha Stetsenko, Olivier Grellier, SriSatish Ambati, Team | Technical Report | June 13, 2023

Applications built on top of Large Language Models (LLMs) such as GPT-4 represent a revolution in AI due to their human-level capabilities in natural language processing. However, they also pose many significant risks such as the presence of biased, private, or harmful text, and the unauthorized inclusion of copyrighted material. We introduce h2oGPT, a suite of open-source code repositories for the creation and use of LLMs based on Generative Pretrained Transformers (GPTs). The goal of this project is to create the world's best truly open-source alternative to closed-source approaches. In collaboration with and as part of the incredible and unstoppable open-source community, we open-source several fine-tuned h2oGPT models from 7 to 40 Billion parameters, ready for commercial use under fully permissive Apache 2.0 licenses. Included in our release is 100\% private document search using natural language. Open-source language models help boost AI development and make it more accessible and trustworthy. They lower entry hurdles, allowing people and groups to tailor these models to their needs. This openness increases innovation, transparency, and fairness. An open-source strategy is needed to share AI benefits fairly, and this http URL will continue to democratize AI and LLMs.

Read more
A Brief Overview of AI Governance for Responsible Machine Learning Systems
by Navdeep Gill, Abhishek Mathur, Marcos V. Conde, Team | NeurIPS 2022 Trustworthy and Socially Responsible Machine Learning (TSRML) Workshop | November 21, 2022

Organizations of all sizes, across all industries and domains are leveraging artificial intelligence (AI) technologies to solve some of their biggest challenges around operations, customer experience, and much more. However, due to the probabilistic nature of AI, the risks associated with it are far greater than traditional technologies. Research has shown that these risks can range anywhere from regulatory, compliance, reputational, and user trust, to financial and even societal risks. Depending on the nature and size of the organization, AI technologies can pose a significant risk, if not used in a responsible way. This position paper seeks to present a brief introduction to AI governance, which is a framework designed to oversee the responsible use of AI with the goal of preventing and mitigating risks. Having such a framework will not only manage risks but also gain maximum value out of AI projects and develop consistency for organization-wide adoption of AI.

Read more
General Image Descriptors for Open World Image Retrieval using ViT CLIP
by Ivan Aerlic, Simon Jégou, Marcos V. | ECCV 2022 Instance-Level Recognition Workshop | October 20, 2022

The Google Universal Image Embedding (GUIE) Challenge is one of the first competitions in multi-domain image representations in the wild, covering a wide distribution of objects: landmarks, artwork, food, etc. This is a fundamental computer vision problem with notable applications in image retrieval, search engines and e-commerce. In this work, we explain our 4th place solution to the GUIE Challenge, and our "bag of tricks" to fine-tune zero-shot Vision Transformers (ViT) pre-trained using CLIP.

Read more