May 10th, 2023

Insights from AI for Good Hackathon: Using Machine Learning to Tackle Pollution

AI for Good hackathon

At H2O.ai, we believe technology can be a force for good, and we’re committed to leveraging its power to create a positive impact in the world. As part of this commitment, we recently organized an AI for Good Hackathon during the H2O World India event, where participants had the opportunity to apply their data science skills to a real-world use case related to pollution in India.

The hackathon ran from April 8th to April 16th and saw over 250 participants submit innovative solutions to combat pollution. Participants were given access to data sets related to air pollution and were asked to develop solutions using machine learning models.

Use Case: Predicting the Air Quality Index of Indian Cities using Machine Learning

Air pollutants in India

Air is what keeps humans alive. Monitoring it and understanding its quality is of immense importance to our well-being. 

In this hackathon, participants were given the opportunity to use their data analysis and machine learning skills to forecast the AQI for major Indian AQI stations for the next 28 days. The dataset had information about several air pollutants that directly affect the Air Quality Index.

Dataset Details

The dataset consisted of historical daily average levels of pollutants, including SO2, CO, and PM2.5, along with other important factors that affect the air quality index. The challenge in this competition was to forecast average AQI levels across different stations in India for the next 28 days. The training data covered 2 years of history for 40 Indian AQI stations and had the following attributes:

  • ID_Date: Unique identifier of state, stationid and date
  • StateCode: State where the AQI station is located
  • StationId: AQI station ID
  • Date: Date when the observations were recorded
  • PM2.5: Average PM2.5 pollutant level
  • PM10: Average PM10 pollutant level
  • O3: Average O3 pollutant level
  • CO: Average CO pollutant level
  • SO2: Average SO2 pollutant level
  • AQI: Average Air Quality Index – target variable
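To make the forecasting task concrete, here is a minimal sketch of how the 28-day horizon could be laid out per station with pandas. The column names come from the attribute list above; the helper name `make_forecast_frame`, the toy state codes, station IDs, and AQI values are made up for illustration and are not taken from the competition data.

```python
import pandas as pd


def make_forecast_frame(train: pd.DataFrame, horizon: int = 28) -> pd.DataFrame:
    """Return one row per station per future day, covering `horizon` days
    after each station's last observed date (hypothetical helper)."""
    last_dates = train.groupby(["StateCode", "StationId"], as_index=False)["Date"].max()
    frames = []
    for row in last_dates.itertuples(index=False):
        # Forecast dates start the day after the station's last observation.
        future_dates = pd.date_range(
            row.Date + pd.Timedelta(days=1), periods=horizon, freq="D"
        )
        frames.append(
            pd.DataFrame(
                {
                    "StateCode": row.StateCode,
                    "StationId": row.StationId,
                    "Date": future_dates,
                }
            )
        )
    return pd.concat(frames, ignore_index=True)


# Toy data with invented values, only to show the shape of the inputs.
toy = pd.DataFrame(
    {
        "StateCode": ["DL", "DL", "MH", "MH"],
        "StationId": ["S1", "S1", "S2", "S2"],
        "Date": pd.to_datetime(
            ["2023-01-01", "2023-01-02", "2023-01-01", "2023-01-03"]
        ),
        "AQI": [210.0, 198.0, 95.0, 102.0],
    }
)
future = make_forecast_frame(toy)
print(future.groupby("StationId")["Date"].agg(["min", "max", "count"]))
```

Each station gets its own 28 rows, so stations whose history ends on different days are handled independently.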

A sample submission file was also made available to specify the submission format. We used our in-house competition platform, H2O Olympics, to host the competition.

H2O Olympics apps preview

Evaluation Process

The solutions were evaluated based on their performance, completeness, and storytelling. The performance metric was based on the model’s final performance on an unseen test dataset. Completeness assessed the overall solution and its components, including pre-processing, visualizations, feature engineering, model tuning, and model explainability. Storytelling focused on the business impact and top insights derived from the dataset and the model. Bonus points were also awarded for using H2O.ai libraries during the competition. 

Evaluation criteria

The hackathon was judged by a panel of H2O.ai data scientists who provided valuable insights and feedback to help select the top-performing teams. It inspired participants to showcase their best work.

H2O Olympics hackathon judges panel

Top Teams

After a rigorous evaluation process, the top ten teams demonstrated the potential for machine learning to combat pollution and contribute to a cleaner environment. We were impressed by the participants’ creativity and knowledge in developing end-to-end solutions.

Hackathon Olympics Winners

The winners were announced during the H2O World India event.

Interview with the Winners

We had the pleasure of interviewing the top three hackathon winners, who shared their motivation for participating and their experiences during the hackathon. 

1st Place Winner: Dipayan Sarkar

Dipayan Sarkar, who finished first, mentioned that participating in the hackathon allowed him to push himself out of his comfort zone, network with other professionals, and stay up-to-date on the latest trends and best practices. The following is an overview of Dipayan’s approach:

Overview of H2O Olympics Hackathon winner Dipayan's approach

The challenge exposed him to new ideas and approaches and motivated him to participate in more challenges in the future.

2nd Place Winners: Sagar Thackar and Shuchita Mishra

The team combined comprehensive data preprocessing, exploratory analysis, analysis of time series features, and feature engineering, and then trained a series of different machine learning models to predict the AQI.
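The post does not list the exact features this team built, so the following is only a sketch of the kind of time series features commonly derived from daily pollutant data: a calendar month, lagged values, and rolling means per station. The function name `add_time_series_features`, the lag and window sizes, and the toy values are assumptions chosen for illustration.

```python
import pandas as pd


def add_time_series_features(
    df: pd.DataFrame,
    value_cols=("PM2.5", "PM10", "O3", "CO", "SO2"),
    lags=(1, 7),
    windows=(7,),
) -> pd.DataFrame:
    """Add month, lag, and rolling-mean features per station (illustrative)."""
    out = df.sort_values(["StationId", "Date"]).copy()
    out["month"] = out["Date"].dt.month
    for col in value_cols:
        for lag in lags:
            # Value of the same pollutant `lag` days earlier at the same station.
            out[f"{col}_lag{lag}"] = out.groupby("StationId")[col].shift(lag)
        for window in windows:
            # shift(1) first, so the rolling mean only sees strictly past days.
            out[f"{col}_rollmean{window}"] = out.groupby("StationId")[col].transform(
                lambda s: s.shift(1).rolling(window, min_periods=1).mean()
            )
    return out


# Toy single-station series with invented values 1.0 .. 10.0.
toy = pd.DataFrame(
    {
        "StationId": ["S1"] * 10,
        "Date": pd.date_range("2023-01-01", periods=10, freq="D"),
        "PM2.5": [float(i) for i in range(1, 11)],
    }
)
feats = add_time_series_features(toy, value_cols=("PM2.5",))
print(feats[["Date", "PM2.5", "PM2.5_lag1", "PM2.5_lag7", "PM2.5_rollmean7"]])
```

One caveat for a 28-day horizon: lags shorter than the horizon are not observed for the later forecast days, so they would need to be either restricted to lags of at least 28 days or filled in recursively from earlier predictions.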

The second place team's model results

The team also put together a really impressive H2O Wave Application to show the results and predictions. Live Demo: https://h2o-pipeline-aqi.herokuapp.com/site

3rd Place Winners: Nikhil Mishra and Nishchay Dhankar

The team used a really straightforward approach: a rule-based model using a mixture of the last seven days’ mean and median. The team also did feature engineering, adding the month of the year, lag features, and rolling features for all pollutants. They also set up the right validation strategy, using the last 28 days as the holdout and repeating the setup after removing the last 1, 2, and 3 months of data. The simple model worked well and gave impressive results. The team also developed an H2O Wave application.
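A rule of this kind, together with the holdout scheme, can be sketched roughly as follows. This is not the team's code: the equal 0.5 blend weight, the use of mean absolute error as the score, the flat forecast across all 28 days, and the function names `blend_forecast` and `backtest` are all assumptions, and the toy data is invented.

```python
import pandas as pd


def blend_forecast(history: pd.Series, window: int = 7, weight: float = 0.5) -> float:
    """Blend the mean and median of the last `window` observations.
    The 0.5 weight is an assumed default, not the team's setting."""
    recent = history.dropna().tail(window)
    return weight * recent.mean() + (1 - weight) * recent.median()


def backtest(df: pd.DataFrame, horizon: int = 28, months_removed=(0, 1, 2, 3)) -> dict:
    """Score the rule on the last `horizon` days, then again after dropping
    the most recent 1, 2, and 3 months. Returns MAE per fold (assumed metric)."""
    scores = {}
    end = df["Date"].max()
    for months in months_removed:
        fold = df[df["Date"] <= end - pd.DateOffset(months=months)]
        split = fold["Date"].max() - pd.Timedelta(days=horizon)
        train = fold[fold["Date"] <= split]
        holdout = fold[fold["Date"] > split]
        # One flat prediction per station, reused for every holdout day.
        preds = (
            train.sort_values("Date").groupby("StationId")["AQI"].apply(blend_forecast)
        )
        errors = (holdout["AQI"] - holdout["StationId"].map(preds)).abs()
        scores[months] = float(errors.mean())
    return scores


# Toy data: two stations with constant AQI, so the rule should be exact.
dates = pd.date_range("2022-01-01", periods=200, freq="D")
toy = pd.concat(
    [
        pd.DataFrame({"StationId": "S1", "Date": dates, "AQI": 100.0}),
        pd.DataFrame({"StationId": "S2", "Date": dates, "AQI": 50.0}),
    ],
    ignore_index=True,
)
scores = backtest(toy)
print(scores)
```

Evaluating the same rule on several shifted 28-day windows gives a sense of how stable it is across seasons, rather than relying on a single holdout period.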

The third place winner's app

You can watch the winners’ interview panel to understand their solutions in depth.

Conclusion

The AI for Good Hackathon allowed participants to showcase their skills and contribute to a worthy cause. We hope the solutions developed during the hackathon will inspire others to use machine learning to address environmental issues and create a better future for all of us.

About the Authors

Parul Pandey

Parul focuses on the intersection of H2O.ai, data science and community. She works as a Principal Data Scientist and is also a Kaggle Grandmaster in the Notebooks category.

Shivam Bansal

Shivam is a 3x Kaggle Grandmaster, a five-time winner of Kaggle’s Analytics / Data Science for Good Competition, and the winner of several other offline and online competitions. He holds a master's degree from the National University of Singapore, where he was valedictorian. He has extensive cross-industry, hands-on experience in building data science products and applications, and he brings a strong blend of technical and business skills with a practical, solution-driven approach. He supports various functions within the company, including engineering, pre-sales, and customer success.
