Note: this is a community blog post by Team Titans, one of the H2O.ai Wildfire Challenge winners. You can check out their app here.
Forest fires have been getting worse in recent years. According to a report by the WWF, the duration of fire seasons across the globe has increased by 19% on average. In some places the fire season now starts earlier, and it lasts longer over extensive areas. H2O.ai gave us an opportunity to use data to analyse wildfire events and propose an AI-based solution. How AI can help before, during, and after wildfire events is an interesting research area.
Trend analysis can help in wildfire prediction, but the external factors involved are continuously changing, and climate change has made them more unpredictable than ever. A model built on the dataset of one specific region also cannot be relied on to predict wildfires in other regions, because the occurrence of wildfire depends on many factors. We therefore decided to try an offbeat approach and built an NLP app for the wildfire challenge.
We developed a PyPI package, EnvBert, and open-sourced it. It is an easy-to-use Python library built on top of BERT models to identify essential environmental data as part of due diligence in environmental site assessments.
PyPI Link: https://pypi.org/project/EnvBert/
pip install EnvBert
DistilBERT is a small, fast, cheap, and light Transformer model trained by distilling BERT base. It has 40% fewer parameters than bert-base-uncased and runs 60% faster while preserving over 95% of BERT’s performance as measured on the GLUE language understanding benchmark.
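As a quick sanity check on the "40% fewer parameters" figure, the arithmetic can be sketched as below. The parameter counts are the commonly cited approximate values (about 110 million for BERT base and about 66 million for DistilBERT); they are assumptions brought in for illustration and are not stated in this post.

```python
# Commonly cited approximate parameter counts, in millions.
# These are assumed round figures, not values taken from this post.
BERT_BASE_PARAMS_M = 110
DISTILBERT_PARAMS_M = 66


def fraction_fewer(small: float, large: float) -> float:
    """Return the fractional reduction when going from `large` to `small`."""
    return 1 - small / large


reduction = fraction_fewer(DISTILBERT_PARAMS_M, BERT_BASE_PARAMS_M)
print(f"DistilBERT has about {reduction:.0%} fewer parameters than BERT base")
```

With these rounded counts the reduction works out to roughly 40%, which matches the figure quoted above.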
We designed an interface where users enter a location and a date range. Based on this input, the deep learning models extract data about the cause, severity, and spread of the wildfire, as well as information about the extent of contamination and remediation activities.
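To make the input side of this flow concrete, here is a minimal sketch of how a location and date range might be validated and used to select the documents that are then passed to the models. It is not the app's actual code: `Report`, `WildfireQuery`, and `select_reports` are hypothetical names, and the model step itself is left out.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class Report:
    """A hypothetical text document tagged with a place and a date."""
    location: str
    published: date
    text: str


@dataclass(frozen=True)
class WildfireQuery:
    """Hypothetical container for the user's location and date range."""
    location: str
    start: date
    end: date

    def __post_init__(self) -> None:
        # Reject empty locations and inverted date ranges early.
        if not self.location.strip():
            raise ValueError("location must not be empty")
        if self.start > self.end:
            raise ValueError("start date must not be after end date")

    def matches(self, report: Report) -> bool:
        # Case-insensitive location match plus an inclusive date check.
        same_place = report.location.casefold() == self.location.strip().casefold()
        return same_place and self.start <= report.published <= self.end


def select_reports(query: WildfireQuery, reports: List[Report]) -> List[Report]:
    """Keep only the reports that fall inside the query's place and dates."""
    return [r for r in reports if query.matches(r)]
```

In a sketch like this, the selected reports would be the text handed to the NLP models; validating the query up front keeps malformed input from reaching the more expensive model step.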
Our solution focuses on:
Deepak John Reji is an NLP practitioner with experience in designing and developing solutions for data science products. He loves working with environmental and natural assets data, and he creates videos on prototypes he experiments with, NLP tutorials, and podcasts with industry experts in sustainability and AI. He is an open-source contributor. Recently he has been researching “Bias & Fairness in AI Models” and “AI in Environment Due Diligence”; for the former he developed a package called Dbias, and for the latter he co-trained a model named “EnvBert” on environmental data.
Afreen Aman is an environmental data scientist who works on integrating data science and data analytics solutions in the areas of environment, climate change, sustainability, and sustainable finance. She is currently developing and managing sustainable digital solutions that use NLP for ESG and GHG data analysis, and she has developed AI solutions and prototypes for sustainable finance. She has presented posters and papers at various national and international conferences and has published papers in international journals. She co-trained BERT on environmental data and published the Python package “EnvBert” on PyPI.