The aim of the project is to predict the probability of wildfire occurrence in Turkey for each month of 2020. These predictions are intended to support more intensive monitoring of likely fire areas and a faster response once a fire starts. A further aim is to derive generalizable relationships by interpreting the model outputs and the importance the model assigns to each variable it uses.
The wildfires that started in Turkey in August 2021 spread over very large areas and, due to a lack of intervention, destroyed large areas and the living things in them. They became a major topic of debate throughout the country. The public and politicians often complained about this technical inadequacy and called for improvements. The project set out to see whether wildfire occurrence could be estimated across the whole country and, if so, how successful the results would be.
The goal of the project is to estimate, for each month of 2020, the probability of a wildfire occurring in each grid cell, where the grid divides the area of Turkey into cells of 1 degree of latitude by 1 degree of longitude. The H2O Competition Overview defines this task as “Predicting the behavior of wildfires”.
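The 1-degree gridding described here can be illustrated with a minimal sketch. It is not the competition code: the `grid_cell` helper, the sample coordinates, and the idea of counting observations per cell and month are assumptions made only for illustration.

```python
import math
from collections import Counter


def grid_cell(lat, lon, size=1.0):
    """Return the south-west corner of the grid cell that contains (lat, lon).

    Hypothetical helper: with size=1.0, every point maps to the
    1-degree by 1-degree cell it falls into.
    """
    return (math.floor(lat / size) * size, math.floor(lon / size) * size)


# Made-up fire observations: (latitude, longitude, month).
observations = [
    (36.9, 30.7, 8),
    (36.2, 30.1, 8),
    (37.1, 28.3, 8),
    (38.4, 27.1, 7),
]

# Count observations per (cell, month) pair.
fires_per_cell_month = Counter(
    (grid_cell(lat, lon), month) for lat, lon, month in observations
)

# A binary "fire occurred" label per (cell, month), the kind of target
# a probability model could be trained on.
fire_occurred = {key: count > 0 for key, count in fires_per_cell_month.items()}
```

Here the first two observations fall into the same cell, so that cell has a count of two for month 8. Cells and months with no observations are simply absent from the counter and would need to be added as negative examples before training.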
Because the project estimates the probability of future wildfire occurrence in specific areas, the following groups and individuals can benefit from it:
Anıl Öztürk entered the competition alone. Anıl is a Machine Learning Engineer from Istanbul, Turkey, with a Master’s Degree in Computer Engineering. He has mostly worked on tabular data, deep learning and deep reinforcement learning. He is passionate about following the state of the art, competing on Kaggle and whining about stochasticity. He tries to gain experience in different domains by participating in as many local and global competitions as possible. This competition was especially interesting for Anıl because he had no prior experience with geospatial data.
Historical active fire and temperature observations were used as features. The model is LightGBM, a gradient-boosted decision tree framework. The following factors influenced the choice of this algorithm:
The competition also asked for an interactive web application. I designed an interface where users can run all the analyses and adjust the model. I tried to use self-explanatory visualizations whenever possible. From within the application, users can access non-technical details of the project, dataset explanations, dataset analysis graphics, model predictions and visualizations of those predictions, and a model evaluation and metrics screen.
When I examined the feedback after the first-submission stage, I saw that accuracy was one of the components with a low impact on the score. All scores and feedback statements focused on usability, simplicity, and explainability. That’s why I paid the most attention to the following during the competition:
This competition has been very beneficial for me: it taught me to consider corporate and stakeholder requirements while designing a solution. In most competitions, these real-life requirements are easy to overlook when the only goal is to maximize a score metric.