Dr. Tanya Berger-Wolf, co-founder and director of the AI-for-conservation nonprofit Wild Me, takes the stage at H2O World Sydney 2022 to discuss AI solutions for wildlife conservation, connecting data, people, and machines. AI can turn a massive collection of images into high-resolution information databases about wildlife, enabling scientific inquiry, conservation, and policy decisions.
Let me show you how AI can help. We’ve all seen the somewhat sad, challenging, and terrifying reports that have come out over the past several years from the UN and WWF: about a million species are threatened with extinction.
There’s a million species.
Is it a lot?
Is it a little?
Well, it depends.
How many species are out there?
Any ideas? Anybody yelling out numbers? What do you think? Everybody is worried about saying the wrong thing. Don’t worry.
I’m not talking about bacteria. Nine million is the estimated number of species in the world.
One million is about 10% of the world’s species, right? They’re facing extinction. They’re threatened with extinction. That’s an urgent problem.
The problem, moreover, is that we actually don’t know precisely what we’re losing and how fast. The International Union for Conservation of Nature (IUCN) Red List is the official international mechanism that tracks the biodiversity of the world, also nicknamed “the barometer of life.”
When we say that a species is endangered, it’s because the IUCN Red List species commission for that species has determined, according to certain indicators, that the species is endangered.
Of the 160,000 or so species that they’re tracking, out of nine million, what does “tracking” mean? For more than 20,000 of those species, the official conservation status is Data Deficient, meaning we don’t have any data: how many there are, where they are, what their range is, whether the population is going up, down, or stable. Nothing. Not even enough to say whether they’re endangered or vulnerable. For about 66,000 more, the population trend is unknown.
These are not obscure species. These are killer whales; they’re Data Deficient. Orcas, polar bears, iconic conservation species, and the population trend is unknown.
Even for species like the whale shark, the largest fish on Earth, where we presumably do know the global population size, the 2009 estimate was 103,000, with error bounds between 27,000 and 180,000.
You’re data scientists here, or some of you at least are. That’s a very scientific way of saying we have no clue.
The UN and UNESCO have said that biodiversity has a data crisis. That’s a call to arms for me. I’m a computer scientist, a data scientist, and an AI researcher. I’m not the one to go out and do the fieldwork necessary to protect these species. But when there is a data problem, that is something we can do something about.
What do we do about it? How do we get data on endangered species?
Well, images today are the world’s most abundant, readily available source of information about anything, from what you had (or will have) for lunch, to what animals you see in your backyard or on a whale watching tour here.
Raise your hand if you’ve taken a selfie or a photograph with your phone in the last week.
Okay, good. That’s why we have so many images.
In fact, just one citizen science platform for nature observation, iNaturalist, has more than 120 million images, observations of nature covering about 400,000 species. That’s incredible, from just one platform. Today we have images of the natural world coming directly from scientific and conservation projects, from big cameras and small cameras, from camera traps (automatic, motion-activated trail cameras), from autonomous vehicles underwater, on the ground, and in the air, as well as from all of us posting our pictures of nature on citizen science platforms or just on social media. There are, in fact, billions of images of nature out there. How do we take advantage of all of them? Well, we’re starting, in fact, to develop the methodology.
Machine learning is maturing, particularly around computer vision. We can take an image, find objects, and put a bounding box around them. We can classify the species. We can even identify individual animals; I’ll tell you a little bit about this. We can do pose estimation and attach a skeleton-like stick figure to show the pose automatically. We can even reconstruct the environment. We can do a lot.
How do we take all these millions of images and actually extract useful information from them? Can you count all the zebras in this image? I’ll give you three seconds. No? Okay, fine. We can. So we built a platform called WildBook, and by we, I mean the nonprofit Wild Me, of which I am a director, in addition to all the other titles that I have, including co-founder. We can take all these millions of images from all these different sources, and we built an AI pipeline that can find the ones that contain animals, find where the animals are in those pictures, and put a bounding box around each one, including the baby elephant hiding behind its mom or the zebras in the bush. We identify not only species but individual animals: Zip the zebra, Jo the giraffe, Tara the turtle, and Willie the whale.
The AI pipeline starts with an image, does basic species classification and object detection, then pose estimation, including left-side/right-side cues used to determine the viewpoint. Then it does background segmentation, and then determines which individuals are potentially identifiable.
Because you don’t want to send garbage; you don’t want to send blurred images to individual identification, where they would introduce too much noise. Then we use a variety of computer vision and machine learning approaches to identify individuals, particularly approaches that are robust to variation in pose, lighting, resolution, and camera.
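The staging of that pipeline (detect, classify, filter out low-quality crops, then identify, routing uncertain matches to a human) can be sketched roughly as follows. This is a toy, self-contained sketch: every function is a hard-coded stand-in for a trained computer-vision model, and none of these names come from the actual Wild Me codebase.

```python
# Toy sketch of the pipeline staging: detect -> classify -> quality
# filter -> individual ID, with uncertain matches routed to a human.
# All data and functions are stand-ins for trained models, not Wild Me APIs.

def run_pipeline(detections, min_quality=0.5, min_confidence=0.9):
    results = []
    for det in detections:                     # one entry per bounding box
        record = {
            "species": det["species"],         # species classification
            "bbox": det["bbox"],               # object detection output
            "viewpoint": det["viewpoint"],     # e.g. "left", from pose cues
            "individual": None,
            "needs_review": False,
        }
        # Don't send garbage: blurry crops would add noise to matching.
        if det["quality"] >= min_quality:
            match, conf = det["match"]         # stand-in for the matcher
            if conf >= min_confidence:
                record["individual"] = match   # confident: auto-accept
            else:
                record["needs_review"] = True  # uncertain: human curates
        results.append(record)
    return results

detections = [
    {"species": "zebra", "bbox": (10, 20, 80, 60), "viewpoint": "left",
     "quality": 0.9, "match": ("Zip", 0.97)},  # sharp crop, confident match
    {"species": "zebra", "bbox": (95, 30, 70, 55), "viewpoint": "right",
     "quality": 0.2, "match": ("?", 0.10)},    # too blurry: never matched
]
out = run_pipeline(detections)
print(out[0]["individual"], out[1]["individual"])  # Zip None
```

The key design point the talk makes is the middle branch: only images that pass the quality gate ever reach the matcher, and only high-confidence matches bypass human curation.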
The particular approach highlighted here also deals with aging, body distortion, and scarring. This is the same animal as a foal and as a pregnant female; that happens to animals, and we’re okay with that. Then, when the algorithm is sufficiently confident in an identification, we just update the record automatically.
When there is uncertainty, this is a human-machine partnership. We trade off the human effort needed to curate the results against accuracy, the overall accuracy being the integrity of the system. I could spend a whole talk on that.
We can do this for any striped, spotted, wrinkled, or notched animal, in the water or on the ground, using the shape of a whale’s fluke or the dorsal fin of a dolphin as an individual ID. That’s the biometric identification.
Then, with information on when and where each image was taken, you can really start using images to track animals, count them, and even derive their social network, because some of them are social species.
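As a concrete illustration of that last point, here is a minimal sketch of how time-and-place-stamped individual IDs can yield a social network: individuals photographed at the same place within a short time window get an edge weighted by their co-sighting count. The sighting records, location labels, and 30-minute window are made up for illustration; this is not Wild Me’s actual method.

```python
# Build a co-sighting network from identified, geotagged, timestamped
# sightings. Two individuals seen at the same place within a time
# window share an edge; the weight counts such co-sightings.
from collections import defaultdict
from itertools import combinations

def co_sighting_network(sightings, window_minutes=30):
    """sightings: list of (individual_id, location, timestamp_minutes)."""
    edges = defaultdict(int)
    for (a, loc_a, t_a), (b, loc_b, t_b) in combinations(sightings, 2):
        if a != b and loc_a == loc_b and abs(t_a - t_b) <= window_minutes:
            edges[tuple(sorted((a, b)))] += 1
    return dict(edges)

sightings = [
    ("Zip", "waterhole", 0),
    ("Jo", "waterhole", 10),    # within 30 min of Zip: co-sighted
    ("Tara", "lagoon", 5),      # different place: no edge
    ("Zip", "waterhole", 600),  # later revisit, alone
]
print(co_sighting_network(sightings))
# {('Jo', 'Zip'): 1}
```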
This is a page from the WildBook for cetaceans, whales and dolphins. This is Pinchy, a sperm whale. She’s the most sighted individual in that WildBook.
She lives around Dominica and has been sighted more than 600 times. She’s a ham; she clearly likes her picture taken. You can see there’s a lot of rich metadata. In fact, what you may also be able to see, if you squint, is that the first records here go back to 1995. Once the WildBook platform was live, people uploaded the historical data they already had for a lot of the animals. We can now really look at the longitudinal trends of the population and the species.
Flukebook, the WildBook for cetaceans, has more than two million photographs today; actually, it is approaching three million by now, with almost a thousand uniquely identified individual whales and dolphins. It’s the official platform for the U.S. Department of the Interior’s entire North American catalog of whales.
If you look, here are the Australian sightings in Flukebook. You can see that they’re not only coastal; some of them travel quite a bit. In fact, I checked: those are southern right whales, the ones that travel, plus a lot of bottlenose dolphin sightings. I also checked that off the coast of Sydney there have not been many whale sightings in the last year, which surprised me, because I know there are a lot of whales here, southern right whales in fact.
If you go on a whale watching tour, or if you have photographs, you can upload them directly to WildBook, to Flukebook. Or you can, in fact, tweet your photograph at the Tweet-a-Whale service, and it will automatically send the picture through the whole identification pipeline for that particular whale and add it to Flukebook.
That’s not the only social media service we have directly engaging people to contribute their images to science and conservation. We built a bot, an intelligent agent, which takes publicly posted videos of animals on social media and finds the ones that contain an animal species of interest, in this case a whale shark, identifies it, and adds the information to the appropriate page of WildBook. It then generates and posts a reply in the comments of that post: “Hey, at 2 minutes 46 seconds we found this whale shark, m max 700. Here’s everything we know about it.”
People respond. Everybody responds. The most common response that we see is like, “Whoa! This is amazing! You are an AI.”
Yes, we are. Well, we are not, but the bot is.
The whole identification pipeline behind it is AI doing all this work. But the bot is the one connecting people with the animal they saw on their vacation somewhere in Cancun, bringing them that connection, making a very, very human connection with the animal and making them realize that they can contribute to science and conservation just by taking a picture, just by going on vacation and seeing this very, very cool animal.
Then it adds this picture. At the bottom of the page, you can see here all the contributors of data to this specific individual.
Because whale sharks are a global species, there is not one project, not one conservation organization, that has complete information about even a single whale shark. They travel more than 5,000 miles. It is only by bringing these data together, all these pieces of information, all these pixels, that we can build the global picture of this global species.
AI is connecting these pixels, and the people, and the organizations, and the pieces of data, to help build that picture.
Now, the WildBook for sharks, SharkBook AI, has more than 17,000 uniquely identified individuals. More than 13,000 of them, in fact, are whale sharks. This information comes from about 9,000 citizen scientists who directly contributed data to the website, about 300 conservation and scientific organizations, and one very intelligent agent bringing data from social media.
In fact, from that one social media source alone, there were more sightings of whale sharks in 2018 than from all the human contributors combined. We’ve checked: 96% of those were not replicates of the direct human contributions, and more than 70% were of individuals that did not appear in the human observations during that time period. It is a valuable source of information that is completely untapped today.
That is a source we can actually use to help fill the biodiversity data gap. In fact, the IUCN Red List entry for whale sharks was already updated in 2016 using data from SharkBook AI, called whaleshark.org at the time. The Species Commission for whale sharks reviewed the data and used the WildBook data not only to update the global population size estimates, but to change the conservation status of the species from Vulnerable to Endangered, and the population trend from stable to decreasing. Not because the species is doing worse, but because we now have the right data and the right indicators to make that assessment. That changes a lot, because the change in conservation status means a whole new set of policies kicks in to protect the species.
There is a whole new set of resources now being allocated to protect this species. The most comprehensive study on the biology of whale sharks was also published using data from WildBook, from SharkBook AI.
Published in 2017, it was authored by 37 authors, most of whom met through the pages of WildBook. That section at the bottom, showing who contributed data to a particular page, is how they met. By putting that data together, they were able to understand migration patterns and many, many other things about the biology of this magnificent species.
There is also a study that just came out on the assessment of the status of the species, also co-authored by many authors who met through the pages of WildBook.
Even though I’m not a co-author, this is one of the papers I’m most proud of in my entire career, because it is the work we do in AI that enabled this publication, this understanding.
We have platforms for more than 60 species today, from marine to terrestrial, covering the entire globe. My favorite species, which slid off the page a little bit, are the weedy and leafy seadragons. These are Australian species, right off the coast of Victoria in the south, led by the organization SeadragonSearch; they’re the ones who initiated that WildBook. They’re indicator species for the reef and the health of the ocean. They’re seahorses; they’re just alien-looking, amazing, beautiful, and wonderful. And yes, individually identifiable.
One of the newest species added to any WildBook are orcas: killer whales are now part of Flukebook. In the few months they’ve been part of the WildBook, we already have thousands and thousands of observations, many of them added historically. We’re hoping that very, very soon the Species Commission for killer whales for the IUCN Red List will be able to reassess the status of the species and change it from Data Deficient to something that reflects its true conservation status.
The same technology that was used to create these platforms was also used to enable the first ever full census of an entire species–the endangered Grévy’s zebra–using photographs from ordinary people just driving around the country of Kenya for two days.
In January 2016, hundreds of people, from school kids and park rangers to tourists with telephoto cameras and the U.S. ambassador to Kenya at the time, Robert Godec, right there in the middle, took more than 40,000 images of this beautiful animal. Ninety-five percent of the population is in Kenya. Using our entire image-analysis AI stack, we produced the estimate: because we’re able to identify individual animals, a very simple piece of statistics allows you to determine the population size: 2,352, plus or minus 72.
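The “very simple piece of statistics” is in the mark-recapture family: treat the individuals identified on day one as “marked,” and count how many of them reappear in day two’s photographs. A minimal sketch using the classic Lincoln-Petersen estimator in Chapman’s bias-corrected form, with illustrative made-up counts (the actual census used its own statistical model, not necessarily this two-sample version):

```python
import math

def chapman_estimate(n1, n2, m):
    """Chapman's variant of the Lincoln-Petersen mark-recapture estimator.

    n1: individuals identified on day 1
    n2: individuals identified on day 2
    m : individuals photographed on both days ("recaptures")
    Returns (population estimate, approximate standard error).
    """
    n_hat = (n1 + 1) * (n2 + 1) / (m + 1) - 1
    var = ((n1 + 1) * (n2 + 1) * (n1 - m) * (n2 - m)) / ((m + 1) ** 2 * (m + 2))
    return n_hat, math.sqrt(var)

# Illustrative numbers only, not the actual census data:
n_hat, se = chapman_estimate(n1=1500, n2=1600, m=1000)
print(round(n_hat), round(se))  # 2400 27
```

The intuition: if half of day two’s animals were already seen on day one, the two samples together have probably covered most of the population, so the estimate (and its error bar) tightens as the recapture fraction grows.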
The confidence bounds were so tight that the Kenya Wildlife Service did a complete 180, from “this is not how we count our zebras” to “oh yeah, this is exactly how we’re going to count our zebras, and now you’re on the hook to do this event every two years.”
So we did.
In January 2018, more people took more pictures, and the population size estimate was 2,800, plus or minus 150. If you’re keeping track: yes, it is growing, and it’s growing enough, with confidence bounds tight enough, that there is true separation. We can confidently say the population is growing.
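The “true separation” claim can be checked directly from the plus-or-minus bounds quoted in the talk, treating each estimate simply as an interval and verifying the two intervals do not overlap:

```python
# Check that the two census intervals, built from the quoted
# plus/minus bounds, share no points (i.e., true separation).

def interval(estimate, half_width):
    return (estimate - half_width, estimate + half_width)

def separated(a, b):
    # True when the intervals do not overlap at all.
    return a[1] < b[0] or b[1] < a[0]

census_2016 = interval(2352, 72)   # (2280, 2424)
census_2018 = interval(2800, 150)  # (2650, 2950)
print(separated(census_2016, census_2018))  # True
```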
The event was also repeated in January 2020, right before the pandemic. We’re holding the results back a little bit, because right now Kenya is in the middle of the worst drought in recent memory, and many, many animals are dying, both wild and domestic, and humans are also feeling the effects of the drought acutely.
Right now, people on the ground are trying to save the animals. We skipped the event during the pandemic, and we’re still debating, or rather the conservation organizations that lead the event are debating, whether this would be the right time, with the animals in distress, to have people just driving around taking pictures. Maybe the local organizations will hold a very, very scaled-down event this year.
Then there’s the data, and the fact that everybody participated. The tagline for the event was “Kenyans Powering Conservation,” because among the participants there was everybody; our youngest photographer was three years old. All you have to do is take a picture. People who had never been to a conservancy, or were not interested in seeing wild animals, were able to participate. Kids who were normally not allowed to enter these conservancies, because they’re for tourists, were able to participate.
With all of this data and all of this participation, people saw the event and the whole process; they contributed the data and saw what was going to happen with it, because, working with the local nonprofits, we did a lot of education, a lot of training, a lot of explaining. The local organizations, the Grévy’s Zebra Trust and Wildlife Direct, put out a brochure explaining what the event is about, what the technology does, what will happen with the data, and how the government will make decisions based on that data. We drove up and down the country explaining this whole process in English, translated into Swahili, and in many cases translated into Samburu, the language of the northern Maasai, just to make sure that everybody could participate.
We created a whole protocol so that people without GPS-enabled cameras could still contribute pictures with GPS tags attached to them, with a crazy system of QR codes and cards and everything.
We also trained local conservancy staff to use WildBook, to build capacity and democratize AI, very much in the spirit of H2O, to make sure that we’re not the bottleneck: not the organization that takes the data, does some magic with it, and then delivers black-box results that policy then has to act on. No. All of that process has to be done in Kenya, by Kenyans. It is because of that trust, because of that partnership, that the Kenya Wildlife Service, together with the governors of the six counties that have Grévy’s zebras, issued the proclamation of the Grévy’s Zebra Management Plan, the endangered species management plan that really put money, resources, land, and policy protection behind this species, based on those data and on this process.
Simon Gitau, the Associate Director of Kenya Wildlife Service, said the sentence that nearly made me cry.
He said, “this shows the power of citizen science and machine learning for conservation.”
There are three firsts in that sentence. It was the first time that the Kenya Wildlife Service used a protocol, a process, that was not their normal fly-over or drive-by sample counting. It was the first time they opened the process to everybody in the country. And it was the first time they used the words “machine learning” in a sentence. That’s fantastic.
The IUCN Red List entry for Grévy’s zebra was also updated based on the data from this event, and it will be updated again when new data arrives. That’s fantastic.
This is the real, true impact of AI on wildlife conservation: filling that data gap.
But what’s gold for scientists and conservation managers is unfortunately also highly useful data for wildlife criminals and poachers.
Geotagged images of animals, posted on social media in particular, are beacons for poachers, who can get to the location of an elephant whose picture was posted on social media within hours, unfortunately with really tragic outcomes for the species.
There is now a growing call, and a growing understanding, that we need to protect this data, that in some cases it is as sensitive as banking data. We’ve done some work outlining the challenges, in some cases research challenges; we actually don’t know exactly how to protect this data. At Wild Me, the data is stored in a highly secure part of the cloud, we have implemented sophisticated context-aware dynamic access control, and we are working to understand the data leakage that occurs when you aggregate this kind of data.
The other aspect of using data this way, for extracting biodiversity information, is bias. There are many, many sources of bias in these kinds of data. There is the real population; let’s just focus on figuring out the population size. There is geospatial bias: being at the right place at the right time, whether you are a human, a trail camera, or a drone, and being able to see the species at all. Then there is the ability to take a picture of the species, a good picture for that matter. The photo-taking bias for humans is huge; people really vary in what they take pictures of, even on the same two-hour safari tour.
Then there is the bias of what people post on social media. If we are going to use social media photographs: which social media? What are the protocols that allow us to access data on that platform? And how are we using the data and estimating from it? What we really care about is how to come up with unbiased estimates, and we put a lot of effort into researching and implementing these models.
Hopefully, over the last twenty-some minutes, I’ve given you a taste of how AI and data science can enable science, conservation, and public engagement at large scale and high resolution, over time, space, and individuals, by connecting people, data, and vast geographic areas to really put together this global picture of biodiversity. I have to thank the many, many people and organizations that support this effort, the thousands who actually take this and implement conservation policy in the field, and our financial supporters, including not least H2O.ai.