We’re now modeling over 80,000 taxa! A conversation with Alex Shepard

We released a new computer vision model today. It has 80,962 taxa, up from 79,797. This new model (v2.8) was trained on data exported on September 17th. We’re celebrating crossing the 80,000-taxon milestone with a conversation with iNaturalist machine learning lead Alex Shepard.

Alex, you’ve been with iNaturalist since 2014. How did you first get interested in computers?

I have an undergraduate degree in history and a master’s degree in fine art, but I come from a family that worked at Apple, so as a kid I always had access to computers. I taught myself to program not because I wanted a career as an engineer but because I’ve always been around computers and loved them. I’ve always found this stuff fascinating.

How did you first get iNaturalist involved with machine learning and computer vision?

I was excited for a long time about some kind of statistic or tool to help the community make identifications. We had a tool called the Identotron that let you choose species based on things like color and location - a sort of primitive trait-based matching. I was looking for ways to improve that and was coming up with pretty dumb approaches. For example, I was looking into clustering the kinds of species people were seeing to try to predict what they might find next. Around 2016, there were breakthroughs in computer vision that caught my attention. Before that, there were models that could do things like recognize hand-drawn digits - very simple recognition tasks - or recognize a face from the exact same pose, but they couldn’t recognize that face from other angles. Deep Convolutional Nets were the first kinds of models that could do that.

How did you go from those explorations to iNaturalist releasing its first computer vision model?

There were a lot of things that fell into place at the right time. People were just releasing open source tools for building on these Deep Convolutional Nets, and I had access to experiment with them. At the same time, NVIDIA very generously donated a GPU that we could use to train models beyond the experimental proof-of-concept phase. We also met Grant Van Horn and colleagues at Visipedia, who brought a huge wealth of formal background in machine learning and statistics. Beyond the fact that we trained our early models on their code, their help and everything they taught us was invaluable - ranging from how to train full models, to training performant models at scale, to how to pick the right photos, to how to know when we’re done training.

How have these machine learning tools integrated into the iNaturalist community and platform?

We really started with the goal of easing the burden on the community of identifying all these photographs. As a tool, computer vision gives the identifier a great starting place that’s in the right ballpark. On the other hand, the model often makes predictions that aren’t correct. It’s just meant to provide suggestions, but it does enter the whole conversation about things like data quality. This is an ongoing challenge for us. I think iNaturalist data quality continues to be really good on the whole, but we have to be vigilant about data quality and how the Computer Vision Model feeds into that.

What are we not currently doing that you’re excited about?

There’s a bunch of technical things that I’m excited about and want to get to but haven’t yet. For example, I think bounding boxes could really help improve accuracy, and there are interesting roles for the community in helping us do that well. I’m really interested in model explainability. For example, can a model not just suggest what’s in the photo but tell you why? And I’m always interested in finding ways to improve the accuracy of our models and the speed at which we train. Our dataset size is growing rapidly, so we have to keep finding ways to increase our capacity to train, ranging from tweaking what we have to exploring entirely different kinds of training architectures. I’m interested in how the Computer Vision Model and the Geomodel can better work together to offer suggestions. Likewise, can we model other parts of iNaturalist in ways that help us better understand iNaturalist, or help us understand species in ways that make the platform more engaging? Lastly, I’m interested in these models teaching us new things, so we’re not just using the model to represent or synthesize human knowledge already in the community but using these models to actually teach us something new about the natural world.

The model released today has over 80,000 taxa, how has the model grown over time?

Our first model was released in 2017 and had about 13,000 species in it. In 2019, we had a Microsoft AI for Earth grant to make improvements to how we made predictions up and down the taxonomy. This is the model we used when we released the Seek in-camera suggestions. In 2021, we got an Amazon Machine Learning Research Award and used this opportunity to figure out how to speed up training, reducing the turnaround time to get new information from the community reflected in the model. The first model using this approach in 2022 (1.0) had 55 thousand taxa. Since then, in monthly model releases, we’ve been adding 1 to 2 thousand species each time, leading up to today’s model with over 80 thousand taxa. We’ve been training all of these models on donated NVIDIA GPUs. Thank you NVIDIA!

Is this sustainable?

Good question. If you think about the model as a die where there’s some probability associated with each side, we now have an 80-thousand-sided die. Every time we add a side it’s slightly harder to get the right answer just because there are more choices. I’m really proud of the fact that we’ve been able to keep accuracy high despite adding so many new choices. I think this has a lot to do with the high quality of the iNaturalist data we’re training on. But we don’t know if there are limits we will hit down the road. We’re constantly working to squeeze a bit of accuracy out of the model or shave some time off of training runs. We’re definitely reaching the limits of our current set of hardware. There’s always work to do.
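The die analogy can be sketched numerically. In a softmax classifier, every class added to the output competes for probability mass, so even when the model’s raw score for the true class is unchanged, its predicted probability drops as the class count grows. A minimal illustration (the logit values are hypothetical, and real models don’t score all wrong classes identically):

```python
import math

def softmax_top_prob(true_logit, other_logit, n_classes):
    """Probability assigned to the true class when the remaining
    n_classes - 1 classes all share the same (lower) logit."""
    num = math.exp(true_logit)
    den = num + (n_classes - 1) * math.exp(other_logit)
    return num / den

# Even when the model strongly favors the true class (logit 5 vs 0),
# each extra class siphons off a little probability mass.
for n in (13_000, 55_000, 80_000):
    print(n, round(softmax_top_prob(5.0, 0.0, n), 3))
```

In practice a well-trained model separates the logits far more sharply than this toy example, which is one reason accuracy can stay high even as the class count grows.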

Final thoughts?

Sometimes we talk about adding taxa to the model as adding rarer and rarer species, but I like to think of this as adding common species from increasingly overlooked parts of the globe. This makes me happy because I know as the model grows we’re making iNaturalist more useful and available to more and more people from around the world.

On that note, here is a sample of new species added to v2.8!

Posted on October 20, 2023 05:23 AM by loarie

Comments

You talk of 80,000 taxa, but surely it is more complex than that? For instance, a plant taxon will contain a mix of habitat, habit, leaf, flower, fruit, bark and new-growth photos (and many others) - sometimes in isolation, but often in combinations. And for moths, we have imago, larva, pupa and egg, but also potentially the larval host plant, or associated parasites and diseases. So the training is really on 80,000 taxa multiplied by several taxon features. Does the model implicitly segregate these, or does it just organically evolve as the model is fed named pictures? (If a future AI is going to explain why, it will need to be aware of these?)
Adding geography (Geomodel) seems essential, but would a seasonal (please note: there are two hemispheres) or diel model not also be useful?

Posted by tonyrebelo 7 months ago

Thank you. My contribution this time
https://www.inaturalist.org/taxa/595553-Trachyandra-chlamydophylla

113 obs, but the target I usually expect is around 60. I wonder why this one took twice as many?

Posted by dianastuder 7 months ago

that's GREAT! AI empower Nature

Posted by mebox 7 months ago

I think the "taxa" that number >80,000 refers to the endpoints or classifications in the training and output datasets. The pics drawn from those classifications/known IDs will certainly display other features (habitat, host, etc.), but the CV model isn't identifying each of those features specifically as such. I would think that a more complex model could potentially be trained (e.g. having separate classifications for larvae and adults of the same species) if there was enough training data.

Posted by cthawley 7 months ago
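As a rough illustration of the idea in the comment above (purely hypothetical - this is not how iNaturalist builds its training labels), separate life-stage classes could be derived by keying each label on a (taxon, stage) pair rather than on the taxon alone:

```python
# Hypothetical sketch: building a class list where life stages of the
# same species become separate training labels, assuming each photo
# record carries a taxon name and an optional stage annotation.

def build_classes(photos):
    """Return the sorted list of distinct (taxon, stage) labels."""
    labels = set()
    for p in photos:
        labels.add((p["taxon"], p.get("stage", "unknown")))
    return sorted(labels)

photos = [
    {"taxon": "Danaus plexippus", "stage": "adult"},
    {"taxon": "Danaus plexippus", "stage": "larva"},
    {"taxon": "Quercus alba"},
]
print(build_classes(photos))  # one species yields two classes here
```

The trade-off is that splitting classes this way divides the training photos among more labels, so each class needs enough annotated examples to train well.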

@alexshepard what do you think of setting up some sort of bilateral arrangement with the Wiki Commons project? There are many photos on iNaturalist that are not on Commons, and vice versa. I know some have license compatibility issues, but there are still many that are compatible.

Anyway, I was thinking: once an observation is confirmed ("research grade") and the licenses of the photos are compatible, they could be automatically uploaded to the correct category on Wiki Commons. And going the other way, when new photos are uploaded to a specific category (e.g. Category:Genus_species or even just Category:Genus) on Commons, they could be automatically uploaded to iNaturalist as observations, which the iNaturalist community could then confirm. WikiCommons also has lots of metadata that you can import as well.

E.g. this user has uploaded over 17000 very high quality images on Wiki Commons, but does not seem to be active on iNaturalist at all:

https://commons.wikimedia.org/wiki/User:SAplants

I know this is a ton of work, but I feel like it's a great way the projects could mutually assist one another.

Posted by rooiratel 7 months ago

I love your final thought, about adding common taxa from "overlooked parts of the globe." There are so many reasons that's important.

Posted by janetwright 7 months ago

@alexshepard is the BEST!!! :) Massive thanks for everything you continually do for the community here on iNat. :)

Posted by sambiology 7 months ago

DOES SIZE MATTER? I post loads of moths from Botswana which are 5 mm long and loads more which are 15 mm long. Most cannot be ID'd by human or machine, but isn't the first step in assigning them to a possible family, or eliminating other families, to consider size? Recording size doesn't seem to be very important in iNat! Isn't it useful for machine learning? Would documenting approximate size in a methodological way be of help, and what is the best way of recording size for different organisms? Should size matter more? If some people use inch scales and others use cm scales, how does machine learning cope with that?

Posted by botswanabugs 7 months ago
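On the mixed-units question above, the usual answer is to normalize before modeling: convert every reported size to a single unit so inch and centimeter annotations become comparable features. A hedged sketch (a hypothetical helper, not an iNaturalist feature):

```python
# Hypothetical sketch: normalizing user-reported sizes to millimeters
# so mixed inch/cm annotations become comparable model inputs.

UNIT_TO_MM = {"mm": 1.0, "cm": 10.0, "in": 25.4}

def to_mm(value, unit):
    """Convert a (value, unit) size report to millimeters."""
    try:
        return value * UNIT_TO_MM[unit.lower()]
    except KeyError:
        raise ValueError(f"unknown unit: {unit}")

print(to_mm(15, "mm"))   # 15.0
print(to_mm(0.5, "in"))  # 12.7
```

The harder problem is collection, not conversion: free-text size notes would need parsing and validation before any such normalization could run.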

Thanks for explaining the model to us.
This is a tool that not only helps with the quality of the images (and therefore the quality of the information they represent), but also motivates iNat members to keep posting their records, thereby increasing the number of new taxa, since it ends up being frustrating when a posted record (regardless of its quality or clarity) is never identified or classified. So it is pleasant to know that the model's learning process already involves 80,000 taxa; that is wonderful.

Posted by orlandomontes 7 months ago

Glad to see I helped get this one added: https://www.inaturalist.org/taxa/888053-Gyropsylla-ilecis
I thought you needed 100 RG obs to get put into CV? This one only has 55.
My next goal: https://www.inaturalist.org/taxa/1265981-Asphondylia-lacinariae

Posted by lappelbaum 7 months ago

Every day I am grateful for iNat's rapidly improving CV and the work of @alexshepard and the whole team. That said, I had to laugh at myself when I followed the link above to "Deep Convolutional Nets" which points to a Wikipedia article on "Convolutional neural network." I read the first few paragraphs and recognized that all the words were in English but came away not understanding a darned thing that was written there! Ha! The advanced nature of such computational research (and the explanations thereof) leave me feeling like quite a dunce! Maybe in a future lifetime, I'll change careers and learn a little more "engineer-speak". In the meantime, Thank You, Alex, for all you do!

Posted by gcwarbler 7 months ago

I'm not an expert in the field but I suspect you could use the donation of a quantum computer?

Posted by oldcoot 7 months ago

Someday, I hope to be as cool as @alexshepard.

Posted by bobby23 7 months ago

Out of 868 moth species in my yard, 116 are still not in the computer vision model. A few are added with each update (just 3 with this one). Enough pictures for most of them are already available on iNat, but we need more research grade observations. At this point, involving experts to ID more pictures of the taxa still waiting is much needed! Bioblitzes are good, but identification blitzes would be even better for improving the next models.

Posted by imbeaul 7 months ago

I second that. iNat has lots of observations and observers. However, in some regions we are desperately short of competent identifiers who know the rarer species, or even common species from remote areas. Any ideas about where to find specialists to participate in an ID blitz would be most welcome. We do have some brilliant specialists, but in most groups there are either no specialists, or they are overwhelmed, or there is no one to push their IDs to research grade.
As iNat continues to grow this problem will become more and more acute. The CV AI will soon become the only way out of this bottleneck, even though it does not know about the rarer or remoter species. Hopefully that will allow the specialists to focus on the rarer species for fine-tuning the CV AI.

Posted by tonyrebelo 7 months ago

Yay! Thank you Alex!

Posted by susanhewitt 7 months ago

@dianastuder "113 obs, but the target I usually expect is around 60. I wonder why this one took twice as many?" only 52 of them have research grade.

Posted by rudolphous 6 months ago
