New Computer Vision Model

We’ve released a new computer vision model for iNaturalist. This is our first model update since March 2020. Here’s what you need to know.

It’s a lot bigger

The number of taxa included in the model went from almost 25,000 to over 38,000. That’s an increase of 13,000 taxa compared to the last model, which, to put in perspective, is more than the total number of bird species worldwide. The number of training photos increased from 12 million to nearly 21 million.

We’ve sped up training time

This may sound suspicious given the long delay, but these delays were mainly caused by the pandemic, which stalled our existing plans (see future work below). We have significantly decreased model training time with the new approach we used here.

Our previous training jobs took 4.5 months to train 12 million photos. Our best estimate is that with that approach a new model, given the increased volume of data, would have taken around 7 months. Instead it took 2.5 months (from January to mid-March). We’re very excited about this new approach, getting us back on track to release two models a year moving forward.

Accuracy

Accuracy outside of North America has improved noticeably in this model. We suspect this is largely due to the nearly doubling of the data driving this model in addition to recent international growth in the iNaturalist community. We’re continuing to work on developing a better framework for evaluating changes in model accuracy, especially given tradeoffs among global and regional accuracy and accuracy for specific groups of taxa.

The recent changes removing non-nearby taxa from suggestions by default have helped reduce this global-regional accuracy tradeoff, but there’s still more work to do to improve how computer vision predictions are incorporating geographic information.

Taxon Page Indicator

We’ve also released a new feature for taxon pages on the website which allows you to see which taxa are included in the model. This badge only appears on species pages, not pages of genera, families, etc.

Future work

This new model is the one the iNaturalist website and apps use. Seek by iNaturalist requires an additional step to make a model compressed enough to run on the device. Stay tuned for an update to the Seek model soon.

The pandemic prevented iNaturalist staff from accessing our offices at the California Academy of Sciences for over a year. Unfortunately, we had ordered a new, more powerful machine for training these models just before that happened. The goal was to pilot a new software approach with this new hardware.

While we did lose a lot of time due to uncertainty, @alexshepard ended up developing and testing the new software approach on a jury-rigged machine in his living room, ultimately training the model we are releasing now. Piloting this new software approach worked well and we’re eager to get the next model training using it on the new machine, which we are beginning to set up at the California Academy of Sciences. This new hardware combined with the new software approach should help improve training time.

We’re eager to get the next model training in the next month. One challenge we know we will continually face is finding ways to train efficiently on the rapidly increasing iNaturalist dataset. Here are three ways you can help:

Share your Machine Learning knowledge: iNaturalist’s computer vision features wouldn’t be possible without learning from many colleagues in the machine learning community. If you have machine learning expertise, these are two great ways to help:

Participate in the annual iNaturalist challenges: Our collaborators Grant Van Horn and Oisin Mac Aodha continue to run machine learning challenges with iNaturalist data as part of the annual Computer Vision and Pattern Recognition conference. By participating you can help us all learn new techniques for improving these models.

Start building your own model with the iNaturalist data now: If you can’t wait for the next CVPR conference, thanks to the Amazon Open Data Program you can start downloading iNaturalist data to train your own models now. Please share with us what you’ve learned by contributing to iNaturalist on Github.

Donate to iNaturalist: For the rest of us, you can help by donating! Your donations help offset the substantial staff and infrastructure costs associated with training, evaluating, and deploying model updates. Thank you for your support!

Here’s an associated forum post with more technical details associated with the new software approach and hardware if you’re interested!

Posted on July 13, 2021 08:37 PM by

pleary

Comments

Great! Congratulations on this accomplishment. Thanks for all the work.

Posted by sedgequeen almost 3 years ago

Well Done: looking forward to playing with it ...

Posted by tonyrebelo almost 3 years ago

Wonderful! Thank you for the update. 😊

Posted by humanbyweight almost 3 years ago

Great improves. Congratulations! I will take a look on.

Posted by roberto_arreola almost 3 years ago

Wow, great1 Thanks everyone!

Posted by susanhewitt almost 3 years ago

This is really cool -- kudos to everyone on the team involved.

Any summary stats on how the species known by the Computer Vision Model break down across taxonomic groups? Something like, X % of birds with at least 1 iNat obs, Y % of angiosperms, etc. More broadly, is the list of 38k species available to access?

Posted by muir almost 3 years ago

AMAZING!!!!
Congratulations on this !!!!!

Posted by aztekium almost 3 years ago

Exciting, looking forward to testing it out!

Really nice to see you added the badges to show what's been included.
Curious as to why only species level taxa are included and not genera and families?

Posted by sbushes almost 3 years ago

Awesome! I'm very happy to see that many taxa that were previously not recognized now are. Thank you for the hard work!

Posted by someplant almost 3 years ago

Been waiting for this, thanks for the update

Posted by kemper almost 3 years ago

@sbushes Great question, the iNat team were just talking about that this morning. The challenge with showing the badge on higher level taxa is user interpretation. We make suggestions at the leaf level and also using what we call a common ancestor - rolling scores up the tree to find a higher level suggestion with a higher combined score. I know you've been actively investigating ML and our vision system for a while, but in case you haven't seen this video, Ken-ichi explains our process in his keynote at TDWG last year: https://www.youtube.com/watch?v=xfbabznYFV0 - the relevant content starts at about the 16 minute mark.

In that context, how to interpret a badge on a family? Would it mean that the family itself is in the model as a leaf node (ie none of its children are in the model)? Or perhaps would it mean that the family has children that are represented in the model? What if some but not all children are in the model? What if not all children are known for a family, would it fair to say that it is represented in the model, even if all known children are in the model?

Definitely open to suggestions, but gist is that we wanted a label for species that are in/out of the model that's easy for all users to interpret in the context of "why am I getting suggestions for this species?" or more commonly "why is this stupid vision system not suggesting the obviously correct choice of X. yz?" 😅

Posted by alexshepard almost 3 years ago

Congratulations and thank you!

Posted by conboy almost 3 years ago

Cool. Excited to see at least one species I worked hard on is now included in the model.

When is the estimated cut off date to get stuff included in the next model?

Posted by rymcdaniel almost 3 years ago

Huzzah!

Posted by eebee almost 3 years ago

Thank you soo much,now we can gain more knowledge about wildlife
Congratulations INat Team ,hats off to you.

Posted by karthikeyaeco almost 3 years ago

@rymcdaniel cool, which one? also, I don't have a cut off date planned, but I hope to start training sometime in August.

@alexshepard Gutierrezia texana. One of the DYC's in Texas. Commonly misidentified as Amphiachyris dracunculoides. Being in the model won't eliminate that, but hopefully it will help some.

As for the cutoff date, I have a lot of updates to IDs planned on a genus where I am seeing 10-20% error rate in RG observations. I was hoping to get the identification updates in before the next model starts, but don't know if I will get to it. Probably depends on when in August you get started.

And BTW, thanks for all the work of course. I am excited to what other unexpected things I might find in the new model.

@alexshepard -
Ahh ok, I see that could become a bit confusing.
For me I think just a bit of blurb explaining as you describe… that it may mean that the family has children that are represented in the model… or it may not…could suffice. Like the pending blurb encouraging people to fill in the blanks (a really nice addition I think), I think it could potentially encourage others to dig deeper. In larger complex groups like Chalcid wasps we are lucky to get to subfamily even, but it's still nice to be able to monitor and help chip away at the granularity.
I get there could be a trade off though with clarity for some users.

Tests so far - chuffed to see Sarcophaga carnaria no longer in the model! Also nice to see some common UK diptera I actively tried to increase observations for, now included and working. Great to have a sense of achievement with this stuff.

Interesting to see the keynote, thanks for sharing :)

Are there any plans to include subspecies soon?

Thanks so much for your hard work on the new model! Already last night I was testing it out on my older observations stuck at order and higher and pleased to see that the model had plausible suggestions for many of them.

Posted by matthias55 almost 3 years ago

Is it possible to download a list of taxa included in the model? API? Just to learn easily which species need additional records

Posted by apseregin almost 3 years ago

So wonderful -- excellent job iNat staff! :)

Posted by sambiology almost 3 years ago

I'm curious as to why the training is being done with on-premises hardware at all. I would have thought using a cloud service (AWS, Azure, etc) would be more effective and/or cost-efficient. Is there something specific that prevents this? @alexshepard @pleary

Posted by mtank almost 3 years ago

Thank you iNat team!

Posted by anudibranchmom almost 3 years ago

This is great! Can't wait to see where this goes next!

Posted by nathantaylor almost 3 years ago

@mtank using GPU-accelerated virtual machines hosted on cloud providers may be cost effective on a smaller-scale (if you don't need it to run for too long). It can save money on purchasing hardware and setting things up yourself. But if running for a long period of time, as our CV model train runs tend to be, it is cheaper for us to run our own hardware.

When running our own hardware we also retain full control over the setup, which can be a plus when we want to try a new GPU for example. Nvidia has been generous in donating GPUs to iNaturalist and the California Academy of Sciences, and the only way we can make use of them is in non-cloud servers where we control which GPU its using, and we can ensure the machine has other hardware necessary to take advantage of the GPU. We have tried training in the cloud before, but these are some of the reasons we decided to use our own hardware and we think it'll cheaper, easier, and more customizable for us going forward.

Posted by pleary almost 3 years ago

Great job, iNat team! Thank you!

Posted by navin_sasikumar almost 3 years ago

Good explanation, thanks @pleary :)

Great! I see several new species of moths included in this update. But we still need more moth enthusiasts on Inaturalist...
https://www.inaturalist.org/journal/imbeaul/archives/2021/02

Posted by imbeaul almost 3 years ago

How do you proceed for insects with different life stages? I suppose that you need to create a model for each life stage? Do you still use only the first picture of an observation if a given observation has several pictures? So many questions - but thanks for your amazing work!

@imbeaul There is only one model for the whole site, and it can automatically recognise different life stages/parts/angles of organisms, so long as it has data (images) to use as a reference. For example, if you use 1000 pictures of a sparrow to train the model, it will recognise a sparrow. If you instead use 1000 pictures of sparrow eggs, it will recognise sparrow eggs as sparrows, but won't recognise a sparrow as a sparrow. However, if your images are a mix of birds and eggs, it will understand that both should be identified as sparrows, without needing to be specifically instructed. Of course it doesn't (necessarily) know the concept of "egg" or "bird". Its' more like "Someone has shown me a 'thing' that is the same as another 'thing' that I understand to be 'sparrow'"

@imbeaul It has to be one model. Otherwise for plants we would need whole plants, flowers - males or females (and closeups of specific organs), fruit, branches, leaves, stems, thorns, buds, and galls, deformities, diseases, etc. From some identifications the model also uses habitat (or background) - for instance a pile of rocks will invoke Dassie or Klipspringer.

Congratulations on getting the new model deployed, and thanks for the informative description of the work, here!

Posted by tpollard almost 3 years ago

that's great, I played around with the new model a bit and it's working way better for some more obscure taxa like springtails now!

Posted by alexis_orion almost 3 years ago

What a great job!
Undoubtedly, Inat is a great tool that will allow (I hope) to reconnect us with nature and protect it from knowledge, since we love only what we know.
Thank you very much!

Qué gran trabajo!
Sin duda, Inat, es una gran herramienta que permitirá (he de esperar) re-vincularnos con la naturaleza y protegerla desde el conocimiento, ya que, se ama solo a lo que se conoce.
Muchas Gracias!

Posted by orlandomontes almost 3 years ago

Well done! Thank you for the excellent work, and for this update.

Posted by tsn almost 3 years ago

Bravo! The iNaturalist concept and tools are great and keep on getting better. I mostly submit insect observations, and I try to improve my macro techniques to provide the best photos I can. In particular I often take several photos of a live insect from different angles to better document the observation (recent example: https://www.inaturalist.org/observations/87187861). Preferably I use a light table, in the "Meet your neighbours" style promoted by Clay Bolt and others.
Is this type of observation, confirmed at the species/subspecies level, better for training the computer vision model ?

Posted by cback almost 3 years ago

Congratulations! It would be nice to have an computer vision icon next to the taxon name in the Taxonomy tree

Posted by gancw1 almost 3 years ago

Excellent news! Big thanks to the devs.

Posted by tkoffel almost 3 years ago

Thanks so much y’all!

Posted by lightning_whelk almost 3 years ago

I've made some counts. Currently, the model covers 36% of the Russian flora. Not so much, but... It covers 94.5% of observations! (and 99% in the most regions of European Russia). That's totally incredible! Thanks everyone for this great toy and joy!

@apseregin - how did you do the counts? Great figures.

@tonyrebelo, here is the post with details: https://www.inaturalist.org/projects/flora-of-russia/journal/54290-novaya-model-avtomaticheskogo-raspoznavaniya-vidov
For this, I've used utils by @kildor available at https://kildor.name/inat/

This is really nice. Be great to be able to easily figure out these stats for our geographies & taxa of interest.
There's been a really noticeable jump in autosuggested ID accuracy in UK diptera I think.
Would be curious to know the model %s covered before and after.

Great work! Excited to see how the AI handles observations from now on :-)

Posted by calebcam almost 3 years ago

Would love to see stats on how many species of various groups are included in the computer vision model. For example, what percentage of bird species? What percentage of insect species? etc.

Posted by zygy almost 3 years ago

@cback I'm sure high quality photos like yours always help in some way

Posted by thebeachcomber almost 3 years ago

Great work and can't wait to see the new model in action!

Posted by bookworm86 almost 3 years ago

Great Job..

Posted by manojkmohan almost 3 years ago

@cback Your photos are definitely helpful for humans. They may or may not be as helpful for the computer vision system, but don't let that stop you! As we mention in our FAQ (https://www.inaturalist.org/pages/help#cv-fail), most photos on iNaturalist were taken with a smartphone of an organism in situ. The visual features that our model will learn from pinned or macro insect photos may not be applicable to a photo taken of the same insect with a typical smartphone out in the wild. With that said, photos in iNaturalist are welcome and useful even if they aren't perfectly optimized to improve our computer vision system. For example, a close-up to show an interesting feature of a particular individual is completely relevant to iNaturalist even if it may not help train our computer vision system.

Researchers use iNaturalist data for many purposes, including computer vision. As long as your photos are licensed for research, they will be made available to researchers via our our open dataset program, for example. Who can say, perhaps someday a researcher will use your macro photos to learn something important about insects, via computer vision or another research technique!

Thank you! I imagine this took a lot of dedication and hard work to accomplish, and as an avid iNat user who learns a great deal from the computer vision suggestions, I am grateful!

Posted by ocean_beach_goth almost 3 years ago

Thanks for all the many hours of hard work you've put into this new model! I can't wait to play with it :)

Posted by weecorbie almost 3 years ago

Awesome! Thanks for the hard work.

Posted by kierandh almost 3 years ago

@alexshepard Thank you for taking the time to reply. You confirmed what I suspected, and I find it logical that the computer vision model addresses the needs of the larger group of users: suggesting IDs for live organisms photographed in their environment with basic equipment. It has to be fun, rewarding and simple. What I conclude from your comment is that I could best contribute by providing both in situ and technical photos of the same specimens, the former type providing good references for the community and the learning algorithm of the model, and the latter helping senior identifiers confirm the IDs, especially when there are look-alikes. This post rekindled the questions I had reading a post in another forum ( https://forum.inaturalist.org/t/secrets-to-good-macro-photography/24026/51 ). "GOOD" may be different for each type of stakeholder in the iNat community. We are living concerning but also exciting times, and I'm grateful that the iNat community allows me to contribute to solutions.

awesome

Posted by sushantmore almost 3 years ago

@alexshepard Are there ideas or ambitions to add a level of self-evaluation to the learning process of the algorithm? As in, when initial, CV-supported IDs for a certain species are subsequently corrected by other users in, say, more than 50 or 60% of the cases, that in the next round the CV will be more hesitant regarding that species and rather suggest a higher taxon level.

I am happy now to see Sarcophaga carnaria not longer included in the model, but this happened only because there are now not enough total observations on species level anymore. But for other flies, like Condylostylus patibulatus or Drosophila melanogaster, in presumably more than 9 out of 10 observations the CV suggestion was wrong, yet there are so many (correct) IDs on species level having RG, that these taxa will stay in the model.

Thus, a level of self-criticism for the algorithm would improve its performance in my opinion (as a matter of fact, that would apply for many human beings as well...😆)

Posted by carnifex almost 3 years ago

I like that idea. It should be relatively easy and not a burden to record any cases of the AI being used, its top few options, and the taxon chosen. Slightly more intensive would be to monitor subsequent community ID.

How one incorporates this information into the AI training though is quite another can of worms.

I find the AI scarily accurate in many cases. And I find its mistakes instructive as well: it provides insights into pitfalls and problems with identification from photographs made by humans. It is a pity that we cannot evaluate the AI in terms of human development: I would rank it well into postgraduate level, approaching novice specialist in many groups (but then I dont work in the really difficult groups like beetles, flies and moths - or impossible ones like earthworms and lichens) .

Posted by tonyrebelo over 2 years ago