A new Computer Vision Model (v2.4) including 1,994 new taxa

We released a new computer vision model today. It has 76,129 taxa, up from 74,135. This new model (v2.4) was trained on data exported last month, on May 21st, and adds 1,994 new taxa.

Taxa differences to previous model

The charts below summarize these new taxa using the same groupings we described in past release posts.

By category, most of these new taxa were insects and plants

Here are species-level examples of new taxa added for each category:

Click on the links to open these samples as species lists on the Explore page. Remember, to see if a particular species is included in the currently live computer vision model, you can look at the “About” section of its taxon page.

We couldn't do it without you

Thank you to everyone in the iNaturalist community who makes this work possible! Sometimes the computer vision suggestions feel like magic, but it’s truly not possible without people. None of this would work without the millions of people who have shared their observations and the knowledgeable experts who have added identifications.

In addition to adding observations and identifications, here are other ways you can help:

  • Share your Machine Learning knowledge: iNaturalist’s computer vision features wouldn’t be possible without learning from many colleagues in the machine learning community. If you have machine learning expertise, these are two great ways to help:
      • Participate in the annual iNaturalist challenges: Our collaborators Grant Van Horn and Oisin Mac Aodha continue to run machine learning challenges with iNaturalist data as part of the annual Computer Vision and Pattern Recognition (CVPR) conference. By participating you can help us all learn new techniques for improving these models.
      • Start building your own model with the iNaturalist data now: If you can’t wait for the next CVPR conference, thanks to the Amazon Open Data Program you can start downloading iNaturalist data to train your own models now. Please share with us what you’ve learned by contributing to iNaturalist on GitHub.
  • Donate to iNaturalist: For the rest of us, you can help by donating! Your donations help offset the substantial staff and infrastructure costs associated with training, evaluating, and deploying model updates. Thank you for your support!
Posted on June 22, 2023 09:08 PM by loarie

Comments

The inclusion of Didymium squamulosum in this version of the CV model seems concerning given the extremely similar (but quite rare) Didymium karstensii

Posted by ethansaso 10 months ago

@ethansaso Can you share more about the concerns you have about its inclusion? Do you think there will be a lot of false positive/negative observations?

Posted by cofa 10 months ago

A lot of false positives. It's a problem with several myxomycete species included in iNat's CV, such as Hemitrichia decipiens, which the model seems to recognize based on its "salmon egg"-like appearance while immature despite several other species sharing this feature (H. clavata, H. calyculata, Trichia crateriformis, etc.). Many photos (some of which I'm assuming were used to train the model) also don't show enough detail for conclusive identification

Posted by ethansaso 10 months ago

@ethansaso Thanks for the context, this is helpful for understanding.

H. clavata, H. calyculata, Trichia crateriformis

Do these taxa not have enough observations to be included in the CV model? And if so, would this be less of a concern if they were in the CV?

Posted by cofa 10 months ago

In most cases, even a very well-studied human can't differentiate between these species given just a photo of the immature sporocarps, let alone iNat's CV. Without high-magnification diagnostic photos of mature sporocarps and/or spore/capillitium micrographs, the most accurate identification for these "salmon egg" species would just be the entire order Trichiales, given that Hemitrichia and Trichia are in different families.

In general, it is difficult to narrow down myxomycetes to a single species based on macroscopic features alone.

Posted by ethansaso 10 months ago

Yay! Penstemon guadalupensis...a plant that I've been studying has been added!
@bosqueaaron
Next spring we'll see how accurate it is.

Posted by pfau_tarleton 10 months ago

Another update right on time. Many thanks. 2513 is a lot of new taxa. I would expect the number of taxa added to go down with each release, but the opposite is true, so apparently the model is still growing exponentially.

@ethansaso If you don't agree then please vote for the order with option don't agree then it will be removed from the model.

Maybe iNaturalist should keep track of species that cannot be determined based on photos alone. This could help to sort out wrong or dubious identifications more quickly.

Posted by rudolphous 10 months ago

I think it would require a lot more staff to keep track of the potentially thousands if not millions of species that cannot be determined by photo alone (and who will make that determination?).

Posted by pfau_tarleton 10 months ago

Maybe there could be some sort of process by which a taxon is flagged as being unable to ID based on typical macroscopic photos. Curators would review the flag and approve (or deny) it. If the curators approved it, then the result would be that specific taxon would not be suggestible by the next version of the CV. Obviously, folks could still add that ID manually if they chose (if they had genetic data, microscope slide photos, or other strong evidence).
I think it gets tricky because of iNat's worldwide scope. In some regions, the CV suggestion may be entirely appropriate, but in other regions it would be wrong. For example, I am working on fixing about 1800 observations of Morus nigra in the United States, of which probably 98% are incorrect IDs. More are added every day. So this would be a case study for flagging the taxon and having the CV not suggest Morus nigra, at least in the western hemisphere. (I would imagine that the CV would still suggest Morus, and list Morus alba and Morus rubra among the possibilities.) I think that would greatly reduce at least the new additions of incorrectly identified observations. But it may be appropriate for the CV to suggest it for its native range (or at least that is conceivable; I don't really know the species well enough to say).
I feel like this is pretty well-worn territory though and I'm sure people have suggested similar ideas.

Posted by matthias55 10 months ago

If you don't agree then please vote for the order with option don't agree then it will be removed from the model.

@rudolphous I'm not sure what this means, could you explain?

Posted by ethansaso 10 months ago

@ethansaso I believe they meant you should disagree with the species-level IDs on each observation, to knock them out of RG (Research Grade) status and back to a higher level of classification. If enough are removed, the species should fall out of the CV model next time. The problem is that if there are enough observations where the diagnostic criteria ARE present, it may still qualify anyway.

Posted by imwolfe 10 months ago

@matthias55 I did indeed mean: go to the page of the observation, disagree at species level, and move it to a higher taxon by proposing that alternative ID, exactly as @imwolfe explained.

@imwolfe Good point. If enough observations qualify, the species will indeed be in the CV model anyway.

Posted by rudolphous 10 months ago

It is great to see that the black-footed ferret (Mustela nigripes) is on the list. I'm a little surprised that they weren't on earlier but I guess they are so rare that it is hard to get good photos. I still love it. Every upgrade to the CV model is heartening.

Posted by kevintoo 10 months ago

Rather than having people decide which species can't be identified via image recognition, maybe report the 0.00 - 1.00 score with the result to give the user an idea of the level of model confidence? I have found this useful for identifying weak spots during model development, and I think it would also be useful for users. The downside is that the model might give a high score when the image is of a very uncommon species with little data and there is a similar species with lots of data. The training data imbalance might mess things up. But I would still like to see the scores.
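As a rough illustration of the score idea above, here is a minimal Python sketch. It assumes the model produces one raw score (logit) per taxon and that a softmax turns those into 0.00 - 1.00 values; that is an assumption for illustration, not a description of how iNaturalist's CV actually works. The function names and the logit values are hypothetical, and the taxon names are borrowed from earlier in this thread.

```python
import math


def softmax(logits):
    """Convert raw per-taxon scores into values between 0 and 1 that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_suggestions(logits, labels, k=3):
    """Return the k highest-scoring (label, score) pairs, best first."""
    scores = softmax(logits)
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return [(label, round(score, 2)) for label, score in ranked[:k]]


# Hypothetical logits for three look-alike taxa (made-up numbers).
labels = ["Hemitrichia decipiens", "Hemitrichia clavata", "Trichia crateriformis"]
logits = [2.0, 1.0, 0.5]

suggestions = top_suggestions(logits, labels)
for label, score in suggestions:
    print(f"{score:.2f}  {label}")
```

Showing the runner-up scores next to the top suggestion would let a user see when the model is splitting its confidence between similar species. It would not fix the imbalance problem mentioned above, since a taxon with far more training photos can still receive a high score on an image of its rarer look-alike.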

Posted by erniem 10 months ago

@ethansaso That looks great! Now I just have to figure out how to install it ;)

Update: The extension only works with Google Chrome (I think?). I was using Firefox, but I switched to Chrome and it is working for me. It's a great addition. Thanks for showing it to me.

Posted by erniem 10 months ago
