Update on Taxon Frameworks

iNaturalist works best as a tool for helping people collaborate around species identification when there is clarity about the taxonomy everyone is referring to. This also makes it easier for iNaturalist curators to maintain the iNaturalist taxonomy when it's clear what direction they should be curating in. Five years ago we introduced Taxon Frameworks as a tool to help provide this clarity by explicitly referencing taxa on iNaturalist to external taxonomic references.

Since then, the number of Taxon Frameworks with external references has increased. The World Register of Marine Species is now being used for nearly all Animal Phyla outside of Arthropods and Chordates. All Chordate taxa are linked to references such as the Reptile Database and Catalog of Fishes. Plants of the World Online is the reference for Vascular Plants. We still have no taxonomic references for Kingdoms such as Fungi and Chromista.

For Arthropoda, the vast majority of groups lack references. Most of the few arthropod groups that did have references, such as Mantids and Phasmids, were referencing the Species File Group. This month, the Species File Group migrated to a new system known as TaxonWorks. As a result, we were able to point these taxon frameworks to their new home and add a few more branches on TaxonWorks, such as Harvestmen, Grasshoppers, and True Hoppers.

The pie chart below shows these taxonomic groups by percent of observations. 68% of these observations (colored areas) now have taxonomic references. This is thanks in part to complete coverage of Vascular Plants (green - 39%) and Chordates (blue - 20%). Among Arthropods, the other large group (24% of observations), only 23% have references. The three large insect orders Lepidoptera, Coleoptera, and Hymenoptera account for most of these Arthropod gaps. At 6%, Fungi is the other large gap in taxonomic coverage.
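As a rough check on how these percentages fit together, here is a small sketch of the arithmetic. It assumes that the 23% figure is the share of Arthropod observations with references; the variable and function names are made up for illustration, and the split of the remainder among other groups is not stated in this post.

```python
# Shares of all observations, in percent, as quoted in the paragraph above.
VASCULAR_PLANTS = 39.0   # completely covered by a reference
CHORDATES = 20.0         # completely covered by references
ARTHROPODS = 24.0        # only partly covered
ARTHROPOD_COVERED_FRACTION = 0.23  # assumed: fraction of Arthropod observations with references
TOTAL_COVERED = 68.0     # overall share with taxonomic references


def covered_by_three_large_groups() -> float:
    """Share of all observations covered via plants, chordates, and arthropods."""
    arthropods_covered = ARTHROPODS * ARTHROPOD_COVERED_FRACTION
    return VASCULAR_PLANTS + CHORDATES + arthropods_covered


def covered_by_other_groups() -> float:
    """Remainder of the 68% that must come from groups outside the three above."""
    return TOTAL_COVERED - covered_by_three_large_groups()


if __name__ == "__main__":
    print(f"Three large groups: {covered_by_three_large_groups():.2f}%")
    print(f"Other groups:       {covered_by_other_groups():.2f}%")
```

Under that assumption, the three large groups account for about 64.5 of the 68 percentage points, leaving roughly 3.5 points for everything else.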

We’re very excited that TaxonWorks is providing a platform that makes it easier for taxonomic providers like the Species File Group to maintain and share their taxonomies. We hope these tools will facilitate progress filling gaps like Lepidoptera and Fungi in the coming years. iNaturalist derives huge benefit from access to well-maintained global taxonomies. Our sincere thanks go to those maintaining the taxonomies the Taxon Frameworks depend on, such as:

Chordates

Arthropods

Other Animals

Vascular Plants

*Managed in an instance of TaxonWorks

Posted on October 12, 2023 10:38 PM by loarie

Comments

Great news. It would also be nice if iNaturalist updates the Catalogue of Life source used for the Taxa search bar. If I'm not mistaken, the note above the bar mentions that the source used currently is still the Catalogue of Life 2012 Annual Checklist.

Posted by po-po-pro 7 months ago

wonderful! hopefully, servers willing, we can soon add superorders to birds, infraorders to passerines, and suborders to carnivora!

Posted by marcelllo 7 months ago

The Bryophyte Nomenclator (https://www.bryonames.org/) is the only comprehensive source of taxonomic data for Anthocerotophyta, Marchantiophyta and Bryophyta available online. Currently, it is fully integrated into the Catalogue of Life (CoL) as the basic reference for mosses. The list of the datasets used for CoL building is available here: https://www.catalogueoflife.org/data/source-datasets
Some of them can be selected as our taxonomic backbones here (or just CoL for those groups which are not explicitly covered by other sources on iNat).

Posted by apseregin 7 months ago

I cannot express enough how helpful iNaturalist is for sorting out the taxonomic structure of species. It has helped me learn about the species at the sites I survey because I can see where they fit.

Posted by bellskimmer 7 months ago

Much gratitude to the people who do the work to make it possible.

Posted by bellskimmer 7 months ago

This must have been a herculean task. It is deeply appreciated. Outlining what percentage of identifications/observations specifically benefit from the Frameworks is also really helpful.

This is only somewhat related, but if one of these sources in the Taxon Frameworks were to carry out a taxonomic revision that would affect groups or specific species with lots of identifications/observations (>1000), would it still be destructive (or at least cause a lot of slowdown) for curators to integrate these changes on iNaturalist?

Posted by bobby23 7 months ago

oh wow amazing work, sadly fungi is lacking tho

Posted by intelec 7 months ago

Even though we lack a taxon framework for fungi, the iNaturalist curators have been doing an amazing job keeping fungus taxonomy up to date. I've heard from several people that the most up-to-date fungus taxonomy on the internet is the one on iNat.

Posted by zygy 7 months ago

yes, you're right

Posted by intelec 7 months ago

I don't really understand why iNaturalist is doing all this work to go its own path when Catalogue of Life[1] is also doing this and references the same (and many more) databases. Why can't we just "do what COL does"? They have poured soooo much meticulous effort into aggregating different datasets and they currently stand at 163 datasets[2]. And they even have an API available.

[1] https://www.catalogueoflife.org/

[2] https://www.catalogueoflife.org/data/source-datasets

Posted by common_snowball 7 months ago

@common_snowball - As someone who has extensively used the COL API for taxonomic data, I can tell you why... COL's goal is completeness, not accuracy. Those 163 datasets that they use span a broad gamut of quality. Some of them are actively curated by academic institutions and are very high quality (e.g. World Spider Catalog), while others are not kept up to date (e.g. Species Fungorum), and may even be published by individual people (e.g. LWS fleas). iNaturalist uses a similar approach, but just has a much higher standard for dataset inclusion. Of course this might mean that some areas of iNaturalist taxonomy are not as complete or accurate, but from my anecdotal experience, iNaturalist seems to usually have better quality taxonomy, at least when it comes to groups that are commonly observed by the public (rather than things like bacteria and viruses). There's also the issue that the COL API is still in beta testing and doesn't seem to be able to support high loads. Hopefully one day COL will be a viable option, but I just don't think it's there yet.

Posted by zygy 6 months ago

@zygy That makes sense! I appreciate the insight and explanation.

What drives such a high bar for accuracy in iNaturalist? Is there something irreversible that happens with taxonomic changes? Like is there any harm in using COL for completeness and then slowly building out and modifying things as you find higher quality datasets?

Also, if we're maximizing accuracy, I feel a dataset like OpenTreeOfLife is a great fit. It's the world's largest phylogenetic tree. It integrates not just datasets but gets granular enough to work with individual studies. The API is decently robust too. I suppose the main challenge is mapping taxonomic categories onto phylogenetic samples. They don't support taxa that are proven not to be monophyletic, so many of the genera, families, etc. used by both iNat and COL won't always have direct mappings.

https://tree.opentreeoflife.org/opentree/argus/opentree14.9@ott93302

Posted by common_snowball 6 months ago

@common_snowball Perhaps the problem is that COL depends on too few people / data sources to keep up with such an enormous task. I don't think it is possible to maintain a global taxonomy without crowdsourcing the work to local experts. iNaturalist has an army of users suggesting taxonomic changes every day, whereas it can take COL many years. I would love to use the Catalogue of Life API or GBIF taxonomic backbone for Lepidoptera, but they are very out of date (2012) and full of mis-mapped synonyms, even for common butterfly species. This has caused us an immense amount of hair pulling when trying to use that index for any analysis or data aggregation. I don't think a "complete" taxonomy is helpful if it is full of errors, especially when you actually want to use it for something.

Posted by mihow 6 months ago

@mihow Errors in taxonomy imo are on a sliding scale. E.g. you might get the wrong genus but there's less of a chance to get the wrong family, an even smaller chance to get the wrong order, etc. The point is that even an incorrect taxonomy can be somewhat useful. I mean just look at some published statistical work in ecology!

Posted by common_snowball 6 months ago

@mihow I am fascinated to have seen your comments here about Lepidoptera. I always get pointed to GBIF when I want to look at global butterfly datasets, but as you say it's ten years out of date. The recent advances in nuclear and mitochondrial DNA phylogenies are causing havoc in the taxonomies of Papilionoidea; so much that it's difficult to keep up.

What makes it most difficult is that most of the players are 'amateurs' like me, or academics who work in silos and are sometimes unaware of the 'big picture'.

I was sent this link by David Roy of the UK Centre for Ecology & Hydrology, who manages the UK Butterfly Monitoring Scheme and its European partners. We've just completed the herculean task of harmonising butterfly (Papilionoidea) taxonomy and nomenclature for Africa between LepSoc Africa's dataset and iNaturalist's, and now have an 'African Butterfly Monitoring Scheme' in its infancy, using the same dataset. As we finished, the man who had done most of the curation work on that dataset, Prof Mark Williams (a veterinarian academic who was also a hugely competent amateur lepidopterist), died. Replacing him is going to prove very difficult but the Society's Council has a plan to do something. Probably, at least in the short term, what's in the published literature and what's in the dataset will lose a degree of harmonisation.

It's even worse for humble field guide authors like me, who must attempt to keep Joe Public up to speed on these fascinating creatures!

And that's just Africa. We've found the biggest taxonomic conflicts occur for specialist taxa that occupy space in other regions, like Europe and south-east Asia.

Posted by stevewoodhall 6 months ago

We got some requests to add AntCat as a taxon framework reference for ants and added that yesterday (and updated the second figure in the blog post above). FYI @aaron567 @arman_ @peterslingsby @mettcollsuss - let me know if you see anything weird. Here are the species that are active in iNat but not in AntCat in case any curators want to take a look.

Posted by loarie 4 months ago
