20. EOL took 291,000 common names in 279 languages from WikiData

Google started with FreeBase

The species go from Wikidata to the Encyclopedia of Life, and from EOL to iNaturalist.

Language Support

Language support in EOL v3 is in continuous development, but many features are internationalized. Here's where things stand at the moment:

The interface: navigating EOL in different languages:

Thanks to our collaborators at translatewiki and their corps of volunteer translators, the full EOL basic interface navigation is available in Arabic, Brazilian Portuguese, English, Finnish, French, Macedonian, Piedmontese, Traditional Chinese and Turkish. Read more about becoming a volunteer translator.

Common or vernacular names for taxa:

We have harvested the common names holdings of Wikidata, which include just over 291,000 names in 279 languages. We also have more than 93,000 common names in 130 languages, added by valiant EOL members to fill gaps they observed over the past ten years. You can search EOL by any of these names and find them in the names tab of any taxon page.

Articles:
We have articles in many languages. The article tab has a language filter, which is set to English by default. We hope soon to make that default setting configurable in your EOL profile.

Structured data:

Our valiant translatewiki community

One of the great advantages of structured attribute and interaction data is that it is very efficient to translate. Commonly used data terms like "body length" and "predator" in our taxon page summaries are translated by the translatewiki community, and many place names have translations available from our geographic terms providers, geonames and Wikidata.

https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples
https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples#The_Netherlands
https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples#Gender_distribution_in_the_candidates_for_the_Dutch_general_election_2017
From Freebase to Wikidata: The Great Migration
https://static.googleusercontent.com/media/research.google.com/nl//pubs/archive/44818.pdf
https://www.eol.org/docs/what-is-eol/language-support
https://upload.wikimedia.org/wikipedia/commons/4/4a/Biodiversity_Next_conference_poster_on_Wikimedia_and_iNaturalist.pdf

https://www.wikidata.org/wiki/Wikidata:WikiProject_Freebase
https://www.wikidata.org/wiki/Wikidata:WikiProject_Biodiversity

iNaturalist (Q16958215) is a citizen science project focused on biodiversity. It has a large community of enthusiasts, some of whom are also active in the various Wikimedia communities. This wikiproject aims at improving the cross-pollination between the iNaturalist and Wikimedia communities. Wikimedia Commons is an ideal platform to source iNaturalist with observations, while iNaturalist, with its high-grade annotations of observations, provides valuable references for Wikidata statements. "Research grade" observations are incorporated into other online databases such as the Global Biodiversity Information Facility (Q1531570). iNaturalist supports many Wikimedia-compatible licensing options, including CC0 (Q6938433), Creative Commons Attribution (Q6905323) and Creative Commons Attribution-ShareAlike (Q6905942). Snippets from Wikipedia are also used on iNaturalist to describe individual taxa.

Matching iNat taxa to Wikidata items would not rely on strings, as suggested earlier in this thread, but on Wikidata’s and iNat’s IDs. Every Wikidata item has a free-form label in multiple languages. While the text label is subject to change, the numeric ID of the item (what Wikidata calls a “QID”) is persistent. E.g. item Q630829 represents Larus occidentalis and it has labels in 47 different languages. Each Wikidata item also contains links to the corresponding Wikipedia articles, when they exist. There are currently Wikipedia articles about Larus occidentalis in 28 language editions, from English to Navajo or Hungarian. Finally, each Wikidata item also contains links to many external databases in the Identifiers section; for example, the item about Larus occidentalis links to 21 other databases, from NCBI to eBird.

Now, one of these identifiers is the iNaturalist Taxon ID. We created this property a while ago for the purpose of reconciling iNat taxa and Wikidata items and, as far as I can tell, it has been extensively populated in Wikidata (it’s currently used by over half a million Wikidata items). The first batch of iNat IDs mapped to Wikidata via Mix’n’match got imported by a script (since the property didn’t exist at that time), but future edits made in Mix’n’match should go live on Wikidata immediately. I don’t know how often Magnus Manske refreshes the Mix’n’match catalog with new data dumps from iNat but we can ask him :)
So why does this all matter? Having a mapping from iNat ID to Wikidata QID means that it’s straightforward to automatically retrieve all the associated Wikipedia articles for a given iNat taxon, irrespective of spelling or variation in the title. How to best ingest these links depends on what works best for iNaturalist, but this is by far the best mechanism to have correct matches from iNat to Wikipedia instead of relying on heuristics that match labels.
Finally, I’d like to make a case for adding a Wikidata link directly in the “More info” list. There’s a lot of information about taxa in Wikidata that would be valuable to iNat users, if the link were displayed alongside GBIF, BOLD, Google Scholar and others.
https://forum.inaturalist.org/t/use-wikidata-to-link-to-appropriate-wikipedia-articles-in-all-languages/5538/16
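
The ID-based lookup described above can be sketched roughly as follows. This is only an illustration, not how iNaturalist or Wikidata actually implement it: the function names are made up for this example, the query assumes the Wikidata Query Service conventions (sitelinks exposed through schema:about, schema:inLanguage and wikibase:wikiGroup), and SAMPLE_RESULT is a hand-written stand-in for a real response, with a placeholder Hungarian URL.

```python
# P3151 is the iNaturalist taxon ID property used in the SPARQL comment on this post.
INAT_TAXON_ID_PROPERTY = "P3151"


def build_sitelink_query(inat_taxon_id: str) -> str:
    """Return a SPARQL query listing Wikipedia articles for one iNat taxon ID."""
    if not inat_taxon_id.isdigit():
        raise ValueError("expected a numeric iNaturalist taxon ID")
    return (
        "SELECT ?qid ?article ?lang WHERE {\n"
        f'  ?qid wdt:{INAT_TAXON_ID_PROPERTY} "{inat_taxon_id}" .\n'
        "  ?article schema:about ?qid ;\n"
        "           schema:inLanguage ?lang ;\n"
        '           schema:isPartOf/wikibase:wikiGroup "wikipedia" .\n'
        "}"
    )


def articles_by_language(sparql_json: dict) -> dict:
    """Map language code to article URL from a SPARQL JSON result."""
    return {
        row["lang"]["value"]: row["article"]["value"]
        for row in sparql_json["results"]["bindings"]
    }


# Hand-written sample in the SPARQL JSON results layout (NOT real query output).
SAMPLE_RESULT = {
    "results": {
        "bindings": [
            {
                "qid": {"value": "http://www.wikidata.org/entity/Q630829"},
                "article": {"value": "https://en.wikipedia.org/wiki/Western_gull"},
                "lang": {"value": "en"},
            },
            {
                "qid": {"value": "http://www.wikidata.org/entity/Q630829"},
                "article": {"value": "https://hu.wikipedia.org/wiki/PLACEHOLDER"},
                "lang": {"value": "hu"},
            },
        ]
    }
}

if __name__ == "__main__":
    print(build_sitelink_query("342395"))
    print(articles_by_language(SAMPLE_RESULT))
```

Because the match runs on the taxon ID rather than on a name string, a renamed or differently spelled article title does not affect the result.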

Posted by ahospers, October 24, 2020 09:00

Comments

Someone there will surely bring up Wikibooks. Wikibooks is a sister project of Wikipedia for non-encyclopedic content. I do see the point of treating identification keys as a handbook rather than as an encyclopedia, just like recipes for all kinds of dishes, a "how do I write HTML" guide, or lists of the prepositions that take the dative case in German.

The advantage of Wikibooks is that you have plenty of room to go into detail. You can use the images in the same way. And of course you can link from the Wikipedia page to the information on Wikibooks.

Posted by ahospers 4 months ago

I have finally started reviewing and expanding all the dragonfly articles on Wikipedia. The hardest part is adding small identification keys; I now have three. For the families of the suborders:
http://nl.wikipedia.org/wiki/Juffers
http://nl.wikipedia.org/wiki/Echte_libellen
And already one for the species of a family. This one was no small task, and I will take a fresh look at it again soon:
http://nl.wikipedia.org/wiki/Pantserjuffers
I am mainly curious whether beginners can work with these keys. There are no explanatory illustrations (yet)...

Posted by ahospers 4 months ago

At the moment I cannot in good conscience choose a licence. My images are freer, freer even than "CC-BY": in principle "Public Domain", but that cannot be made legally watertight (see also here), so as far as I am concerned a CC0 declaration would apply. I find "Attribution" clearly too restrictive, because it makes it difficult, complicated, and awkward to use my photos in arbitrary collages, side-by-side comparisons, and the like. Attributions also often spoil the layout of attractive informative publications. So if attribution is inconvenient for someone, I explicitly want to leave them free to use my photos anyway.
With regard to Wikipedia, the text is somewhat confusing in combination with the options currently offered, since the option Attribution-NonCommercial-ShareAlike (CC-BY-NC-SA) explicitly may not be used on Wikipedia. As it stands now, you might think otherwise.
To encourage reuse, including on Wikipedia for example, or at least to enable people to choose a licence here (on wrn) that fits their convictions and can also be used elsewhere (Wikipedia), it seems useful to me to at least also offer the default licence for Wikimedia Commons as an option: Attribution-ShareAlike (CC-BY-SA)
While you are at it, why not add an even more restrictive one right away? In my experience, many "real" photographers particularly dislike it when their photos are "tinkered with" and their name then stays attached (because of the attribution!). I think you can win over many such photographic artists by adding this one: Attribution-NonCommercial-NoDerivatives (CC-BY-NC-ND)
Or, when it comes down to it, simply offer the whole set of standard licences (including PD or CC0!) with a link to this page on CC?

Posted by ahospers 4 months ago

But regardless, I'm really interested in how the limits of a phone can be pushed, with loupes, clip-on microscopes, etc., so this doesn't really apply in my case. I'm super interested in how iNaturalist opens up recording to the general public, what the boundaries are for recording invertebrates, and how far they can be pushed. I'm amazed at what detail you can get just with a cheap 10x loupe and an old iPhone 6, for example: https://www.inaturalist.org/observations/63016192

https://www.bol.com/nl/p/60x-loupe-smartphone-tablet-zonder-bevestigingsmateriaal-zwart-gebruik-uw-smartphone-als-microscoop-valsgelddetector/9200000118167598/

For non-smartphone camera discussion, please check out https://forum.inaturalist.org/t/good-cameras-for-nature-shots/1064

For a discussion about smartphone macro lenses, please see: https://forum.inaturalist.org/t/macro-lenses-for-smartphone-cameras/9112

I was definitely the one who pushed for it to be implemented in version 3 of the iOS app. I can’t tell you how many emails we’ve received over the years from people who were utterly confused by the previous workflow. So many people had no idea they could import existing photos into iNaturalist; all they saw was the camera that popped up when they tapped on Observe. So I pushed for the iOS app to use the same workflow that the Android app has had for a few years.

Posted by ahospers 4 months ago

https://forum.inaturalist.org/t/inaturalist-and-wikipedia/2680/6
https://forum.inaturalist.org/t/inaturalist-and-wikipedia/2680

Every taxon page on iNaturalist—pages about species, genera, families, etc—has an "About" tab that automatically pulls content from Wikipedia, if there is a page for that taxon.

[Image: iNaturalist About tab for Hypericum kalmianum]

Ways to help improve iNaturalist taxon pages through Wikipedia

  • Edit, expand, and improve Wikipedia pages about taxa. Anyone can edit Wikipedia! Even you. ;)
    • If you don't feel comfortable yet editing Wikipedia articles directly, you can point out issues on their associated Talk pages, or at the relevant WikiProject Talk page.
  • Add useful identification tips (citations are required)
  • Create Wikipedia pages that don't exist yet! For example, all the red links here are flora found in the Chicago region that don't have a Wikipedia page yet.
  • Add your or others' photos to existing Wikipedia pages – you can search iNaturalist photos by the type of photo license. A helpful website has been created to assist in this task. Only CC BY, CC BY-SA, and public domain licenses are available for use on Wikipedia, but see example text to politely request if users are willing to change their photo license.
  • Do the same for Wikipedia in languages other than English.
  • Bring in vernacular/common names from Wikipedia (if they are sourced) and add them to iNaturalist
  • ?
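
The license rule in the list above (only CC BY, CC BY-SA, and public domain material can be used on Wikipedia) could be checked with something like the sketch below. The lowercase hyphenated codes such as "cc-by-nc" are an assumption about how a photo's license might be recorded, and the function name is made up for this example. CC0 is included because the WikiProject description quoted in this post lists it as Wikimedia-compatible.

```python
from typing import Optional

# Assumed license codes (lowercase, hyphenated). "pd" stands for public domain.
WIKIPEDIA_COMPATIBLE = {"cc0", "pd", "cc-by", "cc-by-sa"}


def usable_on_wikipedia(license_code: Optional[str]) -> bool:
    """Return True when a photo license allows reuse on Wikipedia."""
    if license_code is None:
        # No license recorded: treat as all rights reserved.
        return False
    return license_code.strip().lower() in WIKIPEDIA_COMPATIBLE


if __name__ == "__main__":
    for code in ("cc-by", "CC-BY-SA", "cc-by-nc", "cc-by-nc-sa", None):
        print(code, usable_on_wikipedia(code))
```

Any NonCommercial (NC) or NoDerivatives (ND) variant fails the check, which is why the list suggests politely asking photographers whether they are willing to change their license.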

Relevant feature requests

  • Use Wikidata to link to appropriate Wikipedia articles in all languages
  • Use Wikidata for place names and Wikipedia descriptions

Basic components of a taxon/species page on Wikipedia

See more detailed example template for a taxon article here.

  • Taxobox (Automatic taxobox, Speciesbox, Infraspeciesbox) – Wikipedia's way of sorting species within their taxonomic hierarchy
  • Intro paragraph
  • Physical description
  • Taxonomy
  • Distribution and habitat
  • Ecology
  • Conservation
  • Human uses/Culture
  • References
  • Taxonbar (super useful, pulls information from Wikidata)
  • Categories

Where to ask questions / learn more

  • The Teahouse on Wikipedia
  • Talk pages of WikiProjects, e.g. WikiProject Plants and other descendants of WikiProject Tree of Life
  • WikiProject iNaturalist on Wikidata

(And hey, how meta, a wiki topic about Wikipedia. Feel free to edit!)

Posted by ahospers 4 months ago

A SPARQL query will do it, e.g.

SELECT * WHERE { ?qid wdt:P3151 "342395" . OPTIONAL { ?qid wdt:P225 ?wikidataTaxonName. } }

Where 342395 is the iNaturalist Taxon ID. See: query / query results

If you URL-encode the SPARQL query and append it to the URL below, you will get a SPARQL result with the iNat ID (if it exists):
https://query.wikidata.org/bigdata/namespace/wdq/sparql?query=
Note that the iNat developers will likely want JSON, in which case they’ll want to use content negotiation.
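
A rough sketch of that step using only the Python standard library: the query is URL-encoded onto the endpoint, and JSON is requested through the Accept header (content negotiation). The helper name and the User-Agent string are made up for this example, and the request is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

SPARQL_ENDPOINT = "https://query.wikidata.org/bigdata/namespace/wdq/sparql"

QUERY = (
    'SELECT * WHERE { ?qid wdt:P3151 "342395" . '
    "OPTIONAL { ?qid wdt:P225 ?wikidataTaxonName. } }"
)


def build_sparql_request(query: str) -> Request:
    """Build (but do not send) a GET request that asks for JSON results."""
    url = f"{SPARQL_ENDPOINT}?{urlencode({'query': query})}"
    headers = {
        # Content negotiation: ask for the SPARQL JSON results format.
        "Accept": "application/sparql-results+json",
        # Hypothetical identifier; replace with your own tool name and contact.
        "User-Agent": "inat-wikidata-example/0.1",
    }
    return Request(url, headers=headers)


if __name__ == "__main__":
    request = build_sparql_request(QUERY)
    print(request.full_url)
    # To actually run the query (needs network access):
    #   from urllib.request import urlopen
    #   import json
    #   with urlopen(request) as response:
    #       print(json.load(response)["results"]["bindings"])
```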

@andrawaag also noted:

I am currently working on a DOI resolver for my upcoming Wikicite talk next Wednesday. Happy to create an iNaturalist equivalent. If next week is early enough and the developers don’t mind Python, I can add iNat identifier lookup to the Wikidata Integrator like we currently do for PMID and DOI. This is the code that would enable that.

Posted by ahospers 4 months ago

https://colab.research.google.com/drive/19b3rbYpJOXiA8h1OEzJIhJb_IBgn1BA8?usp=sharing
from pprint import pprint

from wikidataintegrator import wdi_core

# Find the Wikidata item whose iNaturalist taxon ID (P3151) is 342395,
# together with its taxon name (P225) when one is recorded.
query = """
SELECT * WHERE {
  VALUES ?iNatId { "342395" }
  ?qid wdt:P3151 ?iNatId .
  OPTIONAL { ?qid wdt:P225 ?wikidataTaxonName . }
}
"""
results = wdi_core.WDFunctionsEngine.execute_sparql_query(query)
for row in results["results"]["bindings"]:
    # The binding holds the full entity URI; strip the prefix to get the bare QID.
    qid = row["qid"]["value"].replace("http://www.wikidata.org/entity/", "")
    iNatItem = wdi_core.WDItemEngine(wd_item_id=qid)
    iNatJson = iNatItem.get_wd_json_representation()
    pprint(iNatJson)
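
The printed JSON is large, so a small helper can pull out just the parts discussed in this post: the number of labels and the Wikipedia article titles. This is a sketch that assumes the JSON follows the standard Wikidata entity layout with "labels" and "sitelinks" keys; the helper name is made up, the exclusion list is partial, and SAMPLE_ENTITY is a hand-written stand-in rather than real Wikidata content.

```python
# Site IDs that end in "wiki" but are not Wikipedia language editions
# (a partial list, for illustration only).
NON_WIKIPEDIA_SITES = {"commonswiki", "specieswiki", "metawiki", "wikidatawiki"}


def summarize_entity(entity_json: dict) -> dict:
    """Count labels and collect Wikipedia article titles keyed by language."""
    labels = entity_json.get("labels", {})
    sitelinks = entity_json.get("sitelinks", {})
    wikipedia_titles = {
        site[: -len("wiki")]: link["title"]
        for site, link in sitelinks.items()
        if site.endswith("wiki") and site not in NON_WIKIPEDIA_SITES
    }
    return {"label_count": len(labels), "wikipedia_titles": wikipedia_titles}


# Hand-written sample in the Wikidata entity layout (NOT real data).
SAMPLE_ENTITY = {
    "labels": {
        "en": {"language": "en", "value": "western gull"},
        "la": {"language": "la", "value": "Larus occidentalis"},
    },
    "sitelinks": {
        "enwiki": {"site": "enwiki", "title": "Western gull"},
        "commonswiki": {"site": "commonswiki", "title": "Larus occidentalis"},
    },
}

if __name__ == "__main__":
    print(summarize_entity(SAMPLE_ENTITY))
```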

Posted by ahospers 4 months ago