Exploring fine scale geographic patterns on iNaturalist

One of my favorite things to do on iNaturalist is to look at fine scale geographic patterns of individual species. Yellow spotted millipede (Harpaphe haydeniana), for example, has specific habitat requirements and, in the San Francisco Bay Area, is restricted to the forests around Mount Tamalpais in the North Bay and the Santa Cruz Mountains in the South Bay, with relict populations on San Bruno Mountain and possibly in the Berkeley Hills.

Unfortunately, the way iNaturalist displays obscured and inaccurate observations can add noise to these maps, which makes it more difficult to see the biogeographic patterns. Remember that for obscured locations iNaturalist displays a stemless marker randomly located within a 0.2 × 0.2 degree grid cell containing the actual location. Similarly, observations may have accuracy circles that are too coarse for the scale you’re exploring.

We don’t have filters in the menu for these yet, but we do have URL parameters that let you construct URLs that exclude obscured and inaccurate observations from your searches. To ignore observations where the user has chosen to obscure location information (geoprivacy) and also observations automatically obscured via conservation statuses (taxon_geoprivacy), add the parameters geoprivacy=open&taxon_geoprivacy=open. To ignore observations with coarse accuracy circles, include the acc_below_or_unknown parameter with a value in meters. For example, adding acc_below_or_unknown=1000 will ignore observations with accuracy circles with radii larger than a kilometer. Compare the default search for Ensatina (Ensatina eschscholtzii) salamander observations around the Bay Area with one that includes these parameters. From the search on the right, it’s clearer that Ensatinas avoid urban areas in the South Bay and San Francisco.
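As a rough sketch of how such a URL could be assembled, the Python snippet below appends the three parameters named above to the observations search page. The helper name fine_scale_url is hypothetical, and the taxon_name parameter used in the example call is an assumption that is not part of this post; substitute whatever taxon and place filters your own search already uses.

```python
from urllib.parse import urlencode

# Observations search page; the three parameters set below are the ones
# described in the post.
BASE_URL = "https://www.inaturalist.org/observations"


def fine_scale_url(max_accuracy_m=1000, **extra):
    """Build a search URL that hides obscured and coarse observations.

    `extra` holds any other search parameters (hypothetical here, e.g. a
    taxon or place filter) and is passed through unchanged.
    """
    params = {
        "geoprivacy": "open",                    # skip user-obscured observations
        "taxon_geoprivacy": "open",              # skip auto-obscured observations
        "acc_below_or_unknown": max_accuracy_m,  # accuracy threshold in meters
    }
    params.update(extra)
    return f"{BASE_URL}?{urlencode(params)}"


# Assumption: taxon_name is used only to illustrate passing an extra filter.
url = fine_scale_url(1000, taxon_name="Ensatina eschscholtzii")
print(url)
```

Pasting the printed URL into a browser should give the filtered map, provided the extra filter is one the search page accepts.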

Note that the acc_below_or_unknown parameter includes observations with no accuracy recorded. If you'd like to search for accuracies that are below some threshold and exclude observations with no accuracy recorded, substitute the acc_below parameter as shown here. You can also search for observations with unknown accuracies only by constructing a URL with acc=false.
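The three accuracy options can be summarized in a small Python sketch. The function name accuracy_params and the mode labels are hypothetical and exist only for illustration; the parameter names and the acc=false value are the ones given in this post.

```python
from urllib.parse import urlencode


def accuracy_params(mode, threshold_m=1000):
    """Return the URL parameter for one of the three accuracy filters.

    The mode labels are made up for this sketch:
      "below_or_unknown": accuracy under the threshold, or none recorded
      "below":            accuracy under the threshold only
      "unknown_only":     no accuracy recorded
    """
    if mode == "below_or_unknown":
        return {"acc_below_or_unknown": threshold_m}
    if mode == "below":
        return {"acc_below": threshold_m}
    if mode == "unknown_only":
        return {"acc": "false"}
    raise ValueError(f"unknown mode: {mode!r}")


# Print the query-string fragment each mode would add to a search URL.
for mode in ("below_or_unknown", "below", "unknown_only"):
    print(mode, "->", urlencode(accuracy_params(mode)))
```

The unknown_only mode is the one to reach for when looking for observations with no accuracy recorded that may need a "Location is accurate" vote.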

As you’re exploring these fine scale geographic patterns, please help mark locations that don’t look reasonable. For example, the presence of forest- and mountain-loving Pacific Trillium in the middle of Santa Rosa (flat and suburban) looks suspicious.

Mark these observations by voting no to "Location is accurate" under the "Data quality assessment".

Please don’t vote no to "Location is accurate" if somewhere within the accuracy circle could be suitable. For example, this Common Cowparsnip has a marker centered on the ocean, but its accuracy circle encompasses suitable habitat on land.

But if all of the area within the accuracy circle is unreasonable in your judgement, as in the case of the Pacific Trillium described above, it’s fine to mark the location as inaccurate.

In addition to voting no to "Location is accurate", it’s good practice to leave a comment in case you’re wrong (which may mean it’s a very interesting observation) or the organism is captive (in which case you should remove your "Location is accurate" vote and replace it with a no vote to "Organism is wild").

If an observation has an unknown accuracy and the marker is centered on a very unreasonable location (e.g. this flowering plant in the middle of the ocean), it’s fair game to vote no to "Location is accurate". In other words, observations with no accuracy recorded should be assumed to be accurate to within a meter or so and, if the marker location is unreasonable at that precision, marked as "Location is accurate" = "no".

Thanks for your help curating these data. They're only as useful as they are well curated. Assuming we as a community can keep on top of bad IDs and suspect locations, these maps are becoming more and more interesting each day as they fill in with new observations.

Posted on December 15, 2020 11:44 PM by loarie

Comments

I think this is a great filter. As someone who semi-regularly comments on people's location accuracy, I've sometimes wondered if the "Location is accurate" checkmark should reset when the observer edits and saves the observation's location. So, resets to no votes.

Posted by muir over 3 years ago

@muir seems like a good idea.

Posted by kitty12 over 3 years ago

Thx @loarie. I also enjoy “mapping” individual species on iNat maps, especially as iNat observations grow. I am fascinated with where species presumably start and stop (i.e., boundaries), especially where urban / natural area locations meet.

Posted by metsa over 3 years ago

"Please don’t vote no to "Location is accurate" if somewhere within the accuracy circle could be suitable." I'm glad you mentioned this. Last week I was going through observations which are casual due to some DQA vote other than captive, and I saw a lot of people down-voting the "location is accurate" because they thought the circle was over-large. Big circles are allowed as long as the organism is somewhere inside them, right?

Posted by arboretum_amy over 3 years ago

Very useful info, thanks a lot

Posted by fero over 3 years ago

Snap: https://www.inaturalist.org/journal/tonyrebelo/44352-using-inaturalist-data-for-research
Thanks. Most useful information.

Posted by tonyrebelo over 3 years ago

Is there somewhere where these very helpful hints on searches are collected for easy reference? Did I just miss such a collection in the Help documents? Or should I start compiling my own collection?

Posted by lynnharper over 3 years ago

try: https://forum.inaturalist.org/t/how-to-use-inaturalists-search-urls-wiki/63

Perfect - thank you, Tony!

This is great guidance on how to productively search for outlier observations.

I'd love to see a package of interface features specifically to help identifiers to "curate" observations. Part of it could be exposing these filters for obscured and low-accuracy observations. Maybe also automated filters to find unlikely scenarios, such as observations of land species located in marine areas and vice-versa. My wish list would include some logic that uses a digital elevation model to estimate the elevation (with accuracy) for a specific observation. That would help find montane species hiding out in the flat lands.

Beyond that, I'd love to have easy ways to alert iNat users to issues with their observations that would be easy for them to fix. There's a whole host of stuff (implausible locations; huge accuracy circles; missing locations and dates; missing photos) that most users would be happy to fix. Lots of identifiers have their own boilerplate text to paste in for these scenarios, but it would be a lot more efficient if these issues can be prevented during the upload process in the first place ("Your observation shows a location accuracy of 57km. Would you like to narrow the location more precisely?") For existing observations, it would be great if iNat could provide guided processes for users to fix common issues. Explanations and videos are helpful, but a button that says "Location problem detected. Click here to fix." that triggers an actual UI would be much more effective.

Posted by rupertclayton over 3 years ago

This is great. Thank you! Does this work for Identify too? Does it work for Collection Projects? That could be a really neat use of collection projects. (Not saying obscured data, etc. doesn't have value also, but sometimes it's not what I need, either.)

Posted by charlie over 3 years ago

In the context of Scott's conclusion of iNat data only being "as useful as they are well curated. Assuming we as a community can keep on top of bad IDs and suspect locations, these maps are becoming more and more interesting each day as they fill in with new observations," I would add to the wishlist for a package of interface features by @rupertclayton. For some locations, people regularly (and erroneously) place the observation marker based on the text search feature in the map when creating the observation. In Alaska, one example where those errors are obvious is Denali National Park where if one searches "Denali", the marker result is Denali peak (with a 982.6 m accuracy circle). The peak of Denali is 20,308 ft / 6,190 m high and barren -- relative to the very diverse flora and fauna of iNat observations that people are erroneously locating there. There are probably many more places where this type of error is less obvious.

My wishlist to curate iNat location data better would be something that results in fewer errors, larger accuracy circles, message prompts... something that was triggered by people searching for a place name and using those coordinates without further revision (such as, manually moving the marker or revising the accuracy circle). As a baseline estimate of this issue, there are currently 806 iNat observations posted on and around the peak of Denali, and I haven't found one yet that appears to be a valid location. That's about 10% of this borough/county's observations. Scott's new parameters (&taxon_geoprivacy=open&geoprivacy=open&acc_below=1000) help, but there are still 502 iNat observations on the peak of Denali that very likely shouldn't be there.

@muir provides another good example of a type of observation where a user could possibly be alerted at the time of creating a problematic observation and fix the error before it happens. And even if this was infeasible, it's clear that he has identified criteria that can identify some problematic existing observations with a high level of confidence.

I realize that the Feature Requests section of the forum already exists to submit and discuss enhancements. But rather than make piecemeal changes, I wonder if there could be some way to capture a whole range of proposed filter criteria that would indicate problematic observations and corresponding suggested actions to assist observers? Right now, about 35% of iNat's 63 million observations are not obviously wild, verifiable and accurately located. Anything that might help nudge that figure down a few points has the potential to add lots of valuable data. Anything that can do this through a reliable and efficient process can improve the satisfaction of new users and their ability to get IDs, and improve the way iNat identifiers spend their time.

I feel that this is fine, but that it only serves a limited amount of identifiers as well. I agree that some more tools to help identifiers do their jobs more efficiently would be worth pursuing. Or to help observers observe better so they will provide more necessary data for good IDs. I feel that sometimes the bigger picture problems get missed when we worry about what could make things better for scientists. Correct me if I am wrong, but isn't our top priority to demystify the natural world? It's late, I'm probably tired and should go to bed, after some more IDs...

Posted by fungee over 3 years ago

The best way to demystify the natural world is with accurate, unambiguous, precise data that scientists can use. So long as achieving this does not incur costs, nuisance value or detract from the iNaturalist experience, we should push it as hard as possible. The reality is that 90% of the time people are simply unaware of these issues. With almost no extra effort (e.g. allow the smartphone app a few seconds to get an accurate location before taking the first observation - but not at the expense of an exciting event; "check the circle" when adding data on the web) we can achieve acceptable levels of top-class scientific data.

Hi @fungee. I'm not sure whether your concern about serving a narrow purpose relates to @loarie's original posting or the suggestions in the comments below. iNat staff are always pretty clear that the #1 priority is to encourage people to go out and observe the natural world. But some portion of observers and identifiers have an interest in the scientific value of the data that gets created. I think these aims can be pretty compatible, so long as the research purpose doesn't eclipse the public value.

If your thoughts were prompted by any of my suggestions, I should probably make clear that I am not a scientist, and my main aims were to find ways to help new users successfully generate good observations, and to allow identifiers to focus on "higher-value" work instead of adding repetitive comments (e.g. to explain that an observation has an implausible location).

@rupertclayton it was sort of both. @tonyrebelo "The best way to demystify the natural world is with accurate, unambiguous precise data that scientists can use." I don't disagree with that, let me break down what I was trying to get across. I just worry sometimes that the focus on doing what is best for California could be to the exclusion of the underserved places. Like if California looks good then all will follow suit eventually. I don't think it works like that. I like what, Rupert, you were suggesting with "I wonder if there could be some way to capture a whole range of proposed filter criteria that would indicate problematic observations and corresponding suggested actions to assist observers?" I realize that it is a somewhat imprecise request, which probably is why it lines up so well with what I've been thinking about the issues I've been having with iNat. I'm thinking about what big changes could happen, but also what little things I can do with the tools that are currently available (why I am here commenting). I largely think that what the iNat team is up to is furthering the broader goals, and doing what you can do where you are with the tools that you have is sometimes the best way, but I appreciate when I read that others are also thinking about what it could be for where they are.

Just for the record, I am from one of the "Underserved places" - where we have lots of biodiversity and few people to record it and even fewer people to identify it, and much of it undescribed.

All the more reason to try and squeeze as much science out of every observation as possible. And at the same time try and get anyone and everyone interested - when they have access to computers and smartphones and can afford the data costs.
But, once the data are on it is too late. Any data quality controls must be implemented at capture, and if the observer does not need to be aware of it, all the better.

Great post and discussion - thanks for this!

Posted by twillrichardson over 3 years ago

I just realized this post hadn't been updated to reflect the acc_below_or_unknown parameter that we added 2 years ago. I updated the post accordingly. Since we assume that observations with no accuracy are very accurate and should be flagged if not, this parameter is the best way to exclude inaccurate observations and/or surface observations with no accuracy that need to be flagged.

Posted by loarie 3 months ago
