Flagger | Content Author | Content | Reason | Flag Created | Resolved by | Resolution |
---|---|---|---|---|---|---|
bouteloua | | Fisher (Pekania pennanti) | computer vision clean-up? | Oct. 30, 2020 19:12:15 +0000 | bouteloua | see comments |
@maxallen @jwidness @oliversw @jonpoppele @birds_bugs_botany fyi
I took a quick look at some recent fisher observations, including a large number of snow tracks. Quite a few were misidentified, but that is not unusual for tracks in snow. Some of these were clearly identified by the user, not by computer vision suggestions. For others I could not tell which it was. If there are any particular snow track photos you would like me to look at, let me know. I'm happy to do so.
I just received a note from someone who had a set of squirrel tracks in snow identified as fisher by the CV. I've noticed that fisher usually comes up as the first suggestion for any tracks in snow that I upload.
So this issue is persisting, but it seems that most people are ignoring or at least second-guessing the fisher ID for tracks in snow.
I have been looking into this again recently.
One of the big problems here is that artificial intelligence (AI), machine learning, computer vision, or whatever you want to call it needs a training dataset of images. The computer uses this set of "accurate" data to predict the species for each new observation. However, if the training data is no good, the predicted species for new observations will be no good either.
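To make the training-data point concrete, here is a toy sketch in Python. It is not how iNaturalist's computer vision works: the "model" below only counts which label is most common for a given substrate, and the substrate values, species counts, and function names are all made up for illustration. It only shows how mislabeled training examples can tip a suggestion.

```python
from collections import Counter, defaultdict


def train(examples):
    """Count labels per substrate; examples are (substrate, label) pairs."""
    counts = defaultdict(Counter)
    for substrate, label in examples:
        counts[substrate][label] += 1
    return counts


def suggest(model, substrate):
    """Return the most common training label for this substrate, or None."""
    if substrate not in model:
        return None
    return model[substrate].most_common(1)[0][0]


# Hypothetical counts: the same ten snow-track photos, labeled correctly...
true_labels = (
    [("snow", "fisher")] * 3
    + [("snow", "squirrel")] * 4
    + [("snow", "raccoon")] * 3
)
# ...and labeled with most of the squirrel and raccoon tracks called fisher.
noisy_labels = (
    [("snow", "fisher")] * 8
    + [("snow", "squirrel")] * 1
    + [("snow", "raccoon")] * 1
)

clean_model = train(true_labels)
noisy_model = train(noisy_labels)

print(suggest(clean_model, "snow"))
print(suggest(noisy_model, "snow"))
```

With the correct labels the top suggestion for a snow track is whichever species really is most common in the set. With the mislabeled set it becomes fisher, even though the photos themselves have not changed.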
The problem with fishers is that there are a lot of misidentified tracks, mostly in snow. It seems one user in particular (@oliversw) is marking every track in snow as fisher, even when it is obviously another species, such as raccoon. This seems to be causing the computer vision to assume every track in snow is from a fisher. This is easily corrected over time if people who can accurately identify fisher tracks go through and correct the misidentifications, but it will be an ongoing problem as long as people continue to misidentify fisher tracks.
This inaccuracy is one of the common problems of using community science data. There's not much to do about it, besides trying to educate users (although some will not be interested). It is causing a few problems for me now, as we are trying to use iNaturalist data to create a habitat suitability model for fishers across North America. My collaborators view these inaccuracies as proof that animal tracks cannot be identified accurately enough to be used in scientific studies, which is a disservice to people who take the time to hone their tracking skills. The only solution seems to be to double-check all fisher observations, especially those that are already research-grade and being used by the computer vision to predict species identifications.
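One way to organize that double-checking is to put the research-grade records at the front of the review queue. The sketch below is only an illustration: the records are invented, and the `id` and `quality_grade` fields and their values are assumptions about how an exported list of observations might look, not a description of any particular export.

```python
def review_queue(observations):
    """Order observations so research-grade records are reviewed first.

    sorted() is stable, so records keep their original order within
    the research-grade group and within the remaining group.
    """
    return sorted(observations, key=lambda o: o["quality_grade"] != "research")


# Hypothetical records for illustration only.
observations = [
    {"id": 1, "quality_grade": "needs_id"},
    {"id": 2, "quality_grade": "research"},
    {"id": 3, "quality_grade": "casual"},
    {"id": 4, "quality_grade": "research"},
]

queue = review_queue(observations)
print([o["id"] for o in queue])
```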
@raymie notes that there are currently over 100 observations misidentified as fishers due to computer vision, often of tracks in snow.