Projects and research grade observations

Projects seem to play an important role in motivating people to contribute and in increasing the quality of observations. To better understand the impact of projects, I looked for correlations between the number of observations, the percentage of observations that are "research grade", the number of observers, and the number of curators/managers/admins. To be able to compare projects, I chose projects that are concerned exclusively with flora and that have at least 1000 observations. As of 30 May 2016, there are 11 projects which satisfy these criteria:

Name                                                        Total obs.   Research grade   Species   People   Curators   Avg. obs/person
Angeles National Forest Flora                                     3092   46% (1415)           417       16          1            193.25
FLORA DE NUEVO LEÓN                                               5735   63% (3626)          1095       33          1            173.79
FLORA DE MÉXICO                                                  19797   49% (9775)          3376      256          6             77.33
Flora von Deutschland                                             3793   57% (2180)           776       63          5             60.21
Pocatello Spring Flora                                            1597   65% (1043)           220       24          1             66.54
Flora de la Sierra Madre Oriental                                 3606   64% (2310)          1151       49          1             73.59
Flora of the Laguna de Santa Rosa Watershed                       1616   61% (987)            526       12          1            134.67
Flora of Anza-Borrego State Park and adjacent desert              2976   56% (1662)           401       74          2             40.22
Virginia Native Plants                                            1292   38% (487)            511       71          5             18.20
Joshua Tree National Park Wildflower Watch                        1778   69% (1235)           307       85          4             20.92
Pteridophytes of the Northeastern United States and Canada        2502   82% (2058)            77      133          1             18.81

The correlation matrix is:

             Total obs   Research grade     Species      Persons     Curators   Obs/person
Total obs    1.0000000   -0.24824711        0.9658069    0.82048973  0.5043143  0.1101514
Research g.               1.00000000       -0.3290142   -0.02714843 -0.4824401 -0.2192559
Species                                     1.0000000    0.73175997  0.5219708  0.1339048
Persons                                                  1.00000000  0.6265351 -0.4024479
Curators                                                             1.0000000 -0.4452245
Obs/person                                                                      1.0000000
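
For anyone who wants to reproduce the matrix, here is a minimal Python sketch. It assumes the coefficients are Pearson correlations (the post does not say which coefficient was used) and that the "Research grade" column is the percentage, recomputed here from the counts in the table. The names PROJECTS, pearson, build_columns and correlation_matrix are made up for this illustration.

```python
from math import sqrt

# One tuple per project, copied from the table above:
# (total obs, research grade obs, species, people, curators)
PROJECTS = [
    (3092, 1415, 417, 16, 1),     # Angeles National Forest Flora
    (5735, 3626, 1095, 33, 1),    # FLORA DE NUEVO LEÓN
    (19797, 9775, 3376, 256, 6),  # FLORA DE MÉXICO
    (3793, 2180, 776, 63, 5),     # Flora von Deutschland
    (1597, 1043, 220, 24, 1),     # Pocatello Spring Flora
    (3606, 2310, 1151, 49, 1),    # Flora de la Sierra Madre Oriental
    (1616, 987, 526, 12, 1),      # Flora of the Laguna de Santa Rosa Watershed
    (2976, 1662, 401, 74, 2),     # Flora of Anza-Borrego State Park ...
    (1292, 487, 511, 71, 5),      # Virginia Native Plants
    (1778, 1235, 307, 85, 4),     # Joshua Tree National Park Wildflower Watch
    (2502, 2058, 77, 133, 1),     # Pteridophytes of the Northeastern US and Canada
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def build_columns(projects):
    """Turn the project tuples into named columns, deriving the two ratios."""
    return {
        "Total obs": [p[0] for p in projects],
        "Research grade %": [100 * p[1] / p[0] for p in projects],
        "Species": [p[2] for p in projects],
        "Persons": [p[3] for p in projects],
        "Curators": [p[4] for p in projects],
        "Obs/person": [p[0] / p[3] for p in projects],
    }


def correlation_matrix(columns):
    """Return {(row name, column name): r} for every pair of columns."""
    return {(a, b): pearson(columns[a], columns[b])
            for a in columns for b in columns}


if __name__ == "__main__":
    cols = build_columns(PROJECTS)
    matrix = correlation_matrix(cols)
    for a in cols:
        print(a.ljust(18) + " ".join(f"{matrix[(a, b)]:8.4f}" for b in cols))
```

Because the percentages are recomputed from the counts rather than taken from the rounded values in the table, the research grade entries may differ slightly in the last digits from the matrix above.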

Aside from some obvious correlations (the more observations, the more species, etc.), I do not find any strong correlation between these parameters and the percentage of observations that reach research grade; the largest in absolute value is with the number of curators (about -0.48), which is not convincing for a sample of only 11 projects. This is puzzling: the parameters shown on the project pages do not seem (judging by this small sample) to be relevant to a group's capacity to provide research grade observations.
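
One rough way to judge how much weight the research grade coefficients can bear with only 11 projects is the standard t-test for a Pearson correlation, t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below is an illustration and not part of the original analysis. It assumes the coefficients are Pearson correlations, that the data are roughly bivariate normal, and it makes no correction for testing several pairs at once. The names t_statistic and is_significant are hypothetical.

```python
from math import sqrt

N_PROJECTS = 11
# Two-sided 5% critical value of Student's t with n - 2 = 9 degrees of freedom.
# This constant is only valid for n = 11.
T_CRITICAL = 2.262

# "Research grade" row of the correlation matrix above.
RESEARCH_GRADE_CORRELATIONS = {
    "Total obs": -0.24824711,
    "Species": -0.3290142,
    "Persons": -0.02714843,
    "Curators": -0.4824401,
    "Obs/person": -0.2192559,
}


def t_statistic(r, n):
    """t value for testing whether a Pearson correlation r differs from zero."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


def is_significant(r, n=N_PROJECTS, t_critical=T_CRITICAL):
    """True if |t| exceeds the critical value (default is for n = 11, 5% level)."""
    return abs(t_statistic(r, n)) > t_critical


if __name__ == "__main__":
    for name, r in RESEARCH_GRADE_CORRELATIONS.items():
        t = t_statistic(r, N_PROJECTS)
        print(f"{name:12s} r = {r:+.3f}  t = {t:+.2f}  significant: {is_significant(r)}")
```

With these inputs none of the five coefficients passes the threshold, which for n = 11 corresponds to roughly |r| > 0.60, so the sample is consistent with no linear relationship at all.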

So what are the relevant parameters? If you have a hunch, please tell me.

Posted on May 31, 2016 09:12 AM by alvarosaurus

Comments

This is a really interesting investigation! I think it might be worthwhile to get some other project experts' opinions on the relationship between research grade observations and project participation. @charlie @loarie @kueda @cullen @finatic @carrieseltzer @aztekium @mchlfx @silversea_starsong @robberfly @greglasley

As a curator of just a few projects, I recognized the value of the project and the amount of curation. I can't even count the number of projects that are created without proper curation (defining "curation" as observations being added to a project, IDs added or guidance to the IDs through comments, and curator or expert interaction with participants). One project that I'm fairly proud of is from just a small park in TX: http://www.inaturalist.org/projects/elmer-w-oliver-nature-park. As I was actively curating this project, I'd try my best to add IDs to observations or at least find folks that could give extra guidance to the IDs. I think it made several of the observations 'research grade,' but just as important, there was quite a bit of participation in the project.

I'm a participant in the "Plants of Texas" project (http://www.inaturalist.org/projects/plants-of-texas), and that's got a MONSTER amount of observations (71000), but there are also lots of species of Texas plants... So, without experts in all of TX flora (roughly 5000 species!!!), all of the observations don't get the proper curation simply because of the amount of species and lack of experts.

Other projects, like Herps of Texas (http://www.inaturalist.org/projects/herps-of-texas) get a lot of curation, and it is tremendously successful because of that. Granted, there are far fewer species of herps in TX (a few hundred) than the plants, but there is a lot more interaction between the experts/curators and participants.

So, maybe a conclusion can be drawn: successful projects are smaller in scope. I really love projects on iNat, but I think it's important to emphasize a correlation between "success" and "curation." :)

Would love to hear what other project curators think of this.

Posted by sambiology almost 8 years ago

Very interesting. Most of the projects that I have created fall into two groups:

Wide open to taxa
I am just interested in the overall taxa for a region, so I don't yet expect a high percentage of Research Grade observations, simply because of a lack of expertise for many of the taxa. These are favorite regions of mine that I visit as often as possible, and I want to highlight all the taxa in these regions for others. Plus it helps me easily find what I haven't yet found in those regions.

Difficult taxa
These are taxa in which I am interested but which are difficult to get an ID for. Robberflies fall into this group, as do many gastropods. Even the top experts have a difficult time coming to a consensus on some of the species in these groups. The purpose of these projects is to learn to identify them better by getting people with this interest together in the project and sharing their knowledge.

From what I have seen of herp projects, they tend to get a higher percentage of Research Grade observations. The RASCals project is high among them, but that is probably due to the curator of the project looking at almost each and every one of the observations, and he has the knowledge to identify all of them.

I think the purpose of a project needs to be considered before even starting to decide if a project has reached the point of being a success or not.

Posted by finatic almost 8 years ago

We stopped requiring that users explicitly add their observations to projects over a year ago through the introduction of the project aggregator, and we've been gradually opening up the aggregator to more and more people ever since, so the number of project observations is not a good metric for how well projects motivate people to add more observations or identifications. It just tells you how many observations meet that project's criteria.

We do track who adds observations to projects so I could provide you with some data that might help you understand if explicitly adding observations to certain projects leads to better data quality outcomes, if you can tell me what you need. I suspect it doesn't, though, since precious few projects have active curators who are providing additional value (there are exceptions like Herps of Texas and Vermont Atlas of Life, but compare them to more typical projects like http://www.inaturalist.org/projects/clarkia-biogeography, where the project admin has added a grand total of two identifications). I suspect the probability of achieving Research Grade has much more to do with the taxon you observed (birds are easier than plants), the kind of photo(s) you took (sharp and identifiable vs. blurry and hopeless), the kind of social network you have (are people following you, how much expertise do they have), and where the observation was recorded (California and Texas probably have higher RG percentages than Montana...).

Posted by kueda almost 8 years ago

Just my 2 cents. I understand the value of projects and why some like them so much. Still, short of going through the seemingly many hundreds of projects (how many are there, anyway?) I often don't know which observations belong in which projects. I find it quite tedious to go through a day's observations and add each one to the various projects (yes, I know how batch edit works, but it is still tedious to go down the list and check all the right boxes). I, for one, would welcome the opportunity to just have my observations automatically added to any and all projects where needed/wanted. I really hate going through and trying to figure out what project to add each one to. With birds, dragonflies, butterflies, etc., I have a pretty good handle on the projects where my observation would fit, but when I take a shot of something out of the ordinary for me, I find myself struggling to figure out where to enter it. I have no objection to any of my stuff being added to any project... I'm just growing increasingly reluctant to spend the time necessary to add everything to all the places where it would fit. Just a grouchy old man here.

Posted by greglasley almost 8 years ago

Oh, very neat. My opinion: It's the community that brings great observations, IDs, etc. One can cultivate a local iNat community in many ways. One of them is projects, so it makes sense there'd be a correlation between projects and good quantity and quality of observations. That being said, I agree with Greg that the projects can get tedious. To me they aren't my primary motivating factor. And if a project has required fields you can just about forget it as far as I'm concerned.

Posted by charlie almost 8 years ago

Yeah, I'm with you, Charlie. I just don't enter data in fields at all. I spend hours a day on iNat, but contrary to the opinions of some, I DO have a life beyond iNat and I'm just unwilling to go down a list of required fields to enter to a project. Things like "observer", "date" for required fields and such....give me a break and read the data on the record for gosh sakes! I just ignore the fields and if I can't enter the obs into the project, so be it. Still being grouchy...I know.

Posted by greglasley almost 8 years ago

There are some fields I do fill out because I am interested - many of the GloBI interaction fields because I think that project is neat, natural community (in Vermont) because I have worked with mapping those, and flower phenology fields. But yeah, there are some absurd and redundant ones; people just don't bother to learn how the site works.

Posted by charlie almost 8 years ago

Technically, it SHOULD be up to the curator(s) of a project to add any of your observations to the project. Again, projects are a lot of work/time investment -- I strongly believe they are worth it, and I love doing them, but it takes proper curation for them to be effective.

@alvarosaurus -- I'd sure like to see more of your data on projects! It's a great investigation! :)

Posted by sambiology almost 8 years ago

I can't begin to thank you all for all this great information! I plan to summarize your comments very carefully and look for concepts that could explain more thoroughly what's going on. I think this investigation is moving very fast in the right direction. Thanks again.

Posted by alvarosaurus almost 8 years ago

Interesting question! I agree with the comments above and will reiterate that I think the activeness of the project creator/curators is a HUGE factor in project observations achieving research grade status. AfriBats managed by @jakob is another example of a project where one person can make a huge difference (I'm a curator on that too but not nearly as active or knowledgeable). Please do share your further thoughts!

I know there are some other people thinking similar things... tagging @yrhe and @anikarenina.

Posted by carrieseltzer almost 8 years ago

Sorry to miss this thread for so long...

Our research to date suggests that observation-level details also have a lot to do with what materials get attention (and therefore attract the interactions that let them become RG). That's just in a general-public sense, though--if you've deliberately cultivated a strong audience/network, particularly one with adequately diverse expertise to support ID on whatever taxa your project involves, then I think the results could look quite different. The effect of project leadership on project outcomes is something we'd like to dig into eventually, but we need a better baseline first.

We've found that the taxon strongly influences likelihood of research grade status (birds are 70x more likely than other taxa to be ID'd to species), which I think is due to the availability of IDer expertise and possibly also the visibility of obs to the "right" people (statistically speaking). We've found that it matters whether the observation was uploaded by PC or mobile app: presumably this typically reflects the quality of the photo and by extension, anyone's ability to help refine the ID: a terrible cell phone shot of anything can be hard to ID to species but those using "good" cameras usually have to upload via PC. The mobile/PC effect may potentially also reflect some influence of the initial ID applied to the observation (if it's just "something" it may get more/less attention than the same image labeled "plant" or "bird") which could impact how much attention it gets. Surprisingly to me, things like location accuracy made little real difference in the way you'd expect--because location accuracy usually reflects the mode of data submission (mobile/PC) and not the diligence of the observer.

Our paper with these results (and more details): http://dl.acm.org/citation.cfm?id=2820063

A natural experiment with marine species suggests the quality and "beauty" of the images strongly impact how much attention they get. We even tried multiple upload strategies, based on the assumption that visibility may be a confounding factor with the way that the site dashboards/notifications work, but it made no substantive difference. However, it seems likely that training people in good photography techniques would pay off.

Things that we think make a difference but haven't yet been able to investigate: the individual observer's activity levels and network (reciprocity is common in social networks), project-level activity and structure (meta-contributions versus no meta-contributions, as Ken-ichi and Carrie discussed), and how people both interact with and interpret the iNaturalist interfaces. I've seen cases where it seems that misinterpretation of either terminology or interface functionality leads to a "wrong" outcome with respect to RG but it's hard to know how prevalent those issues are and therefore how much attention they may deserve.

I bet that if we could run the analyses tomorrow, however, the results would come out something like this: projects on iNaturalist have the most success (as measured in RG observations) when 1) their recruits are or become active iNaturalist members and not just project members, and 2) they recruit, mobilize, and retain the requisite balance of expertise and skills needed for the project goals. As others have noted, making a project work well requires work.

Posted by anikarenina over 7 years ago

Your paper is just amazing (http://dl.acm.org/citation.cfm?id=2820063). Thanks so much for making it available. I really like the ethnographic approach (the participant observation) and your math is so much better than mine. I updated my diagram and referenced the article. Also your comments here are totally fascinating, as you touch upon so many areas: aesthetics, interface design, social interaction. Wow.

Posted by alvarosaurus over 7 years ago

Thanks, @alvarosaurus! The stats were done by @yrhe who is very good at quant analysis. We actually had to defend the value of the participant observation to the paper reviewers who didn't see why it was relevant--but we would not have known what to examine more closely without spending time in the field with people using the tools.

Posted by anikarenina over 7 years ago
