Clarifying Ancestor Disagreements

What is the Community Taxon?

Every observation with at least one identification has what we call an Observation Taxon. This is the label shown at the top of the observation page and is the taxon that the observation is "filed under" on the tree of life.

The Community Taxon (also sometimes called the Community Identification) is a way to derive a single identification from multiple identifications provided by the community. If an observation has more than one identification, it will also have a Community Taxon. The Observation Taxon will match the Community Taxon unless (a) the observer has opted out of the Community Taxon, or (b) there is an identification of a finer taxon that hasn’t been disagreed with (more on disagreements shortly).

Identifications hang on nodes on the tree of life. An identification adds an agreement with that node and also with all of that node's ancestors back to the root of the taxonomy.


If two identifications are on different branches of the tree of life, they each count as an agreement for the branch they are on and a disagreement for every node on the other branch back to the common ancestor of the two branches.

Each node is scored with the cumulative number of Agreements (i.e. the identifications on it or its descendants), the total number of Disagreements (from identifications on other branches), and something called "Ancestor Disagreements" which we’ll describe shortly.
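As a rough sketch of the tallying described above (not iNaturalist's actual code), the snippet below counts agreements and disagreements per node for two identifications on different branches. The trimmed taxonomy in `PARENT` and the helper names `lineage` and `tally` are assumptions made for illustration, and Ancestor Disagreements are left out here.

```python
from typing import Dict, List, Optional

# Hypothetical, heavily trimmed taxonomy: child -> parent (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "Coccinellidae": None,
    "Coccinella": "Coccinellidae",
    "Coccinella septempunctata": "Coccinella",
    "Harmonia": "Coccinellidae",
    "Harmonia axyridis": "Harmonia",
}


def lineage(taxon: str) -> List[str]:
    """Return the taxon followed by its ancestors back to the root."""
    path: List[str] = []
    node: Optional[str] = taxon
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def tally(identifications: List[str]) -> Dict[str, Dict[str, int]]:
    """Count agreements and disagreements for every node in PARENT."""
    scores = {t: {"agreements": 0, "disagreements": 0} for t in PARENT}
    for ident in identifications:
        ident_line = lineage(ident)
        for node in PARENT:
            if node in ident_line:
                # The identification is on this node or on one of its descendants.
                scores[node]["agreements"] += 1
            elif ident not in lineage(node):
                # Other branch: the identification is neither below nor above the node.
                scores[node]["disagreements"] += 1
    return scores


scores = tally(["Coccinella septempunctata", "Harmonia axyridis"])
for taxon, score in scores.items():
    print(taxon, score)
```

In this sketch the shared ancestor (Coccinellidae) ends up with two agreements and no disagreements, while every node below it on either branch gets one agreement and one disagreement.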

The Community Taxon is the finest ranked taxon with at least two agreements where the ratio of the number of agreements to the sum of agreements, disagreements, and ancestor disagreements is greater than ⅔.
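The threshold in that rule can be written as a small check. This is only a sketch of the arithmetic stated above; the function name `meets_threshold` is made up for illustration.

```python
def meets_threshold(agreements: int, disagreements: int, ancestor_disagreements: int) -> bool:
    """True when a node has at least two agreements and a ratio above 2/3."""
    total = agreements + disagreements + ancestor_disagreements
    # Compare 3 * agreements with 2 * total to test "ratio > 2/3" without floats.
    return agreements >= 2 and 3 * agreements > 2 * total


print(meets_threshold(2, 1, 0))  # ratio 2/3, which is not greater than 2/3
print(meets_threshold(3, 1, 0))  # ratio 3/4
print(meets_threshold(3, 1, 1))  # ratio 3/5
```

Note that a ratio of exactly ⅔ (two agreements against one disagreement) does not qualify, because the ratio must be strictly greater than ⅔.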

In contrast, the Observation Taxon will always match the Community Taxon unless:
a) there is just a single identification, in which case the Observation Taxon is defined by that identification
b) the observer opts out of the Community Taxon, in which case the Observation Taxon is defined by the observer's identification
c) there are no disagreements and there are identifications of descendants of the Community Taxon, in which case the Observation Taxon is defined by the finest such identification (because the community likes a single non-controversial identification being able to ‘move the ball forward’)*

*If that finest identification is of infra-species rank (e.g. subspecies), the Observation Taxon won't roll forward to that rank from the Community Taxon if that identification was added later (because the community doesn't like what would be Research Grade observations at species rank being rolled forward to Needs ID observations at infra-species rank). However, if the Observation Taxon was initially set at infra-species rank from a single identification, a non-disagreeing identification of an ancestor won't roll the Observation Taxon back to the Community Taxon.

What are Ancestor Disagreements?

So what are Ancestor Disagreements? If one person adds an identification of one node and another person thinks it’s not that but can’t provide an alternative on another branch, they might add an identification of an ancestor of that node. For example, I might add an identification of Seven-spotted Lady Beetle, but you might add an identification of the family lady beetles, which contains that and many other species.

When the Community Taxon was first implemented, any coarser identification added after finer identifications was treated as an implied disagreement with those finer taxa. These ‘implicit ancestor disagreements’ are now labeled as such.

They only disagree with taxa associated with previous finer identifications. Also some bugs were fixed in how the Community ID charts on the observation page handle "implicit ancestor disagreements".

What are Explicit Disagreements?

Because of confusion about whether people were disagreeing or not, we later made ancestor disagreements "explicit". When an identification is made that is an ancestor of the Community Taxon (or the Observation Taxon if there’s only one identification), the identifier is now presented with a choice to indicate whether they are disagreeing with the Community Taxon or not.

If they are not disagreeing, their identification does not count as an ancestor disagreement for the taxon that was the Community Taxon.

And the identification is not labeled as a disagreement:

However, if they are disagreeing, their identification counts as an "explicit ancestor disagreement" with the Community Taxon.

And the identification is labeled accordingly:

Two ways to disagree...

When we implemented this, we thought that an ancestor disagreement should disagree with the entire branch below the disagreeing identification, i.e. “I disagree that this is Seven-spotted Lady Beetle and all taxa on the branch between Seven-spotted Lady Beetle and the taxon I have proposed”. Let’s call this the “Branch Disagreement” way to disagree.

We’ve since come to realize that our communication about this was inconsistent and confusing, based on numerous discussions with community members in person and in the Forum. Furthermore, these discussions suggest the community interprets disagreeing as just with the Community Taxon i.e. “I disagree that this is Seven-spotted Lady Beetle but not the whole branch below the taxon I have proposed”. Let’s call this the “Leading Disagreement” way to disagree. We’ve also since realized from the Forum that Leading Disagreement is a more common and less controversial way to disagree than Branch Disagreement.

At the end of this post, we’ll discuss planned changes to improve things moving forward. But for now, let’s try to clarify how things currently behave so that we can all get on the same page.

Imagine the following sequence of identifications:

Branch Disagreement tallies disagreements as follows:

Which differs from how one would tally disagreements for the Leading Disagreement case:

Notice that this can impact how the Community Taxon is calculated. In this example, Branch Disagreement computes the Community Taxon as Lady Beetles Family:

While Leading Disagreement would compute it as Asian Lady Beetle:
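As a rough illustration of how the two tallies can diverge, the sketch below uses an assumed sequence of identifications: one Seven-spotted Lady Beetle, one disagreeing Lady Beetles family, then three Asian Lady Beetle. It is not iNaturalist's actual code. The trimmed taxonomy, the `Identification` fields, and the exact scope given to each mode are assumptions made for illustration: "branch" is counted against every taxon below the disagreeing identification, "leading" only against the taxon that was disagreed with, and the order of identifications in time is ignored.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical, heavily trimmed taxonomy: child -> parent (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "Coccinellidae": None,
    "Coccinella": "Coccinellidae",
    "Coccinella septempunctata": "Coccinella",  # Seven-spotted Lady Beetle
    "Harmonia": "Coccinellidae",
    "Harmonia axyridis": "Harmonia",  # Asian Lady Beetle
}


@dataclass
class Identification:
    taxon: str
    disagrees_with: Optional[str] = None  # taxon explicitly disagreed with, if any


def lineage(taxon: str) -> List[str]:
    """Return the taxon followed by its ancestors back to the root."""
    path: List[str] = []
    node: Optional[str] = taxon
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def ancestor_disagreements(node: str, idents: List[Identification], mode: str) -> int:
    count = 0
    for ident in idents:
        if ident.disagrees_with is None:
            continue
        if mode == "branch":
            # Assumed scope: every taxon strictly below the disagreeing identification.
            hit = node != ident.taxon and ident.taxon in lineage(node)
        else:
            # Assumed scope for "leading": only the taxon that was disagreed with.
            hit = node == ident.disagrees_with
        if hit:
            count += 1
    return count


def community_taxon(idents: List[Identification], mode: str) -> Optional[str]:
    """Deepest node with >= 2 agreements and an agreement ratio above 2/3."""
    best, best_depth = None, -1
    for node in PARENT:
        agree = sum(node in lineage(i.taxon) for i in idents)
        disagree = sum(
            node not in lineage(i.taxon) and i.taxon not in lineage(node)
            for i in idents
        )
        total = agree + disagree + ancestor_disagreements(node, idents, mode)
        depth = len(lineage(node))
        if agree >= 2 and 3 * agree > 2 * total and depth > best_depth:
            best, best_depth = node, depth
    return best


IDENTIFICATIONS = [
    Identification("Coccinella septempunctata"),
    Identification("Coccinellidae", disagrees_with="Coccinella septempunctata"),
    Identification("Harmonia axyridis"),
    Identification("Harmonia axyridis"),
    Identification("Harmonia axyridis"),
]

print(community_taxon(IDENTIFICATIONS, "branch"))   # Coccinellidae
print(community_taxon(IDENTIFICATIONS, "leading"))  # Harmonia axyridis
```

Under these assumptions, Asian Lady Beetle scores 3 agreements out of 5 with the branch tally (below the ⅔ threshold) but 3 out of 4 with the leading tally, which is why the two approaches settle on different taxa.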

The site currently assumes Branch Disagreement when it calculates the Community Taxon. We tried to word the Potential Disagreement question to distinguish "not disagreeing" from "branch disagreeing" as:

To more precisely capture how the Community Taxon was being calculated this could have been worded something like:

Likewise, Ancestor disagreement identifications could have been more precisely labeled something like the following to reflect how the Community Taxon is being calculated.

Planned changes to distinguishing the two ways to disagree

While we hope the above description will help clear up much of the confusion with how iNaturalist is handling explicit ancestor disagreements, we’ve also learned that these two ways of disagreeing (branch and leading) are distinct and both useful. While "leading disagreement" is clearly the most commonly-used way to disagree, we still think that "branch disagreement" is useful, particularly in enabling the community to stop observations from becoming too finely identified beyond where the community can be certain.

We’re working on changes that would enable identifiers to indicate which way (leading or branch) they are disagreeing. The Potential Disagreement prompt will have three options:

Here the first orange button would mean a "leading disagreement" and the second would mean a "branch disagreement".

Likewise, "leading disagreement" identifications will be decorated as:

and "branch disagreement" identifications will be decorated as:

Apologies for the length of this post, but we hope it clarifies some of the confusion about how the "ancestor disagreement" functionality is currently working and planned improvements to address concerns expressed in the forum.

Posted by loarie, June 14, 2019 21:10

Comments

To be properly "balanced" should any ID above the leaf taxon, not also generate this query, whether there is a community ID or not?

So if I post an ID to - say - genus level, should I not be asked if

[a - "I don't know" - or "not disagreeing"] I cannot ID finer - I don't know, but someone else might OR
[b - "impossible to know finer" - or "branch disagreement"] - it is not possible (in my opinion) to make a finer ID than the one I am posting, and no one else can either.

(The [leading disagreement] will not apply because there is not a specific ID to disagree with.)

I am also assuming that unlike leading disagreements that will only affect observations posted before them (i.e. you cannot post a [leading disagreement] until there is a leading ID to disagree with), that [branch disagreements] should not have this restriction. This poses a problem, in that if I believe that an ID below some level is not possible, I can only post this implicit [branch disagreement] if an ID to a finer level has already been made.

However, allowing the possibility of entering a [branch disagreement] up front will create a burden on adding identifications (an extra "necessary" step), given that 99% of identifications are "this is as fine as I know, and I don't know finer" (versus "I know that a finer ID is not possible: this is as good as it gets").

I also worry that it will be misused on lower quality observations. So an out of focus shot, or a shot not showing the ventral surface might be deemed "Not identifiable below Tribe", which is not the same as "the genera in this tribe cannot be identified from morphological features" - no matter how good the observation is. The former should be a data quality assessment, and the latter a [branch disagreement]
We already see it misused in subspecies, by people who disagree with subspecies and so post a disagreement, which would certainly be a [branch disagreement] - and do so in cases where the observation is clearly a particular subspecies.

There are so many pitfalls with such an approach, I wish that these issues could rather be resolved by discussion and by persuading people to change Identifications, than hard coded. But alas, if it were working like that, there would be no need to have this approach.

I would strongly support that a [branch disagreement] should not be allowed unless 50 characters of the ID discussion ("tell us why") is completed. And if iNat designs a reputation system, then only users with an "adequate" reputation in that taxon should be allowed to post a [branch disagreement]. A [branch disagreement] is far more "serious" and requires far greater knowledge and intimacy with the taxon than being able to eyeball an ID and saying "whoa - that ID cannot be correct, but I don't have a clue what the correct ID might be".

Posted by tonyrebelo over 1 year ago

Thanks for the improvements.

Posted by gancw1 over 1 year ago

Looks good and makes sense to me, but then I am one who does not support having a pre-emptive option to block later finer IDs as shown on the right side of the andy71 diagram.


Just because I believe that the ID cannot be taken to a finer level, does not mean someone else cannot. They might have a blender that I do not have. Great work, solid improvement, handles the nature of the disagreement without being pre-emptive. I cannot imagine how hard coding this must be!

Posted by danaleeling over 1 year ago

The differentiation sounds promising. I find myself having to elevate the Community Taxon to family or order because I personally cannot identify an observation to its correct species, but I know others can. For example, I find observations of mantids that are misidentified as Brunneria borealis, but I do not know what the actual species is. Eventually, I will find the time to do the research on the distinctions between nymphs of Tenodera sinensis and other large green mantis nymphs, but until then, it is nice to have the wording changed to distinguish my “leading disagreements.”

Posted by artifore over 1 year ago

My brain exploded reading "No, I'm not disagreeing". Double negative? Triple negative? If you're not disagreeing, why did you choose a higher-level taxon?

I think it would be clearer if the wording were more along the lines of "I can't tell, but maybe someone else can".

I also am somewhat put off by the "Yes" part of "Yes, I don't think we can be certain beyond Lady Beetles". In my parsing of language, the identifier is not exactly "disagreeing" with the finer ID. That is, they are not saying that the finer ID is wrong. They are just saying that there's not enough evidence showing to be sure. I understand how some people would think of that as "disagreeing", but it seems muddier to me.

I think the choices would be clearer if they were more along the lines of the following. Note that the order of responses is different than in the proposed redesign:

Why did you choose Lady Beetles (Family Coccinellidae) instead of Seven-spotted Lady Beetle (Coccinella septempunctata)?

I am certain that this is not Seven-spotted Lady Beetle (Coccinella septempunctata)

There is not enough evidence for me to be sure that this is Seven-spotted Lady Beetle (Coccinella septempunctata)

There is not enough evidence for anyone to be sure that this is Seven-spotted Lady Beetle (Coccinella septempunctata)

Posted by sullivanribbit over 1 year ago

One more wording suggestion: maybe use the word “confident” in place of the word “certain” everywhere. “Certain” sounds possibly a little snotty, but also an often impossibly high bar.

Posted by sullivanribbit over 1 year ago

"3. There is not enough evidence for anyone to be sure that this is Seven-spotted Lady Beetle (Coccinella septempunctata)"

I am not sure that I like this.

There are two issues here, and we need to be clear about which is which.
There is a data quality issue. Based on information available I think that a finer ID is not possible (note though that other people may possess additional information, e.g. a comprehensive local species list, or a relatively unknown morphological feature that is useful, or even the plant or habitat). Surely that should go under the DQA: Data Quality Assessment, and should not be part of the ID.
A taxonomic issue. ID to a finer level is extremely unlikely. The features used to ID further are not visible ones (morphology or behaviour), and therefore further ID is not possible no matter how good the data provided. This has nothing to do with the quality of the observation: even a perfect observation would be impossible to ID. This belongs in the ID.

So 3 should read:
"3. It is not possible to ID this taxon below the level of Lady Beetles (Family Coccinellidae) no matter how good the observation."

Note that options 1 and 2 are personal assessments, but option 3 is a universal claim: only really competent or experienced identifiers should be allowed to make this call.

Note too that the option applies to all levels below the level posted by the identifier. It assumes that the identifier is capable of identifying for certain that the organism is correct to that level, and also that any ID to any rank below this is impossible. It would for instance, be incorrect to post option 3 at Family level for a species-level ID, when it is possible to ID to tribes or genera.

Option 3 must be accompanied by a justification of some sort (a link to another justified ID would be adequate).

I also have another issue with:
"3. There is not enough evidence for anyone to be sure that this is Seven-spotted Lady Beetle (Coccinella septempunctata)"
If this is strictly true, then it is impossible for the identifier to be certain that it is not Seven-spotted Lady Beetle (Coccinella septempunctata). Therefore the identifier cannot make this assertion.

The issue in case 3 is not about the ID of the observation to the taxon previously identified (the focus of options 1 and 2), but about any ID below that of the alternative taxon proposed. Case 3 does not deny the specific ID posted (7Spot Ladybeetle), it maintains that any ID is impossible below the rank specified (Family Ladybeetles) [at least, as I interpret how the agreement-disagreement system works - please correct me if I am wrong].

Posted by tonyrebelo over 1 year ago

I understand the argument about the difference between a data quality issue and a taxonomic issue, but I don't think I agree with (or perhaps just understand) the claim that the data quality issue isn't part of the ID. Let's say there's a photo of a frog whose belly isn't visible, and the only way to tell species A from species B is by looking at the belly. The observer might have called it species A, because they didn't know about the possibility of species B. I bump it up to the genus level because I do know about species B. It's a Data Quality issue because the photo isn't sufficient to distinguish, but how is that not part of the ID process?

(I strongly agree that option 3 should be accompanied by a justification. In fact, I strongly think that any less-specific ID should be accompanied by a justification.)

Posted by sullivanribbit over 1 year ago

Do I really need to justify why I am unable to ID below genus, but want to make an ID, even though a species has been posted? I am not disagreeing with the ID, just expressing my level of confidence.
I get notifications of dozens of IDs to species, where I have identified to subspecies, every day. Not one has ever had an explanation. In fact, I don't want to get a notification. If they explain then I will get a double notification.
((I must post this as a request: please don't add to my dashboard when a type 1 ID to a higher rank is posted. This must include species IDs to observations at a finer taxon rank)).

Posted by tonyrebelo over 1 year ago

Thanks for this - I think the proposed system will be a huge improvement on the current way taxon disagreements are both handled and interpreted.

Posted by fogartyf over 1 year ago

@tonyrebelo, I think the issue of getting unwanted notifications is a separate one. I am always grateful if someone explains why they disagree with my IDs, and I would guess that most people feel the same.

Posted by sullivanribbit over 1 year ago

Fine, under the assumption option 1 is not a disagreement - then I agree.

Posted by tonyrebelo over 1 year ago

Thanks for the feedback.

Regarding tonyrebelo's question about whether people should be able to 'preemptively disagree', that's an intriguing idea, but it is likely to be controversial (as pointed out by danaleeling) and may introduce unintended issues, so we're hoping to hold off on that for now until we can get the mechanics of 'reactively disagreeing' correct.

Regarding sullivanribbit's question about why add an identification that is an ancestor of other identifications if you're not disagreeing, as tonyrebelo points out some people want to do that, but also when we implied disagreement from coarser identifications, there was a lot of confusion where people who did that for whatever reason weren't aware they were disagreeing.

Regarding tonyrebelo and danaleeling's points that 'branch disagreements' should be rare and done with extreme care if condoned at all: that's a fair point. I tried to respond in the forum to similar concerns:
https://forum.inaturalist.org/t/change-wording-used-by-the-system-when-downgrading-an-observation-to-an-higher-level-taxa/3862/97?u=loarie

My forum comment also attempts to confront a can-of-worms re: an issue psium noted about the leading disagreement plans

Posted by loarie over 1 year ago

I think I might grasp the distinction between leading and branch disagreement. The last few examples really help. Unfortunately, I'm going to bet that much of this blog post and many of the diagrams are going to be very confusing to the average iNaturalist user. I like the proposed change (I think) but I can't help but think this could be better explained. Thanks for taking it on though!

Posted by andy71 over 1 year ago

This is a really great journal entry — hopefully it can be ‘nested’ in the help section on ID’s too:
https://www.inaturalist.org/pages/getting+started#identify

I get this question frequently on the weight of various ID’s, so I’d love to share this journal post regularly.

Thanks for the explanation on this, Scott.

Posted by sambiology over 1 year ago

I agree there are three cases here, but I think the proposed wording is still confusing. I also think “certain” is quite a strong word. I suggest instead:

Why are you identifying this as Family Lady Beetles (Family Coccinellidae) rather than Seven-spotted Lady Beetle (Coccinella septempunctata)?

A) I am quite sure this is NOT a Seven-spotted Lady Beetle (Coccinella septempunctata). I think it is another species in Family Lady Beetles (Family Coccinellidae).

B) I do not think it is possible to say, based on the evidence here, that this is Seven-spotted Lady Beetle (Coccinella septempunctata). I am quite sure it belongs to Family Lady Beetles (Family Coccinellidae).

C) The best ID I can currently make for this record is Family Lady Beetles (Family Coccinellidae). I can’t tell if it is a Seven-spotted Lady Beetle (Coccinella septempunctata) or not.

The “decoration” of the IDs would then be:

A) *user_b thinks this is not Seven-spotted Lady Beetle (Coccinella septempunctata)

B) *user_b doesn’t think we can be certain beyond Family Lady Beetles

C) *user_b is not certain beyond Family Lady Beetles

Posted by anasacuta about 1 year ago

I agree with @sullivanribbit, my brain totally shut down reading the proposed wording in the orange and green boxes.

Posted by pfau_tarleton 9 months ago
Posted by optilete about 2 months ago
