Proteaceae of southern Africa's Journal

Journal archives for December 2022

December 11, 2022

King Protea Genome Published

A high-quality chromosome-scale assembly of the P. cynaroides genome has been published.

Chang, J., Duong, T.A., Schoeman, C., Ma, X., Roodt, D., Barker, N., Li, Z., Van de Peer, Y. and Mizrachi, E. (2022), The genome of the King Protea, Protea cynaroides. Plant J. https://doi.org/10.1111/tpj.16044

The king protea (Protea cynaroides), an early-diverging eudicot, is the most iconic species from the Megadiverse Cape Floristic Region, and the national flower of South Africa. Perhaps best known for its iconic flower head, Protea is a key genus for the South African horticulture industry and cut-flower market. Ecologically, the genus and the family Proteaceae are important models for radiation and adaptation, particularly to soils with limited phosphorus bio-availability.
Here, we present a high-quality chromosome-scale assembly of the P. cynaroides genome as the first representative of the fynbos biome. We reveal an ancestral whole-genome duplication event that occurred in the Proteaceae around the late Cretaceous that preceded the divergence of all crown groups within the family and its extant diversity in all Southern continents. The relatively stable genome structure of P. cynaroides is invaluable for comparative studies and for unveiling paleopolyploidy in other groups, such as the distantly related sister group Ranunculales. Comparative genomics in sequenced genomes of the Proteales shows loss of key arbuscular mycorrhizal symbiosis genes likely ancestral to the family, and possibly the order.
The P. cynaroides genome empowers new research in plant diversification, horticulture and adaptation, particularly to nutrient-poor soils.

Posted on December 11, 2022 01:20 PM by tonyrebelo | 9 comments

December 24, 2022

Getting help from the AI to find Proteas in the Unidentified Backlog

You might have noticed a few older observations of proteas popping out of the woodwork. Many of these are due to the AI identifying them, thanks to the ministrations of @jeanphilippeb and his projects, of which https://www.inaturalist.org/projects/unknown-proteaceae is pertinent to us here.

You can find more of @jeanphilippeb's ideas here:
https://www.inaturalist.org/journal/jeanphilippeb/73398-draft-for-creating-projects-for-unknown-observations

So is it working?
Before we evaluate, we have to admit that we have no idea how many Proteaceae are being missed (false negatives). So we can only evaluate the positive identifications, without any notion as to which species, features or situations are being missed.

OK: so to date 822 observations have been retrieved from the forgotten or trashed pile.
I have been through them (you can too: here - https://www.inaturalist.org/observations/identify?quality_grade=casual%2Cresearch%2Cneeds_id&verifiable=any&project_id=152984&place_id=any ) and we have:

  • 137 observations are still not Proteaceae. But
  • 54 are Proteaceae but are either Multiple Observations (with different pictures of many species) or are Proteaceae with correct identifications but conflicting community identifications due to alternative incorrect identifications, or are clearly Proteaceae but some other organism is the focus of the ID (e.g. a sunbird on a pincushion). You can help resolve these by clicking here
    So approximately 14% are False Positives - identifications as Proteaceae when they were not

So 86% correct is quite something. (How many exams did you get 85% for at school?)
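For anyone wanting to make a similar link for another of these "unknown" projects: the Identify link quoted above is just a base address plus four query parameters. A minimal sketch in Python, assuming only the parameter names visible in that link (the helper name identify_url is made up for illustration and is not part of iNaturalist):

```python
from urllib.parse import urlencode

# Base of the iNaturalist Identify page, copied from the link quoted above.
IDENTIFY_BASE = "https://www.inaturalist.org/observations/identify"


def identify_url(project_id, quality_grades=("casual", "research", "needs_id")):
    """Build an Identify link for one project (hypothetical helper)."""
    params = {
        # Several grades are joined with commas; urlencode escapes them as %2C.
        "quality_grade": ",".join(quality_grades),
        "verifiable": "any",
        "project_id": project_id,
        "place_id": "any",
    }
    return f"{IDENTIFY_BASE}?{urlencode(params)}"


# 152984 is the project_id that appears in the link above.
url = identify_url(152984)
print(url)
```

Passing a shorter tuple, for example identify_url(152984, ("needs_id",)), narrows the link to a single quality grade.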

What were they?

For southern Africa, 300 observations comprised 117 species, with these dominating:
16 Leucadendron laureolum × salignum Safari Sunset and Similar Cultivars
12 Leucadendron salignum Common Sunshine Conebush
11 Brabejum stellatifolium Wild Almond
9 Protea caffra caffra Common Sugarbush
9 Leucadendron rubrum Spinning-top Conebush
8 Protea nitida Wagon Tree
7 Faurea saligna African Beechwood
7 Leucadendron laureolum Golden Conebush
7 Aulax umbellata Broadleaf Featherbush
7 Leucospermum × hybridum Pincushion Hybrids
6 Leucadendron argenteum Silvertree
6 Leucospermum cordifolium The Pincushion
6 Protea repens Common Sugarbush
5 Protea laurifolia Grey Sugarbush
5 Leucadendron xanthoconus Sickleleaf Conebush
5 Protea roupelliae Silver Sugarbush
5 Leucadendron galpinii Hairless Conebush

Note that these are among the most commonly recorded species. So is the AI only identifying the common species, while the rarer species slip through the cracks? Probably not, given that we have 117 species. But it does suggest, perhaps, that the observations were overlooked due to workload and random issues, rather than that people are having difficulties with some species and thus "ignoring" them. (Of course, the AI has not been trained on the rare species, so they may still be in the unidentified pile, but hopefully as the AI gets trained, and as more records of these species are received, they will be detected at a later date.)
Some 57 identifiers have been involved, but many of these were to "plant" so how many of these contributed anything valuable to the ultimate identification is difficult to evaluate. Remember that despite these higher IDs, these observations were not identified until they were rescued by the AI.
Some 286 (95%) are identified to species or lower, and most of the remainder require an agreement to move them to species level. So those that are proteas are easily identifiable - it is not that they are problematic observations for identification.

Outside of southern Africa, we have the complication that I don't really know the Australian species well. We also have the complication that the AI is trained on southern African Proteaceae, so won't detect the Australian species anyway. Still, we do have some species as alien invaders or as garden plants, so the AI is aware of those and will identify them.

We have 286 observations of 25 species.

  • 83 observations were only made to generic level as follows:
    65 Grevillea Grevilleas
    11 Banksia Banksias
    4 Macadamia Macadamias
    2 Stenocarpus Firewheels
    1 Telopea Waratahs

  • 203 were identified to species (by 4 identifiers)
    61 Leucadendron laureolum × salignum Safari Sunset and Similar Cultivars
    42 Leucospermum × hybridum Pincushion Hybrids
    32 Grevillea robusta Silky Oak
    15 Protea × hybrida Sugarbush Hybrids
    11 Leucospermum cordifolium × patersonii High Gold and Derived Cultivars
    7 Protea cynaroides King Protea
    4 Leucadendron argenteum Silvertree
    4 Hakea drupacea Sweet Needlebush
    4 Leucadendron discolor × gandogeri Cloudbank Jenny
    4 Leucospermum lineare × reflexum Brandi Dela Cruz
    4 Leucadendron × hybridum Conebush Hybrids
    3 Leucadendron laureolum × strobilinum Goldstrike
    2 Banksia ericifolia Heath-leaved Banksia
    1 each of Stenocarpus sinuatus Firewheel Tree, Adenanthos sericeus Woolly Bush, Embothrium coccineum Chilean Fire Bush, Leucospermum cordifolium The Pincushion, Protea laurifolia Grey Sugarbush, Serruria florida Blushing Bride, Aulax umbellata Broadleaf Featherbush, Leucadendron galpinii Hairless Conebush, Leucadendron eucalyptifolium Gumleaf Conebush, Leucospermum mundii Langeberg Pincushion
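The two lists above can be checked against the stated totals, and the share of hybrids worked out, with a few lines of Python. This is only a sketch: the counts are copied from the lists, the ten single records are lumped together, and any name containing "×" is treated as a hybrid or cultivar, which is my own assumption for the tally.

```python
# Counts copied from the two lists above.
genus_only = {
    "Grevillea": 65,
    "Banksia": 11,
    "Macadamia": 4,
    "Stenocarpus": 2,
    "Telopea": 1,
}

species_level = {
    "Leucadendron laureolum × salignum": 61,
    "Leucospermum × hybridum": 42,
    "Grevillea robusta": 32,
    "Protea × hybrida": 15,
    "Leucospermum cordifolium × patersonii": 11,
    "Protea cynaroides": 7,
    "Leucadendron argenteum": 4,
    "Hakea drupacea": 4,
    "Leucadendron discolor × gandogeri": 4,
    "Leucospermum lineare × reflexum": 4,
    "Leucadendron × hybridum": 4,
    "Leucadendron laureolum × strobilinum": 3,
    "Banksia ericifolia": 2,
}
SINGLE_RECORDS = 10  # the ten taxa listed as "1 each"; none of them is a hybrid

genus_total = sum(genus_only.values())
species_total = sum(species_level.values()) + SINGLE_RECORDS
grand_total = genus_total + species_total

# Assumption: a "×" in the name marks a hybrid or cultivar group.
hybrid_total = sum(n for name, n in species_level.items() if "×" in name)
hybrid_share = 100 * hybrid_total / species_total

print(f"genus only: {genus_total}, to species: {species_total}, all: {grand_total}")
print(f"hybrids: {hybrid_total} of {species_total} ({hybrid_share:.0f}%)")
```

The sums come to 83, 203 and 286, matching the figures given in the text.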

As above, this more or less mirrors the abundance of species recorded so far on iNaturalist, so there are no real surprises here. What is important to remember is that the AI is not trained on hybrids, so it is detecting the hybrids "in error" for other species in the family. Note how many hybrids feature near the top!
Note also that Grevilleas and Banksias are very popular.

253 of these were from the USA (235, or 82% of the 286, from California), 9 from Europe, 8 from South America and 0 from Australia.

Note that the paucity of identifiers is interesting. Proteas feature prominently in gardens, and there are hybridization schemes producing new cultivars in Hawaii (and California?), so it is surprising that there are so few identifiers. On the other hand, perhaps horticulturalists do not use iNaturalist and therefore won't be aware of the ID gap.

So all in all, a great fishing expedition! The AI tool is certainly most useful in pulling lost observations from oblivion, and I can see it becoming an essential and eventually a standard tool for assisting with identifications.

It is worth noting that 361 (62% of the 586 Proteaceae) are marked casual. It is thus not entirely unexpected that these were not identified, as they are not in the Needs ID queue and thus easily overlooked.
Some 162 are Needs ID (28%) and 38 Research Grade (6%). You can help with getting observations to Research Grade here
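The percentages above are simply each count over the 586 Proteaceae, rounded to whole numbers. A minimal sketch of that arithmetic in Python, using only the counts quoted here (the variable names are mine):

```python
TOTAL_PROTEACEAE = 586  # 300 from southern Africa + 286 from elsewhere

# Counts per quality grade, as quoted in the text.
grades = {"casual": 361, "needs_id": 162, "research": 38}

# Share of each grade, rounded to a whole percent.
shares = {grade: round(100 * n / TOTAL_PROTEACEAE) for grade, n in grades.items()}

# Observations not covered by the three counts quoted in the text.
unaccounted = TOTAL_PROTEACEAE - sum(grades.values())

for grade, pct in shares.items():
    print(f"{grade}: {grades[grade]} ({pct}%)")
print(f"not broken down here: {unaccounted}")
```

The three counts add up to 561, so 25 of the 586 are not covered by this breakdown.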

Posted on December 24, 2022 10:46 PM by tonyrebelo | 2 comments