- Feb 2019
-
paleorxiv.org
-
Visualisation of homology and convergence as trajectories in morphospace through time.
How likely is it that two different lineages evolve the same morphology by chance? Traits and trait complexes can evolve convergently, but entire phenotypes cannot. All (pseudo)cryptic species in higher (complex) organisms have been found within members of the same lineage (i.e. they are the result of parallel evolution, homoiologies).
One would want to stress the difference between a phenotype itself and the individual characters scored to represent it.
-
omoiology as illustrated by Hennig
This illustrates a basic, overlooked problem with the proposed approach: any reconstruction would infer a' as a synapomorphy of the right clade.
-
d (B,C) choose between trees, as implemented in the function homoiologies
The graphical example is not a good one. The function would reconstruct "0" as the ancestral state for the "0"-containing subtree in B and hence would not flag it as a potential homoiology but dismiss it as a symplesiomorphy (in contrast to the real-world example in A). It would be clearer if just two tips showed "1" in B and C (we often associate "0" with ancestral and "1" with derived).
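To make the point concrete, here is a minimal sketch of a Fitch parsimony downpass on a toy tree. The tree, the tip names and the function name `fitch` are hypothetical and are not taken from the preprint or its R function; the sketch only shows why a subtree whose tips all carry "0" gets "0" as its reconstructed ancestral state.

```python
def fitch(node, states):
    """Fitch downpass on a nested-tuple binary tree.

    Returns (state_set, steps) for the node: the set of most parsimonious
    states at that node and the number of changes required below it.
    """
    if isinstance(node, str):  # a tip: its observed state, no changes
        return {states[node]}, 0
    left, right = node
    left_set, left_steps = fitch(left, states)
    right_set, right_steps = fitch(right, states)
    shared = left_set & right_set
    if shared:  # children agree: keep the intersection
        return shared, left_steps + right_steps
    # children disagree: take the union and count one change
    return left_set | right_set, left_steps + right_steps + 1


# Hypothetical example: a three-tip subtree scored "0", sister to two tips scored "1".
states = {"t1": "0", "t2": "0", "t3": "0", "t4": "1", "t5": "1"}
subtree = (("t1", "t2"), "t3")
whole_tree = (subtree, ("t4", "t5"))

subtree_set, subtree_steps = fitch(subtree, states)
root_set, root_steps = fitch(whole_tree, states)
print(sorted(subtree_set), subtree_steps)  # ['0'] 0
print(sorted(root_set), root_steps)        # ['0', '1'] 1
```

With "0" reconstructed at the base of the subtree, the shared "0" can only be read as a retained (symplesiomorphic) state there, not as a state gained in parallel.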
-
that tree C should be preferr
Minor error: this should read "tree B should be preferred" (which makes more sense and is what the text says).
-
y, these occurrences – which arguably can be considered the norm rather than the exception in related taxa – may provide useful evidence of relatednes
This is very true.
And one more reason why palaeontologists especially should stop ignoring distance-based networks (following the Farris'ian Dogma that "distance = phenetic", but see Felsenstein, 2004, Inferring Phylogenies) as a tool to explore the non-trivial signal in their data sets. Some application examples are posted at the Genealogical World of Phylogenetic Networks; see also Denk & Grimm, Rev. Pal. Pal. 2009; Bomfleur et al., BMC Evol. Biol. 2015; Bomfleur et al., PeerJ 2017. Even in the absence of reticulation, evolving morphologies do not provide tree-like signal, because synapomorphies (characters fully compatible with the true tree) are rare, homoiologies common, and convergences (characters incompatible with the true tree) inevitable.
The less tree-like the signal and the more the individual probabilities for change differ, the more misleading or ambiguous the parsimony tree reconstruction will be. Neighbour-nets may appear to be crude tools, but they are quick to infer and designed to handle data incompatibility. Consensus networks are, in every respect, more informative than a strict or majority-rule consensus tree.
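What "data incompatibility" means can be sketched with the classic pairwise compatibility criterion for binary characters: two characters can fit the same tree without homoplasy only if at most three of the four state combinations (00, 01, 10, 11) occur among the taxa. The toy matrix and function names below are hypothetical, and the sketch assumes complete binary data (no missing or polymorphic entries).

```python
from itertools import combinations


def compatible(col_a, col_b):
    """True if two binary characters show at most three of the four state pairs."""
    return len(set(zip(col_a, col_b))) < 4


def incompatible_pairs(matrix):
    """Return index pairs of characters that cannot both fit one tree without homoplasy."""
    taxa = sorted(matrix)
    n_chars = len(matrix[taxa[0]])
    # Turn the taxon-by-character matrix into one list of states per character.
    columns = [[matrix[taxon][i] for taxon in taxa] for i in range(n_chars)]
    return [
        (i, j)
        for i, j in combinations(range(n_chars), 2)
        if not compatible(columns[i], columns[j])
    ]


# Hypothetical matrix: four taxa, three binary characters.
toy_matrix = {"A": "110", "B": "100", "C": "011", "D": "001"}
print(incompatible_pairs(toy_matrix))  # [(0, 1), (1, 2)]
```

Characters 0 and 2 support the same split (A+B versus C+D), whereas character 1 conflicts with both. A tree has to discard one signal; a splits graph can show both.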
Instead of trying to decide between equally and inevitably biased trees, we can just explore our data using networks. See pic, depicting all potential synapomorphies (bold), symplesiomorphies (italics), and homoiologies that can be inferred directly from the crocodilian morphomatrix. Naturally, these include pseudo-synapomorphies (red) when compared with the provided molecular tree.
PS: That the way out of the dilemma is to embrace networks was realised very early in evolutionary sciences (long before Hennig and Farris).
-
n they will share similar genes, but it is the phenotype – upon which selection acts – which is crucia
There are two important things to note.
If the same genetic programme leads to two phenotypes because of the environment, this falls in the category of epigenetics. Epigenetic processes are usually not tree-like, hence, poorly modelled by inferring a tree.
You implicitly assume (via your R-script) that homoiologies (in a strict sense, i.e. parallelism) are rare and not beneficial (neutral). But if the homoiology is beneficial (i.e. positively selected for), it will be much more common in a clade of close relatives than the primitive phenotype (the symplesiomorphy). We can further assume that beneficial homoiologies will accumulate in the most-derived, advanced, specialised taxa, in the worst case (from the mainstream cladistic viewpoint) mimicking or even outcompeting synapomorphies. A simple thought example: let's say we have a monophylum (fide Hennig) with two sublineages, each sublineage defined by a single synapomorphy. Both sublineages radiate and, in parallel, invade a new niche (geographically separated from each other) and fix (evolve) a set of homoiologies in adaptation to that new niche. The members of both sublineages with the homoiologies will be resolved as one clade, a pseudo-monophylum, supported by the homoiologies as pseudo-synapomorphies. And the actual synapomorphies will be resolved as plesiomorphies or autapomorphies.
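The thought example can be checked with a few lines of code. The following is a minimal sketch under stated assumptions: five hypothetical taxa (outgroup O, sublineage X with X1 and X2, sublineage Y with Y1 and Y2), one synapomorphy per sublineage, and three homoiologies shared by the niche invaders X2 and Y2. Tree length is counted with a plain Fitch downpass; all names are illustrative.

```python
def fitch(node, states):
    """Fitch downpass: return (state_set, steps) for a nested-tuple binary tree."""
    if isinstance(node, str):
        return {states[node]}, 0
    left_set, left_steps = fitch(node[0], states)
    right_set, right_steps = fitch(node[1], states)
    shared = left_set & right_set
    if shared:
        return shared, left_steps + right_steps
    return left_set | right_set, left_steps + right_steps + 1


def tree_length(tree, matrix):
    """Sum the Fitch steps over all characters of a taxon -> state-string matrix."""
    n_chars = len(next(iter(matrix.values())))
    total = 0
    for i in range(n_chars):
        states = {taxon: row[i] for taxon, row in matrix.items()}
        total += fitch(tree, states)[1]
    return total


# Characters: 1 = synapomorphy of X, 2 = synapomorphy of Y,
# 3-5 = homoiologies fixed in parallel by X2 and Y2 in the new niche.
matrix = {
    "O":  "00000",
    "X1": "10000",
    "X2": "10111",
    "Y1": "01000",
    "Y2": "01111",
}
true_tree = ("O", (("X1", "X2"), ("Y1", "Y2")))
pseudo_tree = ("O", (("X2", "Y2"), ("X1", "Y1")))  # niche invaders grouped together

print(tree_length(true_tree, matrix))    # 8
print(tree_length(pseudo_tree, matrix))  # 7
```

In this toy setting the true tree costs 2 + 2n steps and the pseudo-monophylum tree 4 + n steps for n homoiologies, so parsimony prefers the wrong tree as soon as the homoiologies outnumber the two real synapomorphies.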
Without molecular or ontological-physiological control, it will be impossible to decide what is derived (hence a potential homoiology) and what is ancestral in a group of organisms sharing a relatively recent common origin and a still similar genetic programme. Sometimes this is impossible even with molecular data: many molecular trees are based on plastid data in plants and mitochondrial data in animals, and both are maternally inherited, hence geographically controlled.
-
g likelihood or Bayesian probabilistic phylogene
If you have a molecular data partition, you can just use a total evidence approach and the standard 1-parameter Markov model.
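For readers less familiar with it, the 1-parameter Markov (Mk) model treats all changes between the k states of a character as equally probable. A minimal sketch of its transition probabilities follows, assuming the branch length t is measured in expected changes per character; the function name `mk_probabilities` is hypothetical and not part of any package.

```python
import math


def mk_probabilities(k, t):
    """Mk model: return (P(same state), P(one specific other state)) after branch length t.

    Assumes k equally likely states and t in expected changes per character.
    """
    decay = math.exp(-k * t / (k - 1))
    p_same = 1.0 / k + (k - 1) / k * decay
    p_diff = 1.0 / k - decay / k
    return p_same, p_diff


# A binary character on a short and on a long branch.
for branch_length in (0.05, 2.0):
    p_same, p_diff = mk_probabilities(2, branch_length)
    print(branch_length, round(p_same, 3), round(p_diff, 3))
```

On short branches a character almost certainly keeps its state, while on long branches both probabilities approach 1/k. In a total evidence analysis the molecular partition largely fixes the branch lengths against which the morphological characters are then evaluated.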
Potential synapomorphies will be compatible with the molecular tree and considered not likely to change. Potential homoiologies and symplesiomorphies are partly ("semi-")compatible with the molecular tree and, hence, considered less likely to change than highly homoplastic traits with (random) convergence.
Just try out a couple of datasets: infer the (Bio)NJ and ML trees, then compare the results with the strict consensus network (not tree) of all equally parsimonious trees and with the Bayesian tree sample.
Note that if you apply TNT's iterative character weighting procedure, what you effectively do is sort the random convergences from the parallelisms, i.e. characters that are more compatible with the preferred tree.
-
set; if this is higher, the tree can be considered to fit the data less well
To test the fit between data and more than one alternative tree, you can just do a bootstrap analysis, and map the results on a neighbour-net splits graph based on the same data.
Note that the phangorn library includes functions to transfer information between trees or tree samples, and between trees and networks:
Schliep K, Potts AJ, Morrison DA, Grimm GW. 2017. Intertwining phylogenetic trees and networks. Methods in Ecology and Evolution. DOI:10.1111/2041-210X.12760 (http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12760/full). The basic functions and script templates are provided in the associated vignette.
-
number of character states bracketed by a node could be counted, and those which do not optimize as symplesiomorphies of the clade could considered as a value of node suppo
Why differentiate between symplesiomorphies and homoiologies? Both are traits equally exclusive to a subtree (a clade) and closely linked to each other.
Pink is a symplesiomorphy of the ingroup, blue the homoiology.
Vice versa: blue is the symplesiomorphy, pink the homoiology found only in the most derived (in absolute evolutionary terms) taxa.
-
n inference of phylogeny by parsimony, an occurrence of a character state in a part of a tree separated from it by another state is considered simply a homoplasy, and a tree where the occurrences are nearer or further from one another is not more or less parsimoniou
This is why one should not rely exclusively on parsimony to infer phylogenetic relationships.
For distance methods it will (often) make a difference, as it will for probabilistic inferences (ML or Bayes).
-
In principle, I do sympathize with the general idea, but the laid out approach will have little use.
The main drawback is that you can only define homoiologies using an external data set (e.g. the molecular "gold" tree). But when you have a reliable molecular tree, you can just go for total evidence approaches to select the alternative that is more likely, in both a mathematical and a general sense, without the need to make any prior distinction between your characters. Homoiologies will be inferred on the fly, like synapomorphies, symplesiomorphies or shared apomorphies (non-stochastically distributed convergences).
If you define the homoiologies on an inferred (e.g. parsimony) tree based only on a morphological data matrix (e.g. for an extinct group of organisms), you will inevitably misinterpret some characters, because your clades are not necessarily monophyletic. Homoiologies, like symplesiomorphies, may appear as (pseudo-)synapomorphies.
The only application left would be the case where the molecular tree cannot resolve certain relationships and we use the more tree-compatible morphological characters to discern between alternatives. However, the first choice would then be to maximise the number of synapomorphies. Only if that number is the same for all alternatives could one count the number of symplesiomorphies and homoiologies together (the distinction between the two via a tree inference is very tricky, and they are often just two sides of the same evolutionary process).
However, one could also just directly change to a network-analysis framework, which will pretty much solve all these problems at once.
For further details, see my (upcoming, March 4th) post at the Genealogical World of Phylogenetic Networks.
Tags
- treelikeness
- symplesiomorphy
- phylogenetic networks
- homoiology; compatibility methods; total evidence
- phenotypic evolution
- data compatibility
- homoiology
- EDA
- probabilistic methods
- morphology
- topological ambiguity
- exploratory data analysis
- implicit weighting
- character compatibility
- molecular evolution
- expression probability
- graphics