Hypothesis

219 Matching Annotations

Dec 2022
www.biorxiv.org

Multiple paths towards repeated phenotypic evolution in the spiny-leg adaptive radiation (Tetragnatha; Hawaii)

1
1. austinhpatton 10 Dec 2022
  
  in Arcadia Science - Private
  
  Phyluce allows extraction of UCEs, but because of the missing data due to low coverage and fragmented genomes, we were only able to retrieve 29 UCEs that were present in half of the dataset (16 of the UCEs were present in all individuals; and 23 UCEs present in ≥70 individuals).
  
  I may have missed it, but I don't believe the methods mention what software was used for phylogenetic inference from the recovered UCEs. I would guess IQ-Tree, the same as for the mitochondrial gene? It might be worth clarifying here.
  
  Relatedly, was a single concatenated phylogeny used, or did you infer gene trees as well? IQ-Tree can easily infer both (gene trees and a concatenated species tree - see here: http://www.iqtree.org/doc/Concordance-Factor). This would also enable you to infer multiple concordance factors (e.g. gene and site), which could help to summarize genealogical concordance, and potentially even quantify per-site discordance using your concatenated SNP dataset.
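
  To make the idea of a gene concordance factor concrete, here is a rough sketch of the quantity in its simplest form: the percentage of gene trees that contain a given branch of the species tree. It assumes rooted gene trees with complete taxon sampling, each represented as a set of clades; the taxon names and toy trees are hypothetical, and IQ-Tree's own calculation additionally handles missing taxa and unrooted trees.

```python
def gene_concordance_factor(branch_clade, gene_tree_clades):
    """Percent of gene trees that contain `branch_clade` as a clade.

    Simplified: assumes every gene tree is rooted and samples every taxon.
    """
    branch = frozenset(branch_clade)
    hits = sum(1 for clades in gene_tree_clades if branch in clades)
    return 100.0 * hits / len(gene_tree_clades)


# Hypothetical gene trees for taxa A, B, C (rooted on an outgroup),
# each written as the set of clades it contains.
gene_trees = [
    {frozenset({"A", "B"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "B"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "C"}), frozenset({"A", "B", "C"})},
    {frozenset({"B", "C"}), frozenset({"A", "B", "C"})},
]

gcf_ab = gene_concordance_factor({"A", "B"}, gene_trees)
gcf_abc = gene_concordance_factor({"A", "B", "C"}, gene_trees)
print(f"gCF(A+B) = {gcf_ab:.1f}%, gCF(A+B+C) = {gcf_abc:.1f}%")
```

  A branch recovered in the concatenated tree but supported by only half of the gene trees, as for A+B here, is the kind of discordance these factors are meant to flag.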

Annotators

austinhpatton

URL

biorxiv.org/content/10.1101/2022.11.29.518358v1
Nov 2022
www.biorxiv.org

Molecular early burst associated with the diversification of birds at the K–Pg boundary

8
1. austinhpatton 04 Nov 2022
  
  in Arcadia Science - Private
  
  we have limited intuition about how the mode of molecular evolution may relate to life-history variation.
  
  I'm probably sounding like a broken record now (a good thing - this is super interesting to me, and it's exciting to be nearing a place where we can address this!), but I think analysis of alternative causal models here with phylogenetic path analysis would be an awesome way towards not only generating this intuition, but also explicitly testing these hypotheses!
2. austinhpatton 04 Nov 2022
  
  in Arcadia Science - Private
  
  The observation of a burst of molecular and quantitative trait model shifts within a ∼5 Ma interval of the K–Pg boundary (Figure 1) is likely not a coincidence
  
  That's one possibility - though it might be worth addressing/discussing the alternative hypothesis - that the pattern of mass (and presumably non-random with respect to body size) extinction and subsequent radiation (i.e. the early burst) could lead to surviving lineages exhibiting greater disparity in rates of molecular evolution and other life history traits.
  
  As I alluded to in my comment on figure 1, it might be worth (if feasible) explicitly assessing the probability of observing such a concentrated distribution of shifts near the time of a mass extinction event by simulating under a constant rate model of nucleotide substitution, and a mass extinction/early burst of speciation/turnover.
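
  As a rough sketch of how the simulation results could be summarized: count the inferred shifts that fall within a window around the extinction event, then compare that count to the same statistic from datasets simulated under constant substitution rates. All ages, the window width, and the simulated counts below are hypothetical placeholders, not values from the paper.

```python
def count_shifts_in_window(shift_ages, center, half_width):
    """Number of shifts whose age lies within `half_width` of `center` (in Ma)."""
    return sum(1 for age in shift_ages if abs(age - center) <= half_width)


def empirical_p_value(observed, simulated):
    """One-sided Monte Carlo p-value with the usual +1 correction."""
    exceed = sum(1 for value in simulated if value >= observed)
    return (exceed + 1) / (len(simulated) + 1)


# Hypothetical inferred shift ages (Ma) and a window around 66 Ma.
observed_ages = [66.5, 65.0, 64.2, 30.0]
observed_count = count_shifts_in_window(observed_ages, center=66.0, half_width=2.5)

# Hypothetical counts of (false-positive) shifts in the same window, one per
# dataset simulated under constant rates plus mass extinction/early burst.
simulated_counts = [0, 1, 0, 2, 1, 0, 3, 1, 0]

p_value = empirical_p_value(observed_count, simulated_counts)
print(observed_count, p_value)
```

  In practice many more simulated datasets would be needed for the p-value to have useful resolution.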
3. austinhpatton 04 Nov 2022
  
  in Arcadia Science - Private
  
  shifts toward weaker scaling relationships between metabolic rate and body mass (e.g., in Neoaves; Figure 3).
  
  Would be exciting to explicitly test this!
4. austinhpatton 04 Nov 2022
  
  in Arcadia Science - Private
  
  we find evidence that shifts in the mode of genomic evolution were very likely concurrent with shifts in the evolutionary optima θ(t) of important avian life-history traits (Figure 2), as well as shifts in metabolic allometric slope βmass and intercept β.
  
  Super cool - this is partly where/why the use of phylogenetic path analysis may be particularly informative - could these shifts in evolutionary optima have been associated with a shift in causal relationships among biological/life history features? Maybe it's possible to infer, at least in part, the causal mechanisms behind these shifts in metabolic allometry?
5. austinhpatton 04 Nov 2022
  
  in Arcadia Science - Private
  
  Overall, these patterns support the hypothesis that molecular model shifts are indicative of evolutionary shifts in life-history θ(t).
  
  I'm not necessarily suggesting you do this, since this paper already represents an impressive effort, but I wonder whether the use of phylogenetic path analysis (https://doi.org/10.7717%2Fpeerj.4718) might be informative here?
  
  Within each inferred molecular model shift, you could test alternative causal models depicting the relationships between rates of molecular evolution and these natural history traits, as you might anticipate that many of these covary, themselves being causally related to one another. Inference of shifts in causal relationships among natural history traits in each clade that experienced shifts in rates of molecular evolution could then provide some intuition as to why!
6. austinhpatton 04 Nov 2022
  
  in Arcadia Science - Private
  
  Figure 1.
  
  How feasible would it be to infer these deviations from equilibrium base frequencies from simulated data? For instance, simulating trees/sequences under constant substitution rates (e.g. using AliSim - https://doi.org/10.1093/molbev/msac092) but under a model of mass (selective) extinction and subsequent early burst of diversification (e.g. TreeSim - https://doi.org/10.1093/sysbio/syr029)?
  
  I ask because I wonder whether simulating under the hypothesized diversification model and constant rates of molecular evolution could provide an estimate of the false-positive rate with which Janus might infer these shifts and whether false positives tend to be concentrated around the time of the simulated mass extinction?
  
  I realize this is probably a big ask though - the paper is already super impressive! I think this analysis could just be particularly compelling evidence if in fact this concentration of events is greater than expected.
7. austinhpatton 04 Nov 2022
  
  in Arcadia Science - Private
  
  Fifteen molecular model shifts are identified on total-clades estimated to have originated near the K–Pg boundary [43-46] (Figure 1, Supplementary Figure 1, 7a-d).
  
  I'm curious - If you were to simulate sequences/trees under a model of mass-extinction and early burst but constant substitution rates, how often (if at all!) would Janus infer substitution rate shifts near the extinction event?
  
  Could the extinction-driven bottleneck and heightened species turnover in the subsequent radiation amplify apparent differences in nucleotide composition even with molecular rates remaining constant? If so, could it be that alternative diversification dynamics/models impact the inference of substitution rate shifts?
8. austinhpatton 04 Nov 2022
  
  in Arcadia Science - Private
  
  Our analyses reveal well-supported shifts in estimated equilibrium base frequencies across exons, introns, untranslated regions (UTRs), and mitochondrial genomes. Remarkably, model shifts are mostly constrained to previously hypothesized clade originations associated with the K–Pg boundary.
  
  Is there any chance that this pattern could, at least in part, be attributed to the relative paucity of branches in the tree prior to the K-Pg boundary?
  
  In other words, could the identification of molecular rate shifts by Janus be impacted by "sample size" differences of branches prior to and following mass extinctions as in this example?
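
  One simple way to frame this, sketched below under strong assumptions: if shifts were equally likely anywhere along the tree, the chance that any one shift falls near the boundary would equal the fraction of total branch length lying in that window. A binomial tail then gives the probability of seeing at least the observed number of shifts there. The counts and the branch-length fraction are hypothetical, and this ignores any dependence of Janus's power on clade size.

```python
from math import comb


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical inputs: 4 shifts in total, 3 of them in the window, and 25% of
# the tree's total branch length falling inside that window.
n_shifts = 4
n_in_window = 3
branch_length_fraction = 0.25

p_concentration = binomial_tail(n_shifts, n_in_window, branch_length_fraction)
print(f"P(at least {n_in_window} of {n_shifts} shifts in window) = {p_concentration:.4f}")
```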

Annotators

austinhpatton

URL

biorxiv.org/content/10.1101/2022.10.21.513146v1
Sep 2022
www.biorxiv.org

Evolutionary dynamics of genome size and content during the adaptive radiation of Heliconiini butterflies

10
1. austinhpatton 06 Sep 2022
  
  in Arcadia Science - Private
  
  (A, B) The contribution of TEs and CDS to genome size variation across Heliconiinae, respectively.
  
  These two relationships should probably be assessed in a phylogenetic context, accounting for shared evolutionary history and non-independence among species (e.g. using phylogenetic generalized least-squares regression, PGLS). There is clear phylogenetic structuring in these two plots. For instance, Dione/Agraulis appear to have lower TE content for their genome sizes (shallower slope and intercept), whereas Erato seems to accumulate TEs more rapidly for their corresponding genome size as compared to other clades. Raw points and the original regression slope certainly provide taxonomic intuition, but to summarize the relationship across the entire clade, a regression slope as inferred by PGLS (for instance) should be reported.
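
  For intuition, a minimal sketch of what PGLS estimates: ordinary regression coefficients, but weighted by the covariance among species expected from the tree (here assuming Brownian motion, so covariance equals shared branch length). The four-species tree, trait values, and function names are all hypothetical; a real analysis would use an established R package rather than this hand-rolled solver.

```python
def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination (b is a list of rows)."""
    n = len(a)
    m = [list(map(float, a[i])) + list(map(float, b[i])) for i in range(n)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        scale = m[col][col]
        m[col] = [v / scale for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def pgls(x, y, cov):
    """GLS intercept and slope: beta = (X' C^-1 X)^-1 X' C^-1 y."""
    n = len(x)
    design = [[1.0, xi] for xi in x]
    # Columns 0-1 hold C^-1 X, column 2 holds C^-1 y.
    weighted = solve(cov, [design[i] + [y[i]] for i in range(n)])
    xtx = [[sum(design[i][r] * weighted[i][c] for i in range(n)) for c in range(2)]
           for r in range(2)]
    xty = [[sum(design[i][r] * weighted[i][2] for i in range(n))] for r in range(2)]
    intercept, slope = (row[0] for row in solve(xtx, xty))
    return intercept, slope


# Brownian-motion covariance for the hypothetical tree ((A:1,B:1):1,(C:1,D:1):1):
# diagonal = root-to-tip distance, off-diagonal = shared branch length.
cov = [[2, 1, 0, 0],
       [1, 2, 0, 0],
       [0, 0, 2, 1],
       [0, 0, 1, 2]]
te_content = [1.0, 2.0, 3.0, 4.0]   # hypothetical predictor
genome_size = [3.0, 5.0, 7.0, 9.0]  # hypothetical response (exactly 2x + 1)

intercept, slope = pgls(te_content, genome_size, cov)
print(intercept, slope)
```

  With noisy data the PGLS slope will generally differ from the ordinary least-squares slope, which is the point of reporting it.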
2. austinhpatton 06 Sep 2022
  
  in Arcadia Science - Private
  
  We illustrate how dense genomic sampling improves our resolution of gene-phenotype links, and our understanding of how genomes evolve.
  
  Really exciting work, and an immense amount of work - congratulations! This was a fun read!
  
  It seems that towards the conclusion of the paper, you outline how your data does not appear to be consistent with one of the prevailing hypothesized mechanisms for the digestion of pollen in Heliconius. I can't help but wonder if, using the data you present here, you might be able to point towards individual gene-families/orthogroups that are consistent with the evolution of this key-innovation?
3. austinhpatton 06 Sep 2022
  
  in Arcadia Science - Private
  
  bar plots show total gene counts partitioned according to their orthology profiles, from Nymphalids to lineage-restricted and clade-specific genes.
  
  What does the distribution of mean per-species gene-count look like across all orthologs? Is there a small handful of very large orthogroups that are observed in all species and comprise the majority of the gene-counts shown here?
4. austinhpatton 06 Sep 2022
  
  in Arcadia Science - Private
  
  From each OG in-paralogs were removed (custom python script RemoveInParalogFromTree.py available at https://francicco@bitbucket.org/ebablab/custum-scripts.git). If the procedure generated a single-copy OG (nscOGs) it was analysed by contrasting evolutionary pressures between Eueides and Heliconius species. The signature of selection (aBSREL) and relaxation (RELAX (114)) were performed as implemented in HyPhy.
  
  Could removal of in-paralogs influence/impact the signatures of selection that you recover using aBSREL and RELAX? It might be worth assessing whether these results are robust to the inclusion of in-paralogs.
  
  For instance, neofunctionalization of in-paralogs following duplication is likely to manifest signatures of positive selection (Fernández et al., 2020: https://doi.org/10.1093/molbev/msaa110, Wang et al., 2015: https://doi.org/10.1007/s11103-015-0285-2). Alternatively, pseudogenization of in-paralogs is expected to lead to a relaxation of selection on the duplicated gene-copy (e.g. Emerling 2018: https://doi.org/10.1016/j.ympev.2017.09.016; Calderoni et al., 2016: https://doi.org/10.1038/hdy.2016.59).
5. austinhpatton 06 Sep 2022
  
  in Arcadia Science - Private
  
  All sequences from each of the four clades were realigned separately and a number of tests implemented in HyPhy (overall ω; SLAC; aBSREL; RELAX). In particular, the sign of diversifying positive selection (aBSREL) was detected by scanning all internal branches of the whole cocoonases phylogeny, correcting for multiple tests using a final P-value threshold of 0.05.
  
  If I'm reading this correctly, it seems that for each analysis, all branches were tested for signatures of selection (with the exception of SLAC, which assumes selection is constant across the entire tree)? Although RELAX and aBSREL can be used to test each branch, each is significantly more powerful when used in a hypothesis-testing framework, assigning a subset of branches to the "background" (control) and the remainder to the "foreground" (treatment), and estimating parameters within each test-set. The per-branch tests are generally considered to be most useful in exploratory use-cases.
  
  I think this use-case scenario is ideally suited to the hypothesis testing framework of these two models. Specifically, to test the hypothesis that selection on cocoonases intensified (or was relaxed) in H. aoede as compared to the rest of the Heliconius butterflies, you could treat H. aoede as the foreground, and the rest of Heliconius as the background for both RELAX and aBSREL. It's unclear whether cocoonases in Eueides or other species within Heliconiini should be experiencing the same/similar selection regime as in H. aoede, so I would suggest excluding those species from this specific analysis.
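
  A small sketch of the bookkeeping this would involve: tagging the foreground tips in the Newick tree before running the tests. This assumes the convention, as I understand it, that HyPhy reads branch sets from `{label}` tags appended to node names; the helper name, the toy topology, and the placeholder tip names `H_sp1`/`H_sp2` are hypothetical.

```python
import re


def label_tips(newick, tips, label="Foreground"):
    """Append `{label}` to each named tip in a Newick string."""
    for tip in tips:
        # Match the tip name only when it is a whole token: preceded by
        # '(' or ',' and followed by ':', ',' or ')'.
        pattern = r"(?<=[(,])" + re.escape(tip) + r"(?=[:,)])"
        newick = re.sub(pattern, lambda m: m.group(0) + "{" + label + "}", newick)
    return newick


# Hypothetical toy tree; Eueides and other Heliconiini are left out, as suggested.
tree = "((H_aoede:1,H_sp1:1):1,H_sp2:2);"
labeled = label_tips(tree, ["H_aoede"])
print(labeled)
```

  Unlabeled branches would then serve as the background set.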
6. austinhpatton 06 Sep 2022
  
  in Arcadia Science - Private
  
  We next explored the relationship between transposable elements (TEs) and genome size, and their effect on gene architecture. We found that larger genomes tend to have a distribution of intron length skewed towards longer introns (fig. S16a), with a positive correlation between median intron length and total TE content (fig. S22; Pearson’s ρ=0.72; R2=0.51). Long introns also accumulate significantly more TEs then expected by their size (fig. S23; Wilcoxon rank-sum test P value = 2.13×10-13), with the effect of changing gene structure more than gene density (fig. S22 and fig. S24).
  
  It's difficult to contextualize these results without access to the supplementary materials (at the time I'm reading this), but it's unclear to me why the choice was made to subset introns into those that are short and long (as in Fig. 4A). I'm particularly wondering about the choice to divide introns as short/long at the median, which doesn't appear to sort them according to a natural break in the length-distribution. Dividing introns in this way, and then summing for each species and each 'TE size-class', risks exaggerating the observed difference in intron length and TE content.
  
  Additionally, the relationship between TE content and total intron length (regardless of how these traits are defined) should be analyzed in a phylogenetic context (e.g. using phylogenetic generalized least-squares regression, PGLS).
7. austinhpatton 06 Sep 2022
  
  in Arcadia Science - Private
  
  Hemocyanins are also among several GFs with evidence of divergent selection regimes (ω) between Eueides and Heliconius, alongside Trypsins, Protein kinases, P450s, Sugar transporters, Ion and ABC transporters (Fig. 5B).
  
  What was the estimated ω for H. aoede at these gene-families as compared to those distributions estimated for Eueides and Heliconius? If involved in the evolution of pollen-feeding, I think that we would expect ω to be more similar between H. aoede and Eueides than between H. aoede and Heliconius?
8. austinhpatton 06 Sep 2022
  
  in Arcadia Science - Private
  
  Curiously, a contraction in the Hemocyanin superfamily was only observed in H. aoede, the only Heliconius species not to feed on pollen in our data, marking hexamerins as a potential mechanistic link to the divergent strategies for nitrogen storage in pollen-feeding Heliconius (Fig. 5A).
  
  Interesting - this seems like a perfect complement for the analysis of Cocoonases later on! Would be very interesting to apply RELAX/aBSREL in the hypothesis-testing framework (i.e. foreground/background - discussed in later comment) to this gene family to test the hypothesis that this gene-family contraction in H. aoede is paired with an increased intensity of positive selection as compared to other Heliconius butterflies.
  
  What are the members of the Hemocyanin superfamily that contracted within H. aoede? Are they the same as (or do they include) the two Hemocyanins that were found to have previously undergone expansions?
9. austinhpatton 06 Sep 2022
  
  in Arcadia Science - Private
  
  optimizing parameters (Romain Derelle, personal communication) for a more reliable list of single-copy orthologous groups (scOGs).
  
  How might this approach to identify orthologs, which is tailored to the identification of single-copy orthologs, impact downstream inferences about patterns of gene-family evolution? Put another way, if the approach described here is more prone to recovering scOGs (or low-copy number OGs), then it seems likely that fewer multi-copy number (particularly large/high-copy number) OGs will be identified as a result. If so, the rate estimates of gene-family expansion/contraction using CAFE seem likely to be downwardly/upwardly biased respectively, as the majority of gene-families under consideration are particularly conserved with respect to copy-number.
  
  I think this possibility would be worth exploring/addressing. For instance, if the implementation of broccoli is relaxed so as to be less tailored to the recovery of scOGs, does the shape of the gene-family expansion/contraction distribution fundamentally change?
10. austinhpatton 06 Sep 2022
  
  in Arcadia Science - Private
  
  Ancestral state reconstruction of genome size was assessed using The maximum likelihood method implemented in the R package phytools (68).
  
  Under what model specifically was ancestral state reconstruction (ASR) conducted? The ML method in phytools can conduct ASR under three models (Brownian Motion, Ornstein–Uhlenbeck, and Early Burst). If you have not already done so, I might suggest fitting these alternative models of trait evolution prior to conducting ASR under the best-fit model.
  
  Given the taxonomic scale, it might also be worth considering including fitting some multi-regime BM/OU models as implemented in OUwie: https://doi.org/10.1111/j.1558-5646.2012.01619.x.
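
  For illustration, a sketch of the model-comparison step using AIC and Akaike weights. The log-likelihoods are made-up numbers standing in for fitted models, and the parameter counts (2 for BM, 3 for OU and EB) are the usual single-regime counts, assumed here rather than taken from the paper.

```python
from math import exp


def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_lik


def akaike_weights(aic_by_model):
    """Relative support for each model, normalized to sum to 1."""
    best = min(aic_by_model.values())
    rel = {name: exp(-0.5 * (value - best)) for name, value in aic_by_model.items()}
    total = sum(rel.values())
    return {name: r / total for name, r in rel.items()}


# Hypothetical (log-likelihood, number of parameters) for each fitted model.
fits = {"BM": (-10.0, 2), "OU": (-9.5, 3), "EB": (-9.8, 3)}

aic_by_model = {name: aic(ll, k) for name, (ll, k) in fits.items()}
weights = akaike_weights(aic_by_model)
best_model = min(aic_by_model, key=aic_by_model.get)
print(best_model, weights)
```

  With few species, the small-sample correction (AICc) would be the safer criterion.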

Annotators

austinhpatton

URL

biorxiv.org/content/10.1101/2022.08.12.503723v1
