Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.
Reply to the reviewers
Revision Plan
1. General Statements
We thank the reviewers for their positive and constructive assessment of the manuscript. We are encouraged that all three reviewers recognise the value of coelsch as an open-source framework for haplotyping and crossover detection from single-cell gamete sequencing data, and that they view the study as a useful contribution to the fields of recombination and genetic research. We are particularly grateful that Reviewer 1 described the manuscript as an "interesting and important study" and a "genuinely useful methodological framework that fills a real gap in the recombination biology toolkit", while Reviewer 2 highlighted its "strong innovation, complete technical pipeline, and significant biological implications" and considered it an "important technical breakthrough". We also appreciate Reviewer 3's assessment that the study provides "timely guidance for experimental design", that the results are "important for guiding plant single-cell research" in general, and that the work "has the potential to attract a broad readership".
In our view, the main contribution of the manuscript is the development of a platform-agnostic method for recovering haplotypes and crossover events from single-cell sequencing data. This addresses an important practical gap: single-cell gamete sequencing has strong potential for high-throughput haplotyping and recombination mapping, but its broader use requires tools that can accommodate the very different coverage structures produced by different sequencing modalities and platforms. coelsch was designed to meet this need.
The experimental datasets in the manuscript serve two purposes. First, they demonstrate that coelsch can be applied across multiple single-cell modalities and platforms, including scRNA, scATAC and scWGA sequencing from 10x Genomics, BD, and Takara platforms. Second, they illustrate the kinds of biological and practical questions that can be addressed with single-cell gamete sequencing, including crossover detection in meiotic mutants and large-scale analysis of natural variation in recombination.
While all reviewers strongly supported the publication of the work, they also raised important points about specific aspects, including technical variation and reproducibility, the rationale for using 10x scRNA to generate the diversity panel dataset, and the effects of coverage on crossover localisation, amongst others. We agree that addressing these points will make the manuscript clearer and more useful to readers. Our planned revisions therefore aim to strengthen the experimental and computational support for the framework, clarify the interpretation of the modality comparisons, and provide additional guidance for researchers who may wish to apply coelsch or related single-cell sequencing approaches in future studies.
2. Description of the planned revisions
2.1. Additional technical replicates and clearer treatment of batch/sample-handling effects
Reviewers 1, 2 and 3 all noted that the comparison of different platforms and modalities is based on limited replication, with different nuclei isolation and processing strategies used for different technologies. Reviewer 3 requested a fully controlled benchmark in which the same nuclei preparation is split across all tested platforms. We agree that this would be the ideal design for a dedicated head-to-head benchmarking study. However, the primary aim of the manuscript is to demonstrate the applicability of coelsch across different single-cell sequencing data types, rather than to provide a definitive benchmark of the intrinsic performance of each modality and platform.
In addition, a fully matched and replicated cross-platform experiment for all technologies is not feasible. Isolated nuclei deteriorate rapidly after preparation and must be processed promptly for single-cell library construction; this makes it impractical to distribute the same preparation across multiple time- and labour-intensive workflows. However, this design is feasible for 10x scRNA-seq and 10x scATAC-seq. To address this point directly, we will therefore generate two matched technical replicates each of 10x scRNA-seq and 10x scATAC-seq from nuclei isolated in the same sorting run.
We will also improve our library-level QC summary tables. We will report, where available, the number of nuclei used for loading, recovered barcodes, barcodes retained after QC, inferred high-quality nuclei and artefacts, informative fragments per nucleus, genomic bin coverage, and final nuclei used for crossover calling. This will make the effects of loading, capture efficiency, QC filtering, and modality-specific data loss more transparent.
In the revised text, we will distinguish more clearly between modality-specific effects and possible batch/sample-preparation effects. Where the current manuscript implies that differences are intrinsic properties of sequencing platforms, we will soften the interpretation unless supported by the new replicate data, reproducibility analyses, or well-supported properties that have been reported previously in literature.
2.2. Rationale for using 10x scRNA-seq in the natural variation panel
Reviewers 1 and 3 asked why the natural variation panel was analysed using 10x scRNA-seq, given that Takara scWGA produced higher per-cell crossover localisation accuracy in the modality comparison. We will revise the manuscript to explain this experimental decision more clearly.
The natural variation panel was designed as a high-throughput experiment requiring sufficient numbers of usable nuclei from many pooled F₁ hybrids. In our hands, 10x scRNA-seq has generally produced the largest number of usable nuclei barcodes and the lowest proportion of artefacts. This makes 10x scRNA-seq well suited to experiments where many nuclei are required per genotype. By contrast, applying Takara scWGA to a pooled panel of this scale would be expected to recover only tens of usable nuclei per F₁ hybrid, which would be insufficient for robust recombination-rate or landscape estimation.
We will add this explanation to the relevant Results section and clarify that the choice of 10x scRNA-seq reflects a trade-off between per-cell crossover resolution and the number of informative nuclei recovered per genotype. We will also add genotype-level summaries for the pooled natural variation experiment, including assigned nuclei per genotype and genotype-specific genomic coverage of informative fragments.
2.3. Reproducibility of recombination landscapes across replicates and modalities
Reviewer 1 requested recombination landscape plots for all tested modalities, and several comments raised the need to show within-modality reproducibility. We will add recombination landscape plots for wild-type Col-0 × Ler libraries across the tested modalities, including the newly generated replicate 10x scATAC and scRNA libraries.
We will assess reproducibility by comparing unsmoothed recombination-rate estimates in non-overlapping windows, both within and between modalities. These comparisons will be quantified using bootstrapped estimates of Spearman's rank correlation coefficient and visualised using scatterplots and/or recombination landscapes.
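As an illustration of the kind of comparison described above, the following is a minimal sketch of a bootstrapped Spearman correlation between two vectors of windowed recombination-rate estimates. It is not the coelsch implementation: the function names, the percentile interval, the number of bootstrap resamples, and the example rates are all assumptions made for illustration.

```python
import random


def average_ranks(values):
    """Return 1-based ranks, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either input has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def bootstrap_spearman(x, y, n_boot=1000, seed=0):
    """Resample windows with replacement.

    Returns (rho, lower, upper), where lower/upper bound an approximate
    95% percentile interval over the bootstrap resamples.
    """
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        rho = spearman([x[i] for i in idx], [y[i] for i in idx])
        if rho == rho:  # skip degenerate resamples that yield NaN
            stats.append(rho)
    stats.sort()
    lower = stats[int(0.025 * (len(stats) - 1))]
    upper = stats[int(0.975 * (len(stats) - 1))]
    return spearman(x, y), lower, upper


# Hypothetical per-window rates (e.g. cM/Mb) for two replicate libraries.
rep_a = [1.2, 3.4, 2.8, 0.5, 4.1, 3.9, 2.2, 0.9]
rep_b = [1.0, 3.0, 3.1, 0.7, 4.5, 3.5, 2.0, 1.1]
rho, lower, upper = bootstrap_spearman(rep_a, rep_b)
```

Resampling whole windows keeps the pairing between libraries intact. Because neighbouring windows are not independent, a block bootstrap could be a reasonable alternative.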
2.4. Sequencing depth, coverage, and crossover localisation resolution
Reviewers 1 and 3 requested clearer quantitative reporting of crossover resolution and a stronger analysis of depth effects. We will revise the manuscript to report practical crossover localisation resolution for each modality, including median and interquartile localisation error or interval size in genomic units.
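To make the planned summary concrete, the following is a small sketch of how median and interquartile interval sizes could be computed from a list of crossover intervals. The input format (start and end coordinates in base pairs bracketing each called crossover) and the function name are assumptions for illustration and do not reflect coelsch output.

```python
import statistics


def summarise_interval_sizes(intervals):
    """Summarise crossover interval sizes in base pairs.

    intervals: iterable of (start_bp, end_bp) pairs, each bracketing one
    called crossover (hypothetical input format). Needs at least two intervals.
    """
    sizes = [end - start for start, end in intervals]
    # Quartiles by linear interpolation, treating the data as the full population.
    q1, q2, q3 = statistics.quantiles(sizes, n=4, method="inclusive")
    return {
        "n": len(sizes),
        "median_bp": q2,
        "q1_bp": q1,
        "q3_bp": q3,
        "iqr_bp": q3 - q1,
    }


# Hypothetical crossover intervals on one chromosome.
example = [(1000, 1100), (5000, 5200), (9000, 9300), (12000, 12400), (20000, 20500)]
summary = summarise_interval_sizes(example)
```

Reporting the quartiles alongside the median shows the spread in localisation resolution, which a single mean can hide when a few crossovers fall in poorly covered regions.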
We will expand the simulation analyses to compare false-positive rates, false-negative rates, and localisation accuracy across modalities, including telomere-proximal error profiles for scWGA and scATAC as well as 10x scRNA data. We will perform downsampling analyses to assess how crossover detection accuracy changes as a function of informative-fragment depth. Where feasible, we will compare depth-matched subsets across modalities to distinguish effects of sequencing depth from modality-specific coverage structure.
These analyses will be used to clarify the extent to which each modality is suitable for different applications, such as broad landscape estimation, crossover counting, or fine localisation.
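The downsampling and depth-matching steps described in this section could be sketched as follows. This is an illustrative outline only: the fragment representation (tuples of chromosome, position, and haplotype), the function names, and the choice of independent random thinning are assumptions, not a description of how coelsch handles its inputs.

```python
import random


def downsample_fragments(fragments_by_barcode, fraction, seed=0):
    """Keep each informative fragment independently with probability `fraction`."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)
    return {
        barcode: [frag for frag in frags if rng.random() < fraction]
        for barcode, frags in fragments_by_barcode.items()
    }


def downsample_to_depth(fragments, target, seed=0):
    """Sample at most `target` fragments without replacement (depth matching)."""
    rng = random.Random(seed)
    k = min(target, len(fragments))
    return sorted(rng.sample(fragments, k))


# Hypothetical informative fragments: (chromosome, position, haplotype).
data = {
    "barcode_1": [("Chr1", 100, 0), ("Chr1", 900, 0), ("Chr1", 4000, 1), ("Chr2", 50, 1)],
    "barcode_2": [("Chr1", 300, 1), ("Chr3", 700, 0)],
}
half_depth = downsample_fragments(data, 0.5, seed=42)
matched = downsample_to_depth(data["barcode_1"], 2, seed=42)
```

Crossover calling would then be repeated on each downsampled set and compared with the calls from the full data or with simulated ground truth. Thinning by a fraction preserves relative depth differences between barcodes, whereas sampling to a fixed depth removes them, so the two variants answer slightly different questions.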
2.5. Artefact detection, high doublet rates, and representativeness after filtering
All three reviewers raised concerns about the high proportion of barcodes excluded by the filtering procedure, particularly in the Takara scWGA dataset. In hindsight, we believe part of this concern stems from the poor choice of terminology ("doublets") we used to describe these excluded barcodes.
While true doublets (i.e. two nuclei entering a single droplet or nanowell) are one likely source of such signals, the filtering procedure more broadly identifies artefactual barcodes that do not exhibit a clear single-gamete haplotype structure. These barcodes may arise from a variety of sources, including doublets, multiplets, high levels of ambient DNA or RNA, or empty droplets containing only ambient material. Although visual examination can be used to make predictions about the source of these artefacts, our detection method does not attempt to distinguish between them, and artefacts in different modalities may stem from different sources in varying proportions. We will therefore revise the terminology throughout the manuscript to clarify that these represent a broader class of low-confidence or noise barcodes, rather than confirmed doublets.
For the Takara scWGA data, we will revise the manuscript to discuss the discrepancy between the CellSelect well classifications (which use proprietary software to label doublets) and the final artefact predictions from coelsch. We can only speculate as to why CellSelect failed to detect many apparent doublet and multiplet artefacts in this experiment, but we agree with the reviewer that the most likely explanation is the small size of Arabidopsis pollen nuclei relative to the expectations of the imaging and classification procedure. To support this interpretation, we will add a supplementary analysis comparing the CellSelect images from individual nanowells with the final artefact predictions inferred from scWGA data. This will allow readers to see examples of wells classified as acceptable by CellSelect but subsequently inferred to contain artefacts based on their haplotype structure.
We will also add sensitivity analyses showing how key results change under different artefact-filtering thresholds. These analyses will include crossover count distributions, recombination landscape estimates, and modality-level comparisons. We will examine the extreme upper tail of crossover counts observed in 10x scATAC-seq and assess whether these barcodes are artefacts that have escaped detection.
Finally, we will assess whether retained singlets are representative of the input data with respect to informative-fragment counts, coverage, and inferred crossover patterns. This will address the concern that filtering could preferentially remove nuclei with particular recombination profiles.
2.6. Biases arising from pollen nuclear biology
Reviewer 2 raised an issue concerning the biases arising from the two different nuclei types present in mature trinuclear Arabidopsis pollen, and Reviewer 3 endorsed this point. While we do not agree with the reviewer that scRNA and scATAC cannot capture sperm nuclei due to their condensed nature (see Parker et al. 2025 PLoS Biology for evidence against this claim), it is true that technical variation in nuclei isolation and sorting may affect the relative representation of nuclei types, although this usually results in the underrepresentation of vegetative nuclei (Parker et al. 2025). We will add text addressing this point to the manuscript.
It is also true that differences in expressed genes between vegetative and sperm nuclei, which have very different transcriptomic profiles, will affect the distribution of informative reads for crossover analysis in scRNA data, and may therefore also affect the recovered recombination landscapes (even though the underlying landscapes are biologically identical). We will address this by adding recombination landscape plots and reproducibility scatterplots (as described in point 2.3) that compare sperm and vegetative nuclei from scRNA-seq.
2.7. Robustness of the pipeline and parameter choices
Reviewer 3 raised the concern that quantitative conclusions depend on a single pipeline with fixed parameter choices. We will address this by adding a parameter-sensitivity analysis for the main computational steps. Specifically, we will test the robustness of crossover calling on simulated data to changes in bin size and rHMM parameters, showing how these affect sensitivity to noise and agreement of predictions with ground truth data.
2.8. Natural variation analysis: genotype-specific coverage and terminal crossover enrichment
Reviewers 1, 2 and 3 raised concerns about whether natural variation in crossover rate and terminality could be influenced by genotype-specific coverage, marker density, pooling imbalance, or dropout. We will add a more detailed description of how pollen from different F₁ hybrids was pooled and how genotype assignment was performed. We will report genotype-level recovery statistics, including the six hybrids excluded from downstream analysis, and discuss how imbalances may arise, e.g. through biological variation in pollen count and fertility, biases in nuclei isolation or sequencing, and biases in genotyping and informative fragments.
Reviewer 1 specifically asked whether the lower terminal crossover index observed in Cvi-0 crosses compared with Col-0 crosses could reflect systematic differences in informative-fragment distributions rather than true biological differences in crossover localisation. We will address this by using the genotype-specific informative-fragment distributions observed in the diversity-panel scRNA-seq dataset to simulate crossover datasets with known ground truth. This will allow us to test whether variation in informative-fragment distribution, caused by differences in marker-variant or expressed-gene distributions, could systematically bias terminal crossover detection in Cvi-0 crosses relative to Col-0 crosses.
If feasible within the revision timeframe, we will also perform an orthogonal validation experiment for a selected comparison showing a clear difference in crossover terminality, such as Col-0 × Sah-0 versus Cvi-0 × Sah-0. This would use progeny sequencing of backcross populations to estimate recombination landscapes independently of scRNA-seq, providing a direct test of whether the inferred terminality difference is supported by conventional recombination mapping. If this experiment cannot be completed within the revision timeframe, we will clearly state this limitation and base the revised interpretation on the simulation analyses described above.
2.9. Broader applicability and practical guidance for users
Reviewer 1 requested more discussion of applicability beyond Arabidopsis and to outcrossing or polyploid species. We will expand the Discussion to address the requirements and limitations of applying coelsch in other systems.
2.10. Minor figure, reference, and presentation revisions
We will address the remaining minor comments, including adding missing axis labels and checking duplicated references.
3. Description of the revisions that have already been incorporated in the transferred manuscript
No revisions have yet been incorporated in the transferred manuscript.
4. Description of analyses that authors prefer not to carry out
4.1. Full new benchmark across all modalities from the same nuclei preparation
As acknowledged in section 2.1, we agree with Reviewer 3 that a fully controlled benchmark in which the same isolated nuclei preparation is split across all tested platforms would be the ideal experimental design for separating intrinsic modality- or platform-specific effects from sample-handling and batch effects. However, this is not feasible for all technologies within the scope of this revision, because isolated nuclei degrade quickly, the single-cell sequencing methods are time- and labour-intensive, and the relevant platforms are not all available to us in the same location.
We will therefore not perform a complete new cross-platform benchmark across all modalities. Instead, we will address this issue in the parts of the experiment where a matched design is feasible: we will generate two additional matched technical replicates each for 10x scRNA-seq and 10x scATAC-seq from nuclei isolated in the same sorting run. We will also revise the manuscript to more clearly acknowledge the limitations imposed by the lack of a fully matched cross-platform design and to ensure that our conclusions are interpreted in that context.
4.2. Profiling the natural variation panel with a second modality
Reviewer 1 suggested profiling at least a subset of the diversity panel with an additional single-cell modality. We agree that this would be useful, but we do not currently plan to generate a second-modality dataset for the natural variation panel. We would like to point out that this dataset introduces 34 genetic maps in a single sequencing experiment, which is not easily repeated.
The natural variation experiment was designed as a high-throughput survey across many F₁ hybrids, and repeating even a subset with scWGA or scATAC would require substantial additional sample preparation and sequencing. Instead, we will strengthen the justification for the use of 10x scRNA-seq by adding genotype-level coverage summaries and simulations to show which conclusions are well supported at the observed data density.
4.3. Orthogonal progeny sequencing from the exact same F₁ plants
Reviewer 3 suggested that progeny sequencing from the same F₁ plants used for single-cell assays would provide a direct ground truth. This experiment would require additional crosses, progeny generation, and matched single-cell and progeny sequencing, an effort that we do not consider justified by the insight it would deliver. While progeny sequencing can provide an independent validation dataset, we do not agree that it would constitute a substantially better ground truth than the simulations used here. Simulations provide a known ground truth for every individual barcode, whereas progeny sequencing cannot, because pollen grains are destroyed during single-cell sequencing and therefore cannot be used to generate offspring. In addition, progeny-derived recombination landscapes are not a perfect ground truth at the population level, since segregation distortion and post-meiotic selection can alter the observed distribution of recombination events relative to those present in the original pollen population.
4.4. Formal benchmarking of coelsch as a structural-variant detection method
Reviewer 2 asked whether large structural variants were identified in other accessions besides Zin-9, and what sensitivity and specificity can be expected from recombination coldspot-based structural-variant detection. We agree that this is an interesting question, given that the Zin-9 inversion was identified through its strong effect on recombination. However, we do not plan to develop or benchmark coelsch as a comprehensive structural-variant detection method as part of this revision.
The Zin-9 event was identified by visual inspection of the recombination maps, where it appeared as an unusually large and conspicuous recombination coldspot. We did not develop a systematic structural-variant calling procedure, as we do not view recombination suppression alone as a sufficiently specific signal for structural-variant detection. Coldspots can arise for many reasons, including centromere proximity or local recombination modifiers. Therefore, although large rearrangements such as inversions or translocations may sometimes be detectable through their effects on recombination, coelsch should not be considered as a general-purpose structural-variant caller.
In the revised manuscript, we will clarify this limitation and avoid implying that recombination coldspot analysis provides comprehensive structural-variant discovery. We will report that we did not observe genotype-specific coldspots of comparable scale to the Zin-9 event among the other analysed accessions, although smaller coldspots, such as one corresponding to the previously reported 2.2 Mb inversion on chromosome 1 of N13, were identifiable. We will not provide formal estimates of sensitivity and specificity for structural-variant detection, as this would require independent benchmark datasets or dedicated simulations that are beyond the scope of the present study.