  1. Jan 2022
    1. Now published in GigaScience doi: 10.1093/gigascience/giaa067

      Authors: Lisa K. Johnson (1, 2), Ruta Sahasrabudhe (3), Tony Gill (1), Jennifer Roach (1), Lutz Froenicke (3), C. Titus Brown (2), Andrew Whitehead (1)

      Affiliations: 1 Department of Environmental Toxicology, University of California, Davis; 2 Department of Population Health & Reproduction, School of Veterinary Medicine, University of California, Davis; 3 DNA Technologies Core, Genome Center, University of California, Davis

      For correspondence: awhitehead@ucdavis.edu

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa067 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102297 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102296 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.102298

    1. Now published in GigaScience doi: 10.1093/gigascience/giaa062

      Authors: Victor A. Padilha (1), Omer S. Alkhnbashi (2), Shiraz A. Shah (3), André C. P. L. F. de Carvalho (1), Rolf Backofen (2, 4)

      Affiliations: 1 Institute of Mathematics and Computer Sciences, University of São Paulo, Av. Trabalhador São Carlense 400, São Carlos, SP, 13564-002, Brazil; 2 Chair of Bioinformatics, University of Freiburg, Georges-Köhler-Allee 101, Freiburg, 79110, Germany; 3 Danish Archaea Centre, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, Copenhagen, DK-2200, Denmark; 4 Signalling Research Centres BIOSS and CIBSS, University of Freiburg, Georges-Köhler-Allee 101, Freiburg, 79110, Germany

      For correspondence: backofen@informatik.uni-freiburg.de

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa062 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102294 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102295

    1. Now published in GigaScience doi: 10.1093/gigascience/giaa051

      Authors: Amanda Warr (1), Nabeel Affara (2), Bronwen Aken (3), H. Beiki (4), Derek M. Bickhart (5), Konstantinos Billis (3), William Chow (6), Lel Eory (1), Heather A. Finlayson (1), Paul Flicek (3), Carlos G. Girón (3), Darren K. Griffin (7), Richard Hall (8), Greg Hannum (9), Thibaut Hourlier (3), Kerstin Howe (6), David A. Hume (1), Osagie Izuogu (3), Kristi Kim (8), Sergey Koren (10), Haibou Liu (4), Nancy Manchanda (11), Fergal J. Martin (3), Dan J. Nonneman (12), Rebecca E. O’Connor (7), Adam M. Phillippy (10), Gary A. Rohrer (12), Benjamin D. Rosen (13), Laurie A. Rund (14), Carole A. Sargent (2), Lawrence B. Schook (14), Steven G. Schroeder (13), Ariel S. Schwartz (9), Ben M. Skinner (2), Richard Talbot (15), Elizabeth Tseng (8), Christopher K. Tuggle (4, 11), Mick Watson (1), Timothy P. L. Smith (12), Alan L. Archibald (1)

      Affiliations: 1 The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh EH25 9RG, U.K.; 2 Department of Pathology, University of Cambridge, Cambridge CB2 1QP, U.K.; 3 European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, CB10 1SD, U.K.; 4 Department of Animal Science, Iowa State University, Ames, Iowa, U.S.A.; 5 Dairy Forage Research Center, USDA-ARS, Madison, Wisconsin, U.S.A.; 6 Wellcome Sanger Institute, Cambridge, CB10 1SA, U.K.; 7 School of Biosciences, University of Kent, Canterbury CT2 7AF, U.K.; 8 Pacific Biosciences, Menlo Park, California, U.S.A.; 9 Denovium Inc., San Diego, California, U.S.A.; 10 Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, Maryland, U.S.A.; 11 Bioinformatics and Computational Biology Program, Iowa State University, Ames, Iowa, U.S.A.; 12 USDA-ARS U.S. Meat Animal Research Center, Clay Center, Nebraska 68933, U.S.A.; 13 Animal Genomics and Improvement Laboratory, USDA-ARS, Beltsville, Maryland, U.S.A.; 14 Department of Animal Sciences, University of Illinois, Urbana, Illinois, U.S.A.; 15 Edinburgh Genomics, University of Edinburgh, Edinburgh EH9 3FL, U.K.

      For correspondence: alan.archibald@roslin.ed.ac.uk, tim.smith@ARS.USDA.GOV, mick.watson@roslin.ed.ac.uk

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa051 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102287 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102288

    1. Now published in GigaScience doi: 10.1093/gigascience/giaa061

      Authors: Saber Hafezqorani (1, 2), Chen Yang (1, 2), Ka Ming Nip (1, 2), René L Warren (1), Inanc Birol (1, 3)

      Affiliations: 1 Canada’s Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, BC, Canada; 2 Bioinformatics Graduate Program, University of British Columbia, Vancouver, BC, Canada; 3 Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada

      For correspondence: ibirol@bcgsc.ca

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa061 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102272 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102273

    1. Now published in GigaScience doi: 10.1093/gigascience/giaa060

      Authors: Wenxi Wang (1), Zihao Wang (1), Xintong Li (1), Zhongfu Ni (1), Zhaorong Hu (1), Mingming Xin (1), Huiru Peng (1), Yingyin Yao (1), Qixin Sun (1), Weilong Guo (1)

      Affiliations: 1 Key Laboratory of Crop Heterosis and Utilization, State Key Laboratory for Agrobiotechnology, Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China

      For correspondence: guoweilong@cau.edu.cn

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa060 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102267 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102268

    1. Now published in GigaScience doi: 10.1093/gigascience/giaa044

      Authors: Benjamin B. Chu (1), Kevin L. Keys (2), Christopher A. German (3), Hua Zhou (3), Jin J. Zhou (4), Eric Sobel (1, 5), Janet S. Sinsheimer (1, 5), Kenneth Lange (1, 5)

      Affiliations: 1 Department of Computational Medicine, UCLA, Los Angeles, USA; 2 Department of Medicine, University of California, San Francisco, USA; 3 Department of Biostatistics, Fielding School of Public Health at UCLA, USA; 4 Division of Epidemiology and Biostatistics, University of Arizona, Tucson, AZ 85724, USA; 5 Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, USA

      For correspondence: jsinshei@ucla.edu (Janet S. Sinsheimer), klange@ucla.edu (Kenneth Lange)

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa044 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102262 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102263

    1. Now published in GigaScience doi: 10.1093/gigascience/giaa057

      Authors: Junhua Li (1, 2, 3), Huanzi Zhong (1, 2), Yuliaxis Ramayo-Caldas (4, 5), Nicolas Terrapon (6, 7), Vincent Lombard (6, 7), Gabrielle Potocki-Veronese (8), Jordi Estellé (4), Milka Popova (9), Ziyi Yang (1, 2), Hui Zhang (1, 2), Fang Li (1, 2), Shanmei Tang (1, 2), Weineng Chen (1), Bing Chen (1, 2), Jiyang Li (1), Jing Guo (1, 2), Cécile Martin (9), Emmanuelle Maguin (10), Xun Xu (1, 2), Huanming Yang (1, 11), Jian Wang (1, 11), Lise Madsen (1, 12, 13), Karsten Kristiansen (1, 13), Bernard Henrissat (6, 7, 14), Stanislav D. Ehrlich (15, 16), Diego P. Morgavi (9)

      Affiliations: 1 BGI-Shenzhen, Shenzhen 518083, China; 2 China National GeneBank, BGI-Shenzhen, Shenzhen 518120, China; 3 School of Bioscience & Bioengineering, South China University of Technology, Guangzhou, China; 4 INRA, Institut National de la Recherche Agronomique, Génétique Animale et Biologie Intégrative, AgroParisTech, Université Paris-Saclay, 78350, Jouy-en-Josas, France; 5 Animal Breeding and Genetics Program, Institute for Research and Technology in Food and Agriculture (IRTA), Torre Marimon, Caldes de Montbui, 08140, Spain; 6 CNRS UMR 7257, Aix-Marseille University, 13288 Marseille, France; 7 INRA, USC 1408 AFMB, 13288 Marseille, France; 8 LISBP, Universite de Toulouse, CNRS, INRA, INSA, 31077, Toulouse, France; 9 INRA, UMR Herbivores, Université Clermont Auvergne, VetAgro Sup, F-63122 Saint-Genès Champanelle, France; 10 INRA, Micalis Institute, AgroParisTech, Université Paris-Saclay, 78350, Jouy-en-Josas, France; 11 James D. Watson Institute of Genome Sciences, Hangzhou 310058, China; 12 Institute of Marine Research (IMR), Postboks 1870 Nordnes, 5817 Bergen, Norway; 13 Laboratory of Genomics and Molecular Biomedicine, Department of Biology, University of Copenhagen, 2100 Copenhagen Ø, Denmark; 14 Department of Biological Sciences, King Abdulaziz University, Jeddah, Saudi Arabia; 15 MGP MetaGenoPolis, INRA, Université Paris-Saclay, 78350 Jouy en Josas, France; 16 Centre for Host Microbiome Interactions, Dental Institute, King’s College London, UK

      For correspondence: diego.morgavi@inra.fr, stanislav.ehrlich@inra.fr, kk@bio.ku.dk

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa057 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102258 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102259

    1. Now published in GigaScience doi: 10.1093/gigascience/giaa052

      Authors: Alessandro Petrini (1), Marco Mesiti (1), Max Schubach (2, 3), Marco Frasca (1), Daniel Danis (4), Matteo Re (1), Giuliano Grossi (1), Luca Cappelletti (1), Tiziana Castrignanò (5), Peter N. Robinson (4), Giorgio Valentini (1)

      Affiliations: 1 AnacletoLab - Dipartimento di Informatica, Università degli Studi di Milano, Italy; 2 Berlin Institute of Health (BIH), Berlin, Germany; 3 Charité – Universitätsmedizin Berlin, Berlin, Germany; 4 The Jackson Laboratory for Genomic Medicine, Farmington CT, USA; 5 CINECA, SCAI SuperComputing Applications and Innovation Department, Roma, Italy

      For correspondence: alessandro.petrini@unimi.it

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa052 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102239 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102240

    1. Now published in GigaScience doi: 10.1093/gigascience/giaa050

      Authors: Annarita Marrano (1), Monica Britton (2), Paulo A. Zaini (1), Aleksey V. Zimin (3), Rachael E. Workman (3), Daniela Puiu (4), Luca Bianco (5), Erica Adele Di Pierro (5), Brian J. Allen (1), Sandeep Chakraborty (1), Michela Troggio (5), Charles A. Leslie (1), Winston Timp (3), Abhaya Dandekar (1), Steven L. Salzberg (3, 4, 6), David B. Neale (1)

      Affiliations: 1 Department of Plant Sciences, University of California, Davis, CA 95616, USA; 2 Bioinformatics Core Facility, Genome Center, University of California Davis, CA 95616, USA; 3 Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21205, USA; 4 Center for Computational Biology, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD 21205, USA; 5 Research and Innovation Center, Department of Genomics and Biology of Fruit Crops, Fondazione E Mach, San Michele all’ Adige (TN) 38010, Italy; 6 Departments of Computer Science and Biostatistics, Johns Hopkins University, Baltimore, MD 21218

      For correspondence: amarrano@ucdavis.edu

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa050 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102235 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102236

    1. Now published in GigaScience doi: 10.1093/gigascience/giaa048. Authors: Morteza Hosseini (1), Diogo Pratas (1, 2), Burkhard Morgenstern (3, 4), Armando J. Pinho (1). Affiliations: (1) IEETA/DETI, University of Aveiro, 3810-193 Aveiro, Portugal; (2) Department of Virology, University of Helsinki, 00100 Helsinki, Finland; (3) Department of Bioinformatics, University of Göttingen, Goldschmidtstr. 1, 37077 Göttingen, Germany; (4) Göttingen Center of Molecular Biosciences (GZMB), Justus-von-Liebig-Weg 11, 37077 Göttingen, Germany. For correspondence: seyedmorteza@ua.pt (Morteza Hosseini)

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa048 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102233 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102234

    1. Abstract

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa049 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102230 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102231 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.102232

    1. Now published in GigaScience doi: 10.1093/gigascience/giaa045. Authors: Graham J Etherington, Darren Heavens, David Baker, Ashleigh Lister, Rose McNelly, Gonzalo Garcia, Bernardo Clavijo, Iain Macaulay, Wilfried Haerty, Federica Di Palma. Affiliation (all authors): The Earlham Institute, Norwich Research Park, Norwich, NR4 7UZ, United Kingdom. For correspondence: graham.etherington@earlham.ac.uk, Federica.dipalma@earlham.ac.uk (Graham J Etherington and Federica Di Palma)

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa045 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102226 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102227

    1. Now published in GigaScience doi: 10.1093/gigascience/giaa041. Authors: Alejandra N. Gonzalez-Beltran (1, 2), Paola Masuzzo (3, 4, 5), Christophe Ampe (4), Gert-Jan Bakker (6), Sébastien Besson (7), Robert H. Eibl (8), Peter Friedl (6, 9, 10), Matthias Gunzer (11), Mark Kittisopikul (12, 13), Sylvia E. Le Dévédec (14), Simone Leo (7, 15), Josh Moore (7), Yael Paran (16), Jaime Prilusky (17), Philippe Rocca-Serra (1), Philippe Roudot (18), Marc Schuster (11), Gwendolien Sergeant (4), Staffan Strömblad (19), Jason R. Swedlow (7), Merijn van Erp (6), Marleen Van Troys (4), Assaf Zaritsky (20), Susanna-Assunta Sansone (1), Lennart Martens (3, 4). Affiliations: (1) Oxford e-Research Centre, Department of Engineering Science, University of Oxford, Oxford, UK; (2) Scientific Computing Department, Rutherford Appleton Laboratory, Science and Technology Facilities Council, Didcot, UK; (3) VIB-UGent Center for Medical Biotechnology, Ghent, Belgium; (4) Department of Biomolecular Medicine, Ghent University, Ghent, Belgium; (5) Institute for Globally Distributed Open Research and Education (IGDORE), Ghent, Belgium; (6) Department of Cell Biology, Radboud Institute for Molecular Life Sciences, Nijmegen, The Netherlands; (7) Centre for Gene Regulation & Expression & Division of Computational Biology, University of Dundee, Dundee, Scotland, UK; (8) German Cancer Research Center, DKFZ Alumni Association, Heidelberg, Germany; (9) David H. Koch Center for Applied Genitourinary Medicine, UT MD Anderson Cancer Center, Houston, TX, USA; (10) Cancer Genomics Center, Utrecht, The Netherlands; (11) Institute for Experimental Immunology and Imaging, University Hospital, University Duisburg-Essen, Essen, Germany; (12) Department of Biophysics, UT Southwestern Medical Center, Dallas, TX, USA; (13) Department of Cell and Developmental Biology, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA; (14) Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, the Netherlands; (15) Center for Advanced Studies, Research, and Development in Sardinia (CRS4), Pula (CA), Italy; (16) IDEA Bio-Medical Ltd, Rehovot, Israel; (17) Life Science Core Facilities, Weizmann Institute of Science, Rehovot, Israel; (18) Lyda Hill Department of Bioinformatics, UT Southwestern Medical Center, Dallas, TX, USA; (19) Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden; (20) Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel. For correspondence: susanna-assunta.sansone@oerc.ox.ac.uk, lennart.martens@vib-ugent.be (Susanna-Assunta Sansone and Lennart Martens)

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa041 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102224 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102225

    1. Now published in GigaScience doi: 10.1093/gigascience/giaa046. Authors: Rafael R. C. Cuadrat (1), Maria Sorokina (2), Bruno G. Andrade (3), Tobias Goris (4), Alberto M. R. Dávila (5). Affiliations: (1) Molecular Epidemiology Department, German Institute of Human Nutrition Potsdam-Rehbruecke - DIfE; (2) Friedrich-Schiller University, Lessingstrasse 8, 07743 Jena, Germany; (3) Embrapa Southeast Livestock - EMBRAPA; (4) Department of Molecular Toxicology, Research Group Intestinal Microbiology, German Institute of Human Nutrition Potsdam-Rehbruecke - DIfE; (5) Computational and Systems Biology Laboratory, Oswaldo Cruz Institute, FIOCRUZ. Av Brasil 4365, Rio de Janeiro, RJ, Brasil. 21040-900. For correspondence: alberto.davila@fiocruz.br (Alberto M. R. Dávila)

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa046 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102215 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102216 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.102217

    1. Now published in GigaScience doi: 10.1093/gigascience/giaa037. Authors: Nathan J Kenny (1, 2), Shane A McCarthy (3), Olga Dudchenko (4, 5), Katherine James (1, 6), Emma Betteridge (7), Craig Corton (7), Jale Dolucan (7, 8), Dan Mead (7), Karen Oliver (7), Arina D Omer (4), Sarah Pelan (7), Yan Ryan (9, 10), Ying Sims (7), Jason Skelton (7), Michelle Smith (7), James Torrance (7), David Weisz (4), Anil Wipat (9), Erez L Aiden (4, 5, 11, 12), Kerstin Howe (7), Suzanne T Williams (1). Affiliations: (1) Natural History Museum, Department of Life Sciences, Cromwell Road, London SW7 5BD, UK; (2) Oxford Brookes University, Headington Rd, Oxford OX3 0BP, UK; (3) Department of Genetics, University of Cambridge, Cambridge, CB2 3EH, UK; (4) The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; (5) The Center for Theoretical Biological Physics, Rice University, Houston, TX, USA; (6) Department of Applied Sciences, Faculty of Health and Life Sciences, Northumbria University, Newcastle upon Tyne NE1 8ST UK; (7) Wellcome Sanger Institute, Cambridge CB10 1SA, UK; (8) Freeline Therapeutics Limited, Stevenage Bioscience Catalyst, Gunnels Wood Road, Stevenage, Hertfordshire, SG1 2FX, UK; (9) School of Computing, Newcastle University, Newcastle upon Tyne NE1 7RU, UK; (10) Institute of Infection and Global Health, Liverpool University, iC2, 146 Brownlow Hill, L3 5RF; (11) Shanghai Institute for Advanced Immunochemical Studies, ShanghaiTech University, Shanghai, China; (12) School of Agriculture and Environment, University of Western Australia, Perth, Australia. For correspondence: s.williams@nhm.ac.uk (Suzanne T Williams)

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa037 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102199 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102200 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.102201

    1. Now published in GigaScience doi: 10.1093/gigascience/giaa030. Authors: Mitchell J. Feldmann (1), Michael A. Hardigan (1), Randi A. Famula (1), Cindy M. López (1), Amy Tabb (2), Glenn S. Cole (1), Steven J. Knapp (1). Affiliations: (1) Department of Plant Sciences, University of California, Davis. One Shields Ave, Davis, CA 95616, USA; (2) USDA-ARS-AFRS, 2217 Wiltshire Rd, Kearneysville, WV 25430, USA. For correspondence: mjfeldmann@ucdavis.edu (Mitchell J. Feldmann)

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa030 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102204 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102205 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.102208

    1. Now published in GigaScience doi: 10.1093/gigascience/giaa035. Authors: Karl Johnson (1), Guy M. Hagen (1). Affiliation: (1) UCCS center for the Biofrontiers Institute, University of Colorado at Colorado Springs, 1420 Austin Bluffs Parkway, Colorado Springs, Colorado, 80918, USA. For correspondence: ghagen@uccs.edu (Guy M. Hagen)

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa035 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102185 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102186 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.102187

    1. Now published in GigaScience doi: 10.1093/gigascience/giaa033. Authors: Marco Antonio Tangaro (1), Giacinto Donvito (2), Marica Antonacci (2), Matteo Chiara (3), Pietro Mandreoli (3), Graziano Pesole (1, 4), Federico Zambelli (1, 3). Affiliations: (1) Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR), Via Amendola 165/A, 70126 Bari, Italy; (2) National Institute for Nuclear Physics (INFN), Section of Bari, Via Orabona 4, 70126 Bari, Italy; (3) Department of Biosciences, University of Milan, via Celoria 26, 20133 Milano, Italy; (4) Department of Biosciences, Biotechnologies and Biopharmaceutics, University of Bari, Via Orabona 4, 70126 Bari, Italy.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa033 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102183 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102184

    1. A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa026 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102179 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102180 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.102182

    1. Now published in GigaScience doi: 10.1093/gigascience/giaa028. Authors: Olexiy Kyrgyzov (1), Vincent Prost (1, 2), Stéphane Gazut (2), Bruno Farcy (3), Thomas Brüls (1). Affiliations: (1) CEA Genoscope, Evry, France; (2) CEA LIST, Gif-sur-Yvette, France; (3) Bull Technologies, Les Clayes-sous-Bois, France. For correspondence: bruls@genoscope.cns.fr

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa028 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102154 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102155

    1. Now published in GigaScience doi: 10.1093/gigascience/giaa025. Authors: Thomas McGowan (1), James E. Johnson (1), Praveen Kumar (2, 3), Ray Sajulga (2), Subina Mehta (2), Pratik D. Jagtap (2), Timothy J. Griffin (2). Affiliations: (1) Minnesota Supercomputing Institute, University of Minnesota, 599 Walter Library, 117 Pleasant Street SE, Minneapolis, MN 55455; (2) Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, 6-155 Jackson Hall, 321 Church St SE, Minneapolis, MN, 55455; (3) Bioinformatics and Computational Biology program, University of Minnesota-Rochester, 111 South Broadway, Suite 300, Rochester, MN 55904. For correspondence: tgriffin@umn.edu

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa025 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102152 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102153

    1. Now published in GigaScience doi: 10.1093/gigascience/giaa010. Authors: Edi Prifti (1, 2), Yann Chevaleyre (3), Blaise Hanczar (4), Eugeni Belda (1), Antoine Danchin (5), Karine Clément (6, 7), Jean-Daniel Zucker (1, 2). Affiliations: (1) Institute of Cardiometabolism and Nutrition, Integromics, ICAN, Paris, France; (2) Sorbonne University, IRD, UMMISCO, UMI 209, Paris, France; (3) Paris-Dauphine University, PSL Research University, CNRS, UMR 7243, LAMSADE, France; (4) IBISC, University Paris-Saclay, University Evry, Evry, France; (5) Institute of Cardiometabolism and Nutrition, ICAN, Paris, France; (6) Sorbonne University, INSERM, Nutriomics team, Paris, France; (7) Assistance Publique-Hôpitaux de Paris, Nutrition department, CRNH Ile de France, Paris, France.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa010 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102131 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102132

    1. Now published in GigaScience doi: 10.1093/gigascience/giaa005. Authors: Ekaterina Noskova (1, 2), Vladimir Ulyantsev (1), Klaus-Peter Koepfli (3, 5), Stephen J. O’Brien (3, 4), Pavel Dobrynin (3, 5). Affiliations: (1) ITMO University, St. Petersburg, Russia; (2) JetBrains Research, St. Petersburg, Russia; (3) Theodosius Dobzhansky Center for Genome Bioinformatics, Saint Petersburg State University, St. Petersburg, Russia; (4) Oceanographic Center, Nova Southeastern University Ft Lauderdale, Ft Lauderdale, USA; (5) National Zoological Park, Smithsonian Conservation Biology Institute, Washington DC, USA. For correspondence: ekaterina.e.noskova@gmail.com

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa005 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102116 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102117

    1. Now published in GigaScience doi: 10.1093/gigascience/giaa004. Authors: Zheng Li, Michael S Barker. Affiliation (both authors): Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa004 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102092 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102093

    1. Now published in GigaScience doi: 10.1093/gigascience/giaa003. Authors: Jerven Bolleman (1), Eduoard de Castro (1), Delphine Baratin (1), Sebastien Gehant (1), Beatrice A. Cuche (1), Andrea H. Auchincloss (1), Elisabeth Coudert (1), Chantal Hulo (1), Patrick Masson (1), Ivo Pedruzzi (1), Catherine Rivoire (1), Ioannis Xenarios (1, 2), Nicole Redaschi (1), Alan Bridge (1). Affiliations: (1) Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, CMU, 1 rue Michel-Servet, CH-1211 Geneva 4, Switzerland; (2) CHUV/LICR, Agora Centre, CH-1005 Lausanne, Switzerland. For correspondence: jerven.bolleman@sib.swiss

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa003 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102090 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102091

    1. Now published in GigaScience doi: 10.1093/gigascience/giz165. Authors: George Alter (1), Alejandra Gonzalez-Beltran (2), Lucila Ohno-Machado (3), Philippe Rocca-Serra (4). Affiliations: (1) University of Michigan; (2) University of Oxford; (3) University of California at San Diego; (4) University of Oxford.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz165 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102087 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102088 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.102089

    1. Now published in GigaScience doi: 10.1093/gigascience/giaa007. Authors: Stephen J. Bush (1, 2), Dona Foster (1, 3), David W. Eyre (1), Emily L. Clark (4), Nicola De Maio (1), Liam P. Shaw (1), Nicole Stoesser (1), Tim E. A. Peto (1, 2, 3), Derrick W. Crook (1, 2, 3), A. Sarah Walker (1, 2, 3). Affiliations: (1) Nuffield Department of Medicine, University of Oxford, Oxford, UK; (2) National Institute for Health Research Health Research Protection Unit in Healthcare Associated Infections and Antimicrobial Resistance at University of Oxford in partnership with Public Health England, Oxford, UK; (3) National Institute for Health Research Oxford Biomedical Research Centre, Oxford, UK; (4) The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, UK. For correspondence: stephen.bush@roslin.ed.ac.uk

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa007 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102084 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102085

    1. Now published in GigaScience doi: 10.1093/gigascience/giaa002. Authors: Miranda E. Pitt (1), Son H. Nguyen (1), Tânia P.S. Duarte (1), Mark A.T. Blaskovich (1), Lachlan J.M. Coin (1). Affiliation: (1) Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, 4072, Australia. For correspondence: miranda.pitt@imb.uq.edu.au, l.coin@imb.uq.edu.au

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa002 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102080 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102081 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.102082

    1. Now published in GigaScience doi: 10.1093/gigascience/giz158. Authors: Mitsutaka Kadota (1), Hisashi Miura (2), Kaori Tanaka (1, 3), Ichiro Hiratani (2). Affiliations: (1) Laboratory for Phyloinformatics, RIKEN Center for Biosystems Dynamics Research (BDR), Kobe, 650-0047, Japan; (2) Laboratory for Developmental Epigenetics, RIKEN BDR, Kobe, 650-0047, Japan; (3) Division of Transcriptomics, Medical Institute of Bioregulation, Kyushu University, Fukuoka, 812-0054, Japan.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz158 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102051 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102052 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.102053

    1. Now published in GigaScience doi: 10.1093/gigascience/giz160. Authors: Weiwen Wang (1), Ashutosh Das (1, 2), David Kainer (1), Miriam Schalamun (1, 3), Alejandro Morales-Suarez (4), Benjamin Schwessinger (1). Affiliations: (1) Research School of Biology, the Australian National University, Canberra, Australia; (2) Department of Genetics and Animal Breeding, Faculty of Veterinary Medicine, Chittagong Veterinary and Animal Sciences University, Chittagong, Bangladesh; (3) Institute of Applied Genetics and Cell Biology, University of Natural Resources and Life Sciences, Vienna, Austria; (4) Department of Biological Sciences, Macquarie University, Sydney, Australia. For correspondence: wei.wang@anu.edu.au, rob.lanfear@anu.edu.au

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz160 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102038 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102039

    1. Now published in GigaScience doi: 10.1093/gigascience/giz141. Authors: Lu Zhang (1, 2, 3), Xin Zhou (3), Ziming Weng (2), Arend Sidow (2, 4). Affiliations: (1) Department of Computer Science, Hong Kong Baptist University; (2) Department of Pathology, Stanford University; (3) Department of Computer Science, Stanford University; (4) Department of Genetics, Stanford University. For correspondence: arend@stanford.edu

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz141 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101987 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101988

    1. ABSTRACT

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz120 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101977 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101978

    1. Now published in GigaScience doi: 10.1093/gigascience/giz132. Authors: Ekaterina Osipova (1, 2, 3), Nikolai Hecker (1, 2, 3), Michael Hiller (1, 2, 3). Affiliations: (1) Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany; (2) Max Planck Institute for the Physics of Complex Systems, Dresden, Germany; (3) Center for Systems Biology Dresden, Germany. For correspondence: hiller@mpi-cbg.de

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz132 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101980 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101981

    1. Now published in GigaScience doi: 10.1093/gigascience/giz136. Authors: Joeri S. Strijk (1, 2, 3), Damien D. Hinsinger (2, 3), Feng-Ping Zhang (4), KunFang Cao (1). Affiliations: (1) State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, College of Forestry, Guangxi University, Nanning, Guangxi 530005, China; (2) Biodiversity Genomics Team, Plant Ecophysiology & Evolution Group, Guangxi Key Laboratory of Forest Ecology and Conservation, College of Forestry, Daxuedonglu 100, Nanning, Guangxi, 530005, China; (3) Alliance for Conservation Tree Genomics, Pha Tad Ke Botanical Garden, PO Box 959, 06000 Luang Prabang, Laos; (4) Evolutionary Ecology of Plant Reproductive Systems Group, Kunming Institute of Botany, Kunming, China. For correspondence: jsstrijk@hotmail.com

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz136 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101974 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101975

    1. Now published in GigaScience doi: 10.1093/gigascience/giz123. Authors: Robail Yasrab (1), Jonathan A Atkinson (2), Darren M Wells (2), Andrew P French (1, 2), Tony P Pridmore (1), Michael P Pound (1). Affiliations: (1) School of Computer Science, University of Nottingham, NG8 1BB, UK; (2) School of Biosciences, University of Nottingham, LE12 5RD, UK. For correspondence: michael.pound@nottingham.ac.uk

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz123 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101967 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101968 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101969

    1. Now published in GigaScience doi: 10.1093/gigascience/giz110. Authors: Varuna Chander (1), Richard A. Gibbs (1), Fritz J. Sedlazeck (1). Affiliation: (1) Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston TX 77030. For correspondence: fritz.sedlazeck@bcm.edu

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz110 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101890 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101891

    1. Now published in GigaScience doi: 10.1093/gigascience/giz087. Authors: Anne Senabouth (1), Samuel W Lukowski (1), Jose Alquicira Hernandez (1), Stacey Andersen (1), Xin Mei (2), Quan H Nguyen (1), Joseph E Powell (1, 3). Affiliations: (1) Institute for Molecular Bioscience, University of Queensland, Brisbane, Australia; (2) South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China; (3) Queensland Brain Institute, University of Queensland, Brisbane, Australia.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz087 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101872

    1. Now published in GigaScience doi: 10.1093/gigascience/giz074
      Authors: Timothy H. Webster (1), Madeline Couse (2), Bruno M. Grande (3), Eric Karlins (4), Tanya N. Phung (5), Phillip A. Richmond (6, 7), Whitney Whitford (8), Melissa A. Wilson Sayres (1, 9)
      Affiliations: (1) School of Life Sciences, Arizona State University; (2) Child and Family Research Institute, University of British Columbia; (3) Department of Molecular Biology and Biochemistry, Simon Fraser University; (4) Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health; (5) Interdepartmental Program in Bioinformatics, UCLA; (6) Centre for Molecular Medicine and Therapeutics, University of British Columbia; (7) BC Children’s Hospital; (8) School of Biological Sciences, The University of Auckland; (9) Center for Evolution and Medicine, Arizona State University

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz074 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101812 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101813

    1. Now published in GigaScience doi: 10.1093/gigascience/giz084
      Authors: Michael Kotliar (1), Andrey V. Kartashov (1), Artem Barski (1, 2)
      Affiliations: (1) Division of Allergy and Immunology, Cincinnati Children’s Hospital Medical Center and Department of Pediatrics, College of Medicine, University of Cincinnati, Cincinnati, OH; (2) Division of Human Genetics, Cincinnati Children’s Hospital Medical Center and Department of Pediatrics, College of Medicine, University of Cincinnati, Cincinnati, OH
      For correspondence: Artem.Barski@cchmc.org

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz084 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101838 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101839

    1. Now published in GigaScience doi: 10.1093/gigascience/giz063
      Authors: Jay Ghurye (1, 2), Sergey Koren (2), Scott T. Small (3), Seth Redmond (4, 5), Paul Howell (6), Adam M. Phillippy (2), Nora J. Besansky (3)
      Affiliations: (1) Department of Computer Science, University of Maryland, College Park, MD; (2) Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institute of Health, Bethesda, MD; (3) Department of Biological Sciences, University of Notre Dame, South Bend, IN; (4) Infectious Disease and Microbiome Program, Broad Institute, Cambridge, MA; (5) Department of Immunology and Infectious Disease, Harvard TH Chan School of Public Health, Boston, MA; (6) Centers for Disease Control, Atlanta, GA

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz063 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101744 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101745

    1. Now published in GigaScience doi: 10.1093/gigascience/giz037
      Authors: Mathieu Bourgey (1, 2), Rola Dali (1, 2), Robert Eveleigh (1, 2), Kuang Chung Chen (3, 4), Louis Letourneau (1, 2), Joel Fillon (5), Marc Michaud (2), Maxime Caron (1, 2, 5), Johanna Sandoval (6), Francois Lefebvre (1, 2), Gary Leveque (1, 2), Eloi Mercier (1, 2), David Bujold (1, 2), Pascale Marquis (1, 2), Patrick Tran Van (7), David Morais (8), Julien Tremblay (9), Xiaojian Shao (1, 2), Edouard Henrion (1, 2), Emmanuel Gonzalez (1, 2), Pierre-Olivier Quirion (1, 2), Bryan Caron (3, 4), Guillaume Bourque (1, 2, 5)
      Affiliations: (1) Canadian Centre for Computational Genomics, Montréal, QC, Canada; (2) McGill University and Genome Québec Innovation Center, Montréal, QC, Canada; (3) McGill HPC Centre, McGill University, Montréal, QC, Canada; (4) Calcul Québec, QC, Canada; (5) Department of Human Genetics, McGill University, Montréal, QC, Canada; (6) Beaulieu-Saucier Université de Montréal Pharmacogenomics Centre, Montréal, QC, Canada; (7) Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland; (8) Centre de calcul scientifique (ccs) - Université de Sherbrooke, Sherbrooke, QC, Canada; (9) Energy, Mining and Environment, National Research Council Canada, Montréal, QC, Canada
      For correspondence (Mathieu Bourgey, Guillaume Bourque): guil.bourque@mcgill.ca mathieu.bourgey@mcgill.ca

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz037 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101755 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101756 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101759

    1. Now published in GigaScience doi: 10.1093/gigascience/giz064
      Authors: Amartya Singh (1, 2), Gyan Bhanot (1, 2, 3), Hossein Khiabanian (1, 2, 3, 4)
      Affiliations: (1) Department of Physics and Astronomy, Rutgers University, Piscataway, NJ, USA; (2) Center for Systems and Computational Biology, Rutgers Cancer Institute, Rutgers University, New Brunswick, NJ, USA; (3) Department of Molecular Biology and Biochemistry, Rutgers University, Piscataway, NJ, USA; (4) Department of Pathology and Laboratory Medicine, Rutgers Robert Wood Johnson Medical School, Rutgers University, New Brunswick, NJ, USA

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz064 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101761 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101762 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101763

    1. Now published in GigaScience doi: 10.1093/gigascience/giz077
      Authors: Lu Gan, Cai Tong Ng, Chen Chen, Shujun Cai
      Affiliation (all authors): Department of Biological Sciences and Centre for BioImaging Sciences, National University of Singapore, Singapore 117543
      For correspondence (Lu Gan): lu@anaphase.org

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz077 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101801 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101802

    1. Now published in GigaScience doi: 10.1093/gigascience/giz047
      Authors: Yi Zhao (1, 3), Xiao Li (1), Jingwan Wang (1), Ziyun Wan (1), Kai Gao (1), Gang Yi (4), Xie Wang (1), Jinghua Wu (1), Bingbing Fan (2), Wei Zhang (1), Fang Chen (1), Huanming Yang (1, 5), Jian Wang (1, 5), Xun Xu (1), Bin Li (1, 4), Shiping Liu (1), Weihua Zhao (2), Yong Flou (1), Xiao Liu (1)
      Affiliations: (1) BGI-Shenzhen, Shenzhen 518083, China; (2) Shenzhen Second People's Hospital, First Affiliated Hospital of Shenzhen University, Shenzhen 518035, Guangdong Province, China; (3) School of Bioscience and Bioengineering, South China University of Technology, Guangzhou 510006, China; (4) Shanghai Institute of Immunology, Shanghai JiaoTong University School of Medicine, Shanghai 200025, China, and Department of Immunology and Microbiology, Shanghai JiaoTong University School of Medicine, Shanghai 200025, China; (5) James D. Watson Institute of Genome Sciences, Hangzhou 310058, China

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz047 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101675 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101676 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101677

    1. Now published in GigaScience doi: 10.1093/gigascience/giz055
      Authors: Angela Tam (1, 2, 3), Christian Dansereau (1, 4), Yasser Itturia-Medina (3), Sebastian Urchs (3), Pierre Orban (1, 5, 6), Hanad Sharmarke (1), John Breitner (2, 3), Pierre Bellec (1, 4)
      Affiliations: (1) Centre de Recherche de l’Institut Universitaire de Gériatrie de Montréal, Montréal, CA; (2) Douglas Hospital Research Centre, McGill University, Montréal, CA; (3) McGill University, Montréal, CA; (4) Département d’Informatique et de recherche opérationnelle, Université de Montréal, Montréal, CA; (5) Centre de Recherche de l’Institut Universitaire en Santé Mentale de Montréal, Montréal, CA; (6) Département de Psychiatrie, Université de Montréal, Montréal, CA
      For correspondence (Angela Tam, Pierre Bellec): angela.tam@mail.mcgill.ca pierre.bellec@criugm.qc.ca

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz055 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101681 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101682 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101683

    1. Now published in GigaScience doi: 10.1093/gigascience/giz043
      Authors: Samuel M. Nicholls (1), Joshua C. Quick (1), Shuiquan Tang (2), Nicholas J. Loman (1)
      Affiliations: (1) Institute of Microbiology and Infection, School of Biosciences, University of Birmingham, UK; (2) Zymo Research Corporation, Irvine, California, USA

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz043 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101702 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101703

    1. Now published in GigaScience doi: 10.1093/gigascience/giz028
      Authors: Venice Juanillas (1), Alexis Dereeper (2), Nicolas Beaume (1), Gaetan Droc (3), Joshua Dizon (1), John Robert Mendoza (8), Jon Peter Perdon (8), Locedie Mansueto (1), Lindsay Triplett (7), Jillian Lang (7), Gabriel Zhou (4), Kunalan Ratharanjan (4), Beth Plale (4), Jason Haga (5), Jan E. Leach (7), Manuel Ruiz (3), Michael Thomson (1, 6), Nickolai Alexandrov (1, 10), Pierre Larmande (2), Tobias Kretzschmar (1, 9), Ramil P. Mauleon (1)
      Affiliations: (1) International Rice Research Institute, Manila, Philippines; (2) Institut de recherche pour le développement (IRD), University of Montpellier, DIADE, IPME, Montpellier, France; (3) CIRAD, UMR AGAP, F-34398 Montpellier, France; (4) Indiana University, 107 S Indiana Ave, Bloomington, IN 47405, USA; (5) National Institute of Advanced Industrial Science and Technology, AIST Tsukuba Central 1, 1-1-1 Umezono, Tsukuba, Ibaraki 305-8560 JAPAN; (6) Department of Soil and Crop Sciences, Texas A&M University, Houston, USA; (7) Department of Bioagricultural Sciences and Pest Management, Colorado State University, Fort Collins, CO 80523-1177; (8) Advanced Science and Technology Institute, Department of Science and Technology, Quezon City, Philippines; (9) Southern Cross Plant Science, Southern Cross University, Lismore, Australia; (10) Inari Agriculture Inc., 200 Sidney St, Cambridge, Massachusetts, USA

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz028 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101705 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101706

    1. Now published in GigaScience doi: 10.1093/gigascience/giz059
      Authors: Bruno Louro (1), Gianluca De Moro (1), Carlos Garcia (1), Cymon J. Cox (1), Ana Veríssimo (2), Stephen J. Sabatino (2), António M. Santos (2), Adelino V. M. Canário (1)
      Affiliations: (1) CCMAR Centre of Marine Sciences, University of Algarve, Campus de Gambelas, 8005-139 Faro, Portugal; (2) CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO, Laboratório Associado, Universidade do Porto, Vairão, Portugal
      For correspondence: acanario@ualg.pt

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz059 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101710 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101711

    1. Now published in GigaScience doi: 10.1093/gigascience/giz061
      Authors: Martijn R. Molenaar (§), Aike Jeucken (§), Tsjerk A. Wassenaar (‡), Chris H. A. van de Lest (§), Jos F. Brouwers (§), J. Bernd Helms (§)
      Affiliations: (§) Department of Biochemistry and Cell Biology, Faculty of Veterinary Medicine, Utrecht University, 3584 CM, Utrecht, The Netherlands; (‡) Groningen Biomolecular Sciences and Biotechnology Institute and Zernike Institute for Advanced Materials, University of Groningen, Nijenborgh 7, 9747 AG Groningen, The Netherlands
      For correspondence: J.B.Helms@uu.nl

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz061 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101733 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101734 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101735

    1. Now published in GigaScience doi: 10.1093/gigascience/giz030. Authors: Matthew A. Conte (1), Rajesh Joshi (2), Emily C. Moore (3), Sri Pratima Nandamuri (1), William J. Gammerdinger (1), Reade B. Roberts (3), Karen L. Carleton (1), Sigbjørn Lien (2), Thomas D. Kocher (1). Affiliations: 1 Department of Biology, University of Maryland, College Park, MD 20742, USA; 2 Centre for Integrative Genetics (CIGENE), Department of Animal and Aquacultural Sciences, Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway; 3 Department of Biological Sciences and W. M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, USA.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz030 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101604 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101605

    1. Now published in GigaScience doi: 10.1093/gigascience/giz026. Authors: Justin Jiang (1), Andrea M. Quattrini (1), Warren R. Francis (2), Joseph F. Ryan (3), Estefanía Rodríguez (4), Catherine S. McFadden (1). Affiliations: 1 Department of Biology, Harvey Mudd College, 1250 N. Dartmouth Ave, Claremont, CA 91711, USA; 2 University of Southern Denmark, Dept. of Biology, Campusvej 55, Odense M 5230, Denmark; 3 Whitney Laboratory for Marine Bioscience, University of Florida, 9505 Ocean Shore Blvd., St. Augustine, FL 32080, USA; 4 Division of Invertebrate Zoology, American Museum of Natural History, Central Park West at 79th Street, New York, NY 10024, USA.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz026 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101602 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101603

    1. Now published in GigaScience doi: 10.1093/gigascience/giz033. Authors: Wenbo Chen (1), Sara Shakir (1, 3), Mahdiyeh Bigham (1), Zhangjun Fei (1, 2), Georg Jander (1). Affiliations: 1 Boyce Thompson Institute, 533 Tower Road, Ithaca NY 14853; 2 US Department of Agriculture-Agricultural Research Service, Robert W. Holley Center for Agriculture and Health, Ithaca, NY, 14853, USA; 3 Plant Genetics Lab, Gembloux Agro-Bio Tech, University of Liège, Gembloux, Belgium. For correspondence: gj32@cornell.edu

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz033 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101619 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101620 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101621

    1. Now published in GigaScience doi: 10.1093/gigascience/giz024. Authors: Ermin Hodzic (1, 2), Raunak Shrestha (1, 3), Kaiyuan Zhu (4), Kuoyuan Cheng (5), Colin C. Collins (1, 3), S. Cenk Sahinalp (1, 4). Affiliations: 1 Laboratory for Advanced Genome Analysis, Vancouver Prostate Centre, Vancouver, BC, Canada; 2 School of Computing Science, Simon Fraser University, Burnaby, BC, Canada; 3 Department of Urologic Sciences, University of British Columbia, Vancouver, BC, Canada; 4 Department of Computer Science, Indiana University, Bloomington, IN, USA; 5 Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD, USA. For correspondence: cenksahi@indiana.edu

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz024 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101626 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101627

    1. Now published in GigaScience doi: 10.1093/gigascience/giz040. Authors: Brent S. Pedersen (1, 3), Aaron R. Quinlan (1, 2, 3). Affiliations: 1 Department of Human Genetics, University of Utah, Salt Lake City, UT; 2 Department of Biomedical Informatics, University of Utah, Salt Lake City, UT; 3 USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, UT.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz040 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101641 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101642

    1. Now published in GigaScience doi: 10.1093/gigascience/giz052. Authors: Tazro Ohta (1), Tomoya Tanjo (2), Osamu Ogasawara (3). Affiliations: 1 Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Yata 1111, Mishima, Shizuoka 411-8540, Japan; 2 National Institute of Informatics, Research Organization of Information and Systems, Tokyo 101-8430, Japan; 3 DNA Data Bank of Japan, National Institute of Genetics, Research Organization of Information and Systems, Yata, Mishima 411-8540, Japan.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz052 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101638 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101639 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101640

    1. Now published in GigaScience doi: 10.1093/gigascience/giz045. Authors: Ren-Hua Chung (1), Chen-Yu Kang (1). Affiliation: 1 Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan, Taiwan. For correspondence (Ren-Hua Chung): rchung@nhri.org.tw

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz045 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101652 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101653

    1. Now published in GigaScience doi: 10.1093/gigascience/giz044. Authors: Samuel Lampa (1, 2), Martin Dahlö (1), Jonathan Alvarsson (1), Ola Spjuth (1). Affiliations: 1 Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden; 2 Department of Biochemistry and Biophysics, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Stockholm University, Stockholm, Sweden.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz044 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101656 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101657 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101659

    1. Now published in GigaScience doi: 10.1093/gigascience/giy163. Authors: Pirita Paajanen (1), George Kettleborough (1), Elena López-Girona (2), Michael Giolai (1), Darren Heavens (1), David Baker (1), Ashleigh Lister (1), Gail Wilde (2), Ingo Hein (2), Iain Macaulay (1), Glenn J. Bryan (2), Matthew D. Clark (1). Affiliations: 1 Earlham Institute, Norwich, UK; 2 The James Hutton Institute, Invergowrie, Dundee, UK. For correspondence: matt.clark@earlham.ac.uk

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy163 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101497 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101498

    1. Abstract

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz003 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101502 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101503 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101504

    1. Now published in GigaScience doi: 10.1093/gigascience/giz018. Authors: Laura-jayne Gardiner (1), Thomas Brabbs (1), Alina Akhunova (2), Hikmet Budak (3), Todd Richmond (4), Sukwinder Singh (5), Leah Catchpole (1), Eduard Akhunov (2), Anthony Hall (1, 6). Affiliations: 1 Earlham Institute, Norwich, UK; 2 Kansas State University, Department of Plant Pathology, Manhattan, KS, USA; 3 Montana State University, Department of Plant Sciences and Plant Pathology, Bozeman, MT, USA; 4 Roche Sequencing Solutions, Seattle, WA, USA; 5 CIMMYT, Obregon, Mexico; 6 School of Biological Sciences, University of East Anglia, Norwich, UK.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz018 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101535 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101536

    1. ABSTRACT

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz010 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101551 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101552

    1. A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz009 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101555

    1. Now published in GigaScience doi: 10.1093/gigascience/giy121

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy121 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101388 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101389

    1. Now published in GigaScience doi: 10.1093/gigascience/giy123. Authors: Ricardo Wurmus (1), Bora Uyar (1), Brendan Osberg (1), Vedran Franke (1), Alexander Gosdschan (1), Katarzyna Wreczycka (1), Jonathan Ronen (1), Altuna Akalin (1). Affiliation: 1 The Bioinformatics Platform, The Berlin Institute for Medical Systems Biology, Max-Delbrück Center for Molecular Medicine, Robert-Rössle-Strasse 10, 13125 Berlin, Germany. For correspondence: altuna.akalin@mdc-berlin.de

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy123 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101393 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101392

    1. Abstract

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy124 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101485 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101486 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101487

    1. Abstract

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy128 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101408 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101409

    1. A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy133 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101411 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101412

    1. Now published in GigaScience doi: 10.1093/gigascience/giy127. Authors: Xiaofeng Dong (1, 2, 3, 4), Kittipong Chaisiri (4, 5), Dong Xia (4, 6), Stuart D. Armstrong (4), Yongxiang Fang (1), Martin J. Donnelly (7), Tatsuhiko Kadowaki (2), John W. McGarry (8), Alistair C. Darby (1), Benjamin L. Makepeace (4). Affiliations: 1 Institute of Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom; 2 Department of Biological Sciences, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China; 3 School of Life Sciences, Jiangsu Normal University, Xuzhou 221116, China; 4 Institute of Infection & Global Health, University of Liverpool, L3 5RF, United Kingdom; 5 Faculty of Tropical Medicine, Mahidol University, Ratchathewi Bangkok 10400, Thailand; 6 The Royal Veterinary College, London NW1 0TU, United Kingdom; 7 Department of Vector Biology, Liverpool School of Tropical Medicine, Liverpool L3 5QA, United Kingdom; 8 Institute of Veterinary Science, University of Liverpool, Liverpool L3 5RP, United Kingdom. For correspondence: blm1@liv.ac.uk

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy127 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101420 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101421

    1. A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy140 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101432 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101429 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101431

    1. Now published in GigaScience doi: 10.1093/gigascience/giy142. Authors, each with the correspondence address listed on the page: Giulio Formenti (1; giulio.formenti@unimi.it), Matteo Chiara (2; matteo.chiara@unimi.it), Lucy Poveda (3; lucy.poveda@fgcz.uzh.ch), Kees-Jan Francoijs (4; kfrancoijs@bionanogenomics.com), Andrea Bonisoli-Alquati (5; kfrancoijs@bionanogenomics.com), Luca Canova (6; canova@unipv.it), Luca Gianfranceschi (7; luca.gianfranceschi@unimi.it), David Stephen Horner (8; david.horner@unimi.it), Nicola Saino (9; nicola.saino@unimi.it). Affiliations: 1 Department of Environmental Science and Policy, University of Milan (Milan, Italy); 2 Department of Biosciences, University of Milan (Milan, Italy); 3 Functional Genomics Center of Zurich, University of Zurich (Zurich, Switzerland); 4 Bionano Genomics (San Diego, CA, USA); 5 Department of Biological Sciences, California State Polytechnic University, Pomona (Pomona, CA, USA); 6 Department of Biochemistry, University of Pavia (Pavia, Italy); 7 Department of Biosciences, University of Milan (Milan, Italy); 8 Department of Biosciences, University of Milan (Milan, Italy); 9 Department of Environmental Science and Policy, University of Milan (Milan, Italy).

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy142 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101444 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101445

    1. Now published in GigaScience doi: 10.1093/gigascience/giy137. Authors: Y. A. Tsepilov (1), S. Zh. Sharapov (1, 2), O. O. Zaytseva (1, 2), J. Krumsek (3), C. Prehn (4), J. Adamski (4, 5, 6), G. Kastenmüller (7), R. Wang-Sattler (6, 8, 9), K. Strauch (10, 11), C. Gieger (6, 8, 9), Y. S. Aulchenko (1, 2). Affiliations: 1 Institute of Cytology and Genetics SB RAS, 630090 Novosibirsk, Russia; 2 Novosibirsk State University, 630090 Novosibirsk, Russia; 3 Institute of Computational Biology, Helmholtz Zentrum München-German Research Center for Environmental Health, 85764 Neuherberg, Germany; 4 Institute of Experimental Genetics, Genome Analysis Center, Helmholtz Zentrum München-German Research Center for Environmental Health, 85764 Neuherberg, Germany; 5 Institute of Experimental Genetics, Life and Food Science Center Weihenstephan, Technische Universität München, 85354 Freising-Weihenstephan, Germany; 6 German Center for Diabetes Research, 85764 Neuherberg, Germany; 7 Institute of Bioinformatics and Systems Biology, Helmholtz Zentrum München-German Research Center for Environmental Health, 85764 Neuherberg, Germany; 8 Research Unit of Molecular Epidemiology, Helmholtz Zentrum München-German Research Center for Environmental Health, 85764 Neuherberg, Germany; 9 Institute of Epidemiology II, Helmholtz Zentrum München-German Research Center for Environmental Health, 85764 Neuherberg, Germany; 10 Institute of Genetic Epidemiology, Helmholtz Zentrum München - German Research Center for Environmental Health, 85764 Neuherberg, Germany; 11 Institute of Medical Informatics, Biometry and Epidemiology, Chair of Genetic Epidemiology, Ludwig-Maximilians-Universität, 80539, Munich, Germany. For correspondence: yurii@bionet.nsc.ru

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy137 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101442 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101443

    1. Now published in GigaScience doi: 10.1093/gigascience/giy147

      Authors: Angelica da Silva Lantyer, Niccolò Calcini, Ate Bijlsma, Koen Kole, Melanie Emmelkamp, Manon Peeters, Wim J. J. Scheenen, Fleur Zeldenrust, Tansu Celikel

      Affiliation (all authors): Department of Neurophysiology, Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy147 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101446 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101447 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101448

    1. Now published in GigaScience doi: 10.1093/gigascience/giy148

      Authors: Chris-Andre Leimeister (1), Jendrik Schellhorn (1), Svenja Schöbel (1), Michael Gerth (2), Christoph Bleidorn (3, 4), Burkhard Morgenstern (1, 5)

      Affiliations: (1) University of Göttingen, Department of Bioinformatics, Goldschmidtstr. 1, 37077 Göttingen, Germany; (2) Institute for Integrative Biology, University of Liverpool, Biosciences Building, Crown Street, L69 7ZB Liverpool, UK; (3) University of Göttingen, Department of Animal Evolution and Biodiversity, Untere Karspüle 2, 37073 Göttingen, Germany; (4) Museo Nacional de Ciencias Naturales, Spanish National Research Council (CSIC), 28006 Madrid, Spain; (5) Göttingen Center of Molecular Biosciences (GZMB), Justus-von-Liebig-Weg 11, 37077 Göttingen

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy148 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101560 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101561 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101562 Reviewer 4: http://dx.doi.org/10.5524/REVIEW.101563 Reviewer 5: http://dx.doi.org/10.5524/REVIEW.101564

    1. Now published in GigaScience doi: 10.1093/gigascience/giy149

      Authors: Kristian Peters (1), James Bradbury (2), Sven Bergmann (3, 4), Marco Capuccini (5, 6), Marta Cascante (7), Pedro de Atauri (8), Timothy M D Ebbels (9), Carles Foguet (8), Robert Glen (9, 10), Alejandra Gonzalez-Beltran (11), Ulrich Guenther (22), Evangelos Handakas (9), Thomas Hankemeier (12, 13), Kenneth Haug (14), Stephanie Herman (15), Petr Holub (29), Massimiliano Izzo (11), Daniel Jacob (16), David Johnson (11), Fabien Jourdan (17), Namrata Kale (14), Ibrahim Karaman (18), Bita Khalili (3, 4), Payam Emami Khonsari (15), Kim Kultima (15), Samuel Lampa (6), Anders Larsson (19), Christian Ludwig (22), Pablo Moreno (14), Steffen Neumann (1, 20), Jon Ander Novella (19), Claire O’Donovan (14), Jake TM Pearce (9), Alina Peluso (9), Luca Pireddu (21), Michelle AC Reed (22), Philippe Rocca-Serra (11), Pierrick Roger (23), Antonio Rosato (24), Rico Rueedi (3, 4), Christoph Ruttkies (1), Noureddin Sadawi (9), Reza M Salek (25), Susanna-Assunta Sansone (11), Vitaly Selivanov (8), Ola Spjuth (6), Daniel Schober (1), Etienne A. Thévenot (23), Mattia Tomasoni (3, 4), Merlijn van Rijswijk (26, 27), Michael van Vliet (28), Mark R Viant (2), Ralf J. M. Weber (2), Gianluigi Zanetti (21), Christoph Steinbeck (30)

      Affiliations: (1) Leibniz Institute of Plant Biochemistry, Stress and Developmental Biology, Weinberg 3, 06120 Halle (Saale), Germany; (2) School of Biosciences, University of Birmingham, Edgbaston, Birmingham, B15 2TT, United Kingdom; (3) Department of Computational Biology, University of Lausanne, Lausanne, Switzerland; (4) Swiss Institute of Bioinformatics, Lausanne, Switzerland; (5) Division of Scientific Computing, Department of Information Technology, Uppsala University, Sweden; (6) Department of Pharmaceutical Biosciences, Uppsala University, Box 591, 751 24 Uppsala, Sweden; (7) Department of Biochemistry and Molecular Biomedicine, Universitat de Barcelona; Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas; (8) Department of Biochemistry and Molecular Biomedicine, Universitat de Barcelona; Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), Instituto de Salud Carlos III (ISCIII), Spain; (9) Department of Surgery & Cancer, Imperial College London, South Kensington, London, SW7 2AZ, United Kingdom; (10) Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB21EW, United Kingdom; (11) Oxford e-Research Centre, Department of Engineering Science, University of Oxford, 7 Keble Road, OX1 3QG, Oxford, UK; (12) Netherlands Metabolomics Center, Leiden, 2333 CC, Netherlands; (13) Division of Systems Biomedicine and Pharmacology, Leiden Academic Centre for Drug; (14) European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK; (15) Department of Medical Sciences, Clinical Chemistry, Uppsala University, 751 85 Uppsala, Sweden; (16) INRA, University of Bordeaux, Plateforme Métabolome Bordeaux-MetaboHUB, 33140 Villenave d’Ornon, France; (17) INRA - French National Institute for Agricultural Research, UMR1331, Toxalim, Research Centre in Food Toxicology, Toulouse, France; (18) Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, St. Mary’s Campus, Norfolk Place, W2 1PG, London, United Kingdom; (19) National Bioinformatics Infrastructure Sweden, Uppsala University, Uppsala, Sweden; Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden; (20) German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany; (21) Distributed Computing Group, CRS4, Pula, Italy; (22) College of Medical and Dental Sciences, University of Birmingham, Edgbaston, Birmingham, B15 2TT, United Kingdom; (23) CEA, LIST, Laboratory for Data Analysis and Systems’ Intelligence, MetaboHUB, Gif-Sur-Yvette F-91191, France; (24) Magnetic Resonance Center (CERM) and Department of Chemistry, University of Florence and CIRMMP, 50019 Sesto Fiorentino, Florence, Italy; (25) European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, U.K; (26) ELIXIR-NL, Dutch Techcentre for Life Sciences, Utrecht, 3503 RM, Netherlands; (27) Netherlands Metabolomics Center, Leiden, 2333 CC, The Netherlands; (28) Division of Systems Biomedicine and Pharmacology, Leiden Academic Centre for Drug Research (LACDR), Leiden University, Leiden, 2333 CC, The Netherlands; (29) BBMRI-ERIC, Graz, Austria; (30) Cheminformatics and Computational Metabolomics, Institute for Analytical Chemistry, Lessingstr. 8, 07743 Jena, Germany

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy149 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101467 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101468

    1. Now published in GigaScience doi: 10.1093/gigascience/giy136

      Authors: Fadhl M. Al-Akwaa (2), Sijia Huang (1, 2), Lana X. Garmire (1, 2)

      Affiliations: (1) University of Hawaii Cancer Center, Department of Epidemiology, 701 Ilalo Street, Honolulu, HI USA 96813; (2) Molecular Biology and Bioengineering Graduate Program, University of Hawaii at Manoa, Honolulu, HI, USA 96822

      For correspondence: Lgarmire@cc.hawaii.edu

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy136 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101473 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101474 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101475

    1. ABSTRACT

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy131 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101417 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101418

    1. Now published in GigaScience doi: 10.1093/gigascience/giy158

      Authors: Lisa K. Johnson (1, 2), Harriet Alexander (1), C. Titus Brown (1, 2, 3)

      Affiliations: (1) Department of Population Health & Reproduction, School of Veterinary Medicine, University of California Davis; (2) Molecular, Cellular, and Integrative Physiology Graduate Group, University of California Davis; (3) Genome Center, University of California Davis

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy158 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101581 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101582 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101583 Reviewer 4: http://dx.doi.org/10.5524/REVIEW.101584

    1. Now published in GigaScience doi: 10.1093/gigascience/giy112

      Authors: Simon P Sadedin (1), Justine A Ellis (2, 3, 4), Seth L Masters (5), Alicia Oshlack (1, 6)

      Affiliations: (1) Bioinformatics, Murdoch Childrens Research Institute, Royal Children's Hospital, Flemington Road, Parkville, Victoria 3052 Australia; (2) Genes Environment & Complex Disease, Murdoch Childrens Research Institute, Royal Children’s Hospital, Flemington Road, Parkville, Victoria 3052 Australia; (3) Department of Paediatrics, University of Melbourne, Victoria 3010 Australia; (4) Centre for Social and Early Emotional Development, Faculty of Health, Deakin University, Burwood, Victoria 3125 Australia; (5) Inflammation Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Victoria 3052, Australia; (6) Department of BioScience, University of Melbourne, Parkville 3050, Australia

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy112 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101340 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101341 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101342 Reviewer 4: http://dx.doi.org/10.5524/REVIEW.101343 Reviewer 5: http://dx.doi.org/10.5524/REVIEW.101345

    1. Now published in GigaScience doi: 10.1093/gigascience/giy117

      Authors: Zhouchun Shang (1, 2, 3, 4), Dongsheng Chen (2, 3), Quanlei Wang (2, 3, 4, 5), Shengpeng Wang (2, 3), Qiuting Deng (2, 3), Liang Wu (2, 3, 5, 6), Xiangning Ding (2, 3), Shiyou Wang (2, 3), Jixing Zhong (2, 3), Doudou Zhang (7), Xiaodong Cai (7), Shida Zhu (2, 3, 4), Huanming Yang (2, 8), Longqi Liu (2, 3), J. Lynn Fink (2, 9), Fang Chen (2, 3, 10), Xiaoqing Liu (1), Zhengliang Gao (1), Xun Xu (2, 3)

      Affiliations: (1) Shanghai Tenth People’s Hospital, Tongji University School of Medicine, Shanghai, China; (2) BGI-Shenzhen, Shenzhen, China; (3) China National GeneBank, BGI-Shenzhen, Shenzhen, China; (4) Shenzhen Engineering Laboratory for Innovative Molecular Diagnostics, BGI-Shenzhen, Shenzhen, China; (5) BGI Education Center, University of Chinese Academy of Sciences, Shenzhen, China; (6) Shenzhen Key Laboratory of Neurogenomics, BGI-Shenzhen, Shenzhen, China; (7) Department of Neurosurgery, Shenzhen Second People’s Hospital, Shenzhen University 1st Affiliated Hospital, Shenzhen, Guangdong, China; (8) James D. Watson Institute of Genome Sciences, Hangzhou, China; (9) The University of Queensland, Diamantina Institute (UQDI), Brisbane, QLD, Australia; (10) Laboratory of Genomics and Molecular Biomedicine, Department of Biology, University of Copenhagen, DK-2100, Copenhagen, Denmark

      For correspondence: xuxun@genomics.cn (listed for Zhengliang Gao and Xun Xu)

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy117 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101371 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101372

    1. Now published in GigaScience doi: 10.1093/gigascience/giy103

      Authors: Matthew Jacobson (1, 2), Adriana Estela Sedeño-Cortés (3), Paul Pavlidis (1, 2)

      Affiliations: (1) Michael Smith Laboratories, University of British Columbia; (2) Department of Psychiatry, University of British Columbia; (3) Graduate Program in Bioinformatics, University of British Columbia

      For correspondence: paul@msl.ubc.ca

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy103 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101302 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101303

    1. Now published in GigaScience doi: 10.1093/gigascience/giy126

      Authors: Jakub Pospíšil (1), Tomáš Lukeš (1, 2), Justin Bendesky (3), Karel Fliegel (1), Kathrin Spendier (3, 4), Guy M. Hagen (3)

      Affiliations: (1) Department of Radioelectronics, Faculty of Electrical Engineering, Czech Technical University in Prague, Technická 2, 16627 Prague 6, Czech Republic; (2) Laboratory of Nanoscale Biology, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland; (3) UCCS center for the Biofrontiers Institute, University of Colorado at Colorado Springs, 1420 Austin Bluffs Parkway, Colorado Springs, Colorado, 80918, USA; (4) Department of Physics and Energy Science, University of Colorado at Colorado Springs, 1420 Austin Bluffs Parkway, Colorado Springs, Colorado, 80918, USA

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy126 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101401 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101402

    1. Now published in GigaScience doi: 10.1093/gigascience/giy090. Authors: Jie Zheng (1), Tom G. Richardson (1), Louise A. C. Millard (1, 2), Gibran Hemani (1), Christopher Raistrick, Bjarni Vilhjalmsson (3), Philip Haycock (1), Tom R Gaunt (1). Affiliations: (1) MRC Integrative Epidemiology Unit, University of Bristol, Oakfield House, Bristol, UK; (2) Intelligent Systems Laboratory, University of Bristol, Bristol, UK; (3) Århus Center for Bioinformatics BIRC, Aarhus University. For correspondence: jie.zheng@bristol.ac.uk, tom.gaunt@bristol.ac.uk

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy090 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101321 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101322

    1. Now published in GigaScience doi: 10.1093/gigascience/giy096. Authors: Sung-Huan Yu (1), Jörg Vogel (1), Konrad U. Förstner (1, 2). Affiliations: (1) Institute of Molecular Infection Biology (IMIB), University of Würzburg, 97080 Würzburg, Germany; (2) Core Unit Systems Medicine, University of Würzburg, 97080 Würzburg, Germany. For correspondence: konrad.foerstner@uni-wuerzburg.de

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy096 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101337 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101338 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101339

    1. Now published in GigaScience doi: 10.1093/gigascience/giy079. Authors: Breon M Schmidt (1), Nadia M Davidson (1), Anthony DK Hawkins (1), Ray Bartolo (1), Ian J Majewski (2), Paul G Ekert (1), Alicia Oshlack (1). Affiliations: (1) Murdoch Children’s Research Institute, Royal Children’s Hospital, Flemington Road, Parkville, Vic, Australia; (2) Walter and Eliza Hall Institute of Medical Research, Royal Parade, Vic 3050, Australia.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy079 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101243 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101244 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101245

    1. Now published in GigaScience doi: 10.1093/gigascience/giy081. Authors: Li Charlie Xia (1, 2), Dongmei Ai (3), Hojoon Lee (1), Noemi Andor (1), Chao Li (3), Nancy R. Zhang (2), Hanlee P. Ji (1, 4). Affiliations: (1) Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305; (2) Department of Statistics, the Wharton School, University of Pennsylvania, Philadelphia, PA 18014; (3) School of Mathematics and Physics, University of Science and Technology Beijing, 30 Xueyuan Road, Haidian District, Beijing 100083 P. R. China; (4) Stanford Genome Technology Center, Stanford University, Palo Alto, CA 94304. For correspondence: genomics_ji@stanford.edu

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy081 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101246 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101247 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101248

    1. Now published in GigaScience doi: 10.1093/gigascience/giy083. Authors: Luke Zappia (1, 2), Alicia Oshlack (1, 2). Affiliations: (1) Bioinformatics, Murdoch Children’s Research Institute; (2) School of Biosciences, University of Melbourne.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy083 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101258 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101259 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101260

    1. Now published in GigaScience doi: 10.1093/gigascience/giy086. Authors: Jang-il Sohn (1, 2), Kyoungwoo Nam (1), Hyosun Hong (1), Jun-Mo Kim (3), Dajeong Lim (3), Kyung-Tai Lee (3), Yoon Jung Do (3), Chang Yeon Cho (4), Namshin Kim (5), Han-Ha Chae (3), Jin-Wu Nam (1, 2). Affiliations: (1) Department of Life Science, Hanyang University, Seoul 133-791, Republic of Korea; (2) Research Institute for Convergence of Basic Sciences, Hanyang University, Seoul 133-791, Republic of Korea; (3) Department of Animal Biotechnology & Environment, National Institute of Animal Science, RDA, Wanju 55365, Republic of Korea; (4) Animal Genetic Resource Research Center, National Institute of Animal Science, RDA, Namwon 55717, Republic of Korea; (5) Personalized Genomic Medicine Research Center, KRIBB, Daejeon 34141, Republic of Korea.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy086 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101252 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101253

    1. Now published in GigaScience doi: 10.1093/gigascience/giy093. Authors: Luca Venturini (1), Shabhonam Caim (1, 2), Gemy G Kaithakottil (1), Daniel L Mapleson (1), David Swarbreck (1). Affiliations: (1) Earlham Institute, Norwich Research Park, NR4 7UZ Norwich, United Kingdom; (2) Institute of Food Research, Norwich Research Park, NR4 7UA Norwich, United Kingdom. For correspondence: david.swarbreck@earlham.ac.uk

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy093 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101278 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101279

    1. Now published in GigaScience doi: 10.1093/gigascience/giy092. Authors: Pai Zhang, Ling-Hong Hung, Wes Lloyd, Ka Yee Yeung. Affiliation (all authors): Institute of Technology, University of Washington, Tacoma, Washington 98402, USA.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy092 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101280 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101281 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101282

    1. Now published in GigaScience doi: 10.1093/gigascience/giy069. Authors: Fernando Meyer (1, 2), Peter Hofmann (1, 2), Peter Belmann (1, 2, 3, 4), Ruben Garrido-Oter (5, 6), Adrian Fritz (1, 2), Alexander Sczyrba (3, 4), Alice C. McHardy (1, 2). Affiliations: (1) Department of Computational Biology of Infection Research, Helmholtz Centre for Infection Research (HZI), Braunschweig, Germany; (2) Braunschweig Integrated Centre of Systems Biology (BRICS), Braunschweig, Germany; (3) Faculty of Technology, Bielefeld University, Bielefeld, Germany; (4) Center for Biotechnology, Bielefeld University, Bielefeld, Germany; (5) Department of Plant Microbe Interactions, Max Planck Institute for Plant Breeding Research, Cologne, Germany; (6) Cluster of Excellence on Plant Sciences (CEPLAS). For correspondence: Alice.McHardy@helmholtz-hzi.de

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy069 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101202 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101203

    1. Now published in GigaScience doi: 10.1093/gigascience/giy070. Authors: LM Simon (1), AJ Westermann (2, 3), M Engel (1, 4), AHA Elbehery (5), B Hense (1), M Heinig (1), L Deng (5), FJ Theis (1, 6). Affiliations: (1) Helmholtz Zentrum München, German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany; (2) Institute for Molecular Infection Biology, University Würzburg, Würzburg, Germany; (3) Helmholtz Institute for RNA-Based Infection Research (HIRI), Würzburg, Germany; (4) Helmholtz Zentrum München, German Research Center for Environmental Health, Scientific Computing Research Unit, Neuherberg, Germany; (5) Helmholtz Zentrum München, German Research Center for Environmental Health, Institute of Virology, Neuherberg, Germany; (6) Department of Mathematics, Technische Universität München, Munich, Germany.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy070 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101204 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101205

    1. Now published in GigaScience doi: 10.1093/gigascience/giy072. Authors: Alessia Visconti (1), Tiphaine C. Martin (1), Mario Falchi (1). Affiliation: (1) Department of Twin Research and Genetic Epidemiology, King’s College London. For correspondence: alessia.visconti@kcl.ac.uk

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy072 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101208 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101209

    1. Now published in GigaScience doi: 10.1093/gigascience/giy073. Authors: Carlos A. Manacorda (1), Sebastian Asurmendi (1, 2). Affiliations: (1) Instituto de Biotecnología, CICVyA, INTA, Argentina; (2) CONICET, Argentina.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy073 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101211 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101212 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101213

    1. Now published in GigaScience doi: 10.1093/gigascience/giy077. Authors: Yang-Min Kim (1, 2, 3, 4), Jean-Baptiste Poline (5), Guillaume Dumas (1, 2, 3, 4). Affiliations: (1) Institut Pasteur, Human Genetics and Cognitive Functions Unit, Paris, France; (2) CNRS UMR 3571 Genes, Synapses and Cognition, Institut Pasteur, Paris, France; (3) University Paris Diderot, Sorbonne Paris Cité, Paris, France; (4) Centre de Bioinformatique, Biostatistique et Biologie Intégrative (C3BI, USR 3756 Institut Pasteur and CNRS), Paris, France; (5) Henry H. Wheeler Jr. Brain Imaging Center, Helen Wills Neuroscience Institute, University of California, Berkeley, California, USA.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy077 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101237 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101238

    1. Now published in GigaScience doi: 10.1093/gigascience/giy045. Authors: Nadia M Davidson (1, 2), Alicia Oshlack (1, 2). Affiliations: (1) Murdoch Childrens Research Institute, Royal Children’s Hospital, Victoria, Australia; (2) School of Bio-Sciences, University of Melbourne, Victoria, Australia. For correspondence: nadia.davidson@mcri.edu.au, alicia.oshlack@mcri.edu.au

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy045 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101125

    1. Now published in GigaScience doi: 10.1093/gigascience/giy048. Authors: Alex Di Genova (1, 2, 3, 4, 5), Gonzalo A. Ruz (1, 6), Marie-France Sagot (3, 4), Alejandro Maass (2, 5, 7). Affiliations: (1) Facultad de Ingeniería y Ciencias, Universidad Adolfo Ibáñez, Santiago, Chile; (2) Mathomics Bioinformatics Laboratory, Center for Mathematical Modeling, University of Chile, Av. Blanco Encalada 2120, 7th floor, Santiago, Chile; (3) Inria Grenoble Rhône-Alpes, 655, Avenue de l’Europe, 38334 Montbonnot, France; (4) CNRS, UMR5558, Université Claude Bernard Lyon 1, 43, Boulevard du 11 Novembre 1918, 69622 Villeurbanne, France; (5) Fondap Center for Genome Regulation, Av. Blanco Encalada 2085, 3rd floor, Santiago, Chile; (6) Center of Applied Ecology and Sustainability (CAPES), Santiago, Chile; (7) Department of Mathematical Engineering, University of Chile, Av. Blanco Encalada 2120, 5th floor, Santiago, Chile. For correspondence: marie-france.sagot@inria.fr, amaass@dim.uchile.cl

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy048 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101129 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101130

    1. Now published in GigaScience doi: 10.1093/gigascience/giy053. Authors: Fu-Hao Lu (1), Neil McKenzie (1), George Kettleborough (2), Darren Heavens (2), Matthew D. Clark (2), Michael W. Bevan (1). Affiliations: (1) John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK; (2) The Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy053 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101147 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101148

    1. Now published in GigaScience doi: 10.1093/gigascience/giy057. Authors: Bérénice Batut (1), Kévin Gravouil (2, 3, 4), Clémence Defois (2), Saskia Hiltemann (5), Jean-François Brugère (2), Eric Peyretaillade (2), Pierre Peyret (2). Affiliations: (1) Bioinformatics Group, Department of Computer Science, University of Freiburg, Germany; (2) Université Clermont Auvergne, INRA, MEDIS, F-63000 Clermont-Ferrand, France; (3) Université Clermont Auvergne, CNRS, LMGE, F-63000 Clermont-Ferrand, France; (4) Université Clermont Auvergne, CNRS, LIMOS, F-63000 Clermont-Ferrand, France; (5) Department of Bioinformatics, Erasmus University Medical Center, Rotterdam, 3015 CE, Netherlands. For correspondence: berenice.batut@gmail.com, pierre.peyret@uca.fr

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy057 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101163 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101164 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101165

    1. Now published in GigaScience doi: 10.1093/gigascience/giy061. Authors: Kingshuk Mukherjee (1), Darshan Washimkar (2), Martin D. Muggli (2), Leena Salmela (3), Christina Boucher (1). Affiliations: (1) Department of Computer and Information Science and Engineering, University of Florida, Gainesville; (2) Department of Computer Science, Colorado State University, Fort Collins; (3) Department of Computer Science, Helsinki Institute for Information Technology HIIT, University of Helsinki. For correspondence: kingdgp@ufl.edu

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy061 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101178 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101179 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101180

    1. Now published in GigaScience doi: 10.1093/gigascience/giy059. Authors: Swati Parekh (1), Christoph Ziegenhain (1), Beate Vieth (1), Wolfgang Enard (1), Ines Hellmann (1). Affiliations: 1 Anthropology & Human Genomics, Department of Biology II, Ludwig-Maximilians University, 82152 Martinsried, Germany.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy059 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101183 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101184

    1. Now published in GigaScience doi: 10.1093/gigascience/giy064. Authors: Jonathan R. Belyeu (1, 2), Thomas J. Nicholas (1, 2), Brent S. Pedersen (1, 2), Thomas A. Sasani (1, 2), James M. Havrilla (1, 2), Stephanie N. Kravitz (1, 2), Megan E. Conway (1), Brian K. Lohman (1, 2), Aaron R. Quinlan (1, 2, 3)+, Ryan M. Layer (1, 2)+. Affiliations: 1 Department of Human Genetics, University of Utah, Salt Lake City, UT; 2 USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, UT; 3 Department of Biomedical Informatics, University of Utah, Salt Lake City, UT. + To whom correspondence should be addressed.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy064 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101187 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101188

    1. Now published in GigaScience doi: 10.1093/gigascience/giy034. Authors: Florence McLean (1), Duncan Berger (1), Dominik R. Laetsch (1), Hillel T. Schwartz (2), Mark Blaxter (1). Affiliations: 1 Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3JT, UK; 2 Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, USA.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy034 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101075 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101076

    1. Now published in GigaScience doi: 10.1093/gigascience/giy033. Authors: Aaron Pomerantz (1)*, Nicolás Peñafiel (2), Lucas Bustamante (3), Frank Pichardo (3), Luis A. Coloma (4), César L. Barrio-Amorós (5), David Salazar-Valenzuela (2), Stefan Prost (1, 6)*. Affiliations: 1 Department of Integrative Biology, University of California, Berkeley, CA, USA; 2 Centro de Investigación de la Biodiversidad y Cambio Climático (BioCamb) e Ingeniería en Biodiversidad y Recursos Genéticos, Facultad de Ciencias de Medio Ambiente, Universidad Tecnológica Indoamérica, Machala y Sabanilla, Quito, Ecuador; 3 Tropical Herping, Quito, Ecuador; 4 Centro Jambatu de Investigación y Conservación de Anfibios, Fundación Otonga, Quito, Ecuador; 5 Doc Frog Expeditions, Uvita, Costa Rica; 6 Program for Conservation Genomics, Department of Biology, Stanford University, Stanford, CA, USA. * For correspondence: Pomerantz_aaron@berkeley.edu, stprost@stanford.edu

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy033 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101071 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101072

    1. Now published in GigaScience doi: 10.1093/gigascience/giy037. Authors: Haotian Teng (1)*, Minh Duc Cao (1), Michael B. Hall (1), Tania Duarte (1), Sheng Wang (2), Lachlan J.M. Coin (1)*. Affiliations: 1 Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, QLD 4072, Australia; 2 Department of Human Genetics, University of Chicago, IL 60637, United States. * For correspondence: haotian.teng@uq.net.au, l.coin@imb.uq.edu.au

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy037 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101102 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101103 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101104

    1. Now published in GigaScience doi: 10.1093/gigascience/giy015. Authors: Harry A. Thorpe (1), Sion C. Bayliss (1), Samuel K. Sheppard (1), Edward J. Feil (1). Affiliations: 1 The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy015 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101033 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101034

    1. Abstract

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy025 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101059 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101060

    1. Now published in GigaScience doi: 10.1093/gigascience/giy004. Authors: C Foulon (a, b, c)*, L Cerliani (a, b, c), S Kinkingnéhun (a), R Levy (b), C Rosso (c, d, e), M Urbanski (a, b, f), E Volle (a, b, c), M Thiebaut de Schotten (a, b, c)*. Affiliations: a Brain Connectivity and Behaviour Group, Sorbonne Universities, Paris, France; b Frontlab, Institut du Cerveau et de la Moelle épinière (ICM), UPMC UMRS 1127, Inserm U 1127, CNRS UMR 7225, Paris, France; c Centre de Neuroimagerie de Recherche CENIR, Groupe Hospitalier Pitié-Salpêtrière, Paris, France; d Abnormal Movements and Basal Ganglia team, Inserm U 1127, CNRS UMR 7225, Sorbonne Universities, UPMC Univ Paris 06, Institut du Cerveau et de la Moelle épinière, ICM, Paris, France; e APHP, Urgences Cérébro-Vasculaires, Groupe Hospitalier Pitié-Salpêtrière, Paris, France; f Medicine and Rehabilitation Department, Hôpitaux de Saint-Maurice, Saint-Maurice, France. * For correspondence: hd.chrisfoulon@gmail.com, michel.thiebaut@gmail.com

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy004 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101000 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101001

    1. Now published in GigaScience doi: 10.1093/gigascience/giy009. Authors: Rachael E. Workman (1), Alexander M. Myrka (2), Elizabeth Tseng (4), G. William Wong (3), Kenneth C. Welch Jr. (2), Winston Timp (1). Affiliations: 1 Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA; 2 Department of Biological Sciences, University of Toronto Scarborough, Toronto, Ontario, Canada and Department of Cell & Systems Biology, University of Toronto, Toronto, Ontario, Canada; 3 Department of Physiology and Center for Metabolism and Obesity Research, Johns Hopkins University School of Medicine, Baltimore, MD, USA; 4 Pacific Biosciences, Menlo Park, California, USA.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy009 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101011 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101012

    1. Now published in GigaScience doi: 10.1093/gigascience/gix136. Authors: Quan H. Nguyen (1, 2), Ross L. Tellam (1), Marina Naval-Sanchez (1), Laercio R. Porto-Neto (1), William Barendse (3), Antonio Reverter (1), Benjamin Hayes (4), James Kijas (1), Brian P. Dalrymple (1, 5)*. Affiliations: 1 CSIRO Agriculture, 306 Carmody Road, St. Lucia, 4067, QLD, Australia; 2 Divisions of Genomics of Development and Disease, Institute for Molecular Bioscience, University of Queensland, 306 Carmody Road, St. Lucia, 4067, QLD, Australia; 3 School of Veterinary Science, University of Queensland, Gatton, 4343, QLD, Australia; 4 The Queensland Alliance for Agriculture and Food Innovation (QAAFI), University of Queensland, 4067, QLD, Australia; 5 Institute of Agriculture, The University of Western Australia, Perth, Western Australia, 6009, Australia. * For correspondence: brian.dalrymple@uwa.edu.au

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/gix136 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101018 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101019

    1. A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/gix135 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100980 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100979

    1. A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giy002 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100989 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100988

    1. Now published in GigaScience doi: 10.1093/gigascience/gix107. Authors: Joshua Lynch (1, 2), Karen Tang (1, 2), Sambhawa Priya (1, 2), Joanna Sands (1, 2), Margaret Sands (1, 2), Evan Tang (1, 2), Dan Knights (4, 5)*, Ran Blekhman (1, 2)*. Affiliations: 1 Department of Genetics, Cell Biology, and Development, University of Minnesota, Minneapolis, MN, USA; 2 Department of Ecology, Evolution, and Behavior, University of Minnesota, Minneapolis, MN, USA; 4 Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN, USA; 5 Biotechnology Institute, University of Minnesota, Minneapolis, MN, USA. * For correspondence: blekhman@umn.edu, dknights@umn.edu

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/gix107 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100893

    1. A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/gix102 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100883 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100884

    1. Now published in GigaScience doi: 10.1093/gigascience/gix117. Authors: Zhikai Liang (1), Piyush Pandey (2), Vincent Stoerger (3), Yuhang Xu (4), Yumou Qiu (4), Yufeng Ge (2), James C. Schnable (1). Affiliations: 1 Center for Plant Science Innovation, Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, 68503, USA; 2 Department of Biological System Engineering, University of Nebraska-Lincoln, Lincoln, 68503, USA; 3 Plant Phenotyping Facilities Manager, University of Nebraska-Lincoln, Lincoln, 68503, USA; 4 Department of Statistics, University of Nebraska-Lincoln, Lincoln, 68503, USA.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/gix117 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100930

    1. Now published in GigaScience doi: 10.1093/gigascience/gix134. Authors: Robert Bukowski (1), Xiaosen Guo (2, 3), Yanli Lu (4), Cheng Zou (5), Bing He (2), Zhengqin Rong (2), Bo Wang (2), Dawen Xu (2), Bicheng Yang (2), Chuanxiao Xie (5), Longjiang Fan (6), Shibin Gao (4), Xun Xu (2), Gengyun Zhang (2), Yingrui Li (2), Yinping Jiao (7), John Doebley (8), Jeffrey Ross-Ibarra (9), Vince Buffalo (9), M. Cinta Romay (10), Edward S. Buckler (10, 11), Yunbi Xu (5, 12), Doreen Ware (7), Qi Sun (1). Affiliations: 1 Bioinformatics Facility, Institute of Biotechnology, Cornell University, Ithaca, NY 14853; 2 BGI-Shenzhen, Shenzhen 518083, China; 3 Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, DK-2200 Copenhagen, Denmark; 4 Maize Research Institute, Sichuan Agricultural University, Wenjiang 611130, Sichuan, China; 5 Institute of Crop Science, Chinese Academy of Agricultural Sciences/National Key Facilities for Crop Gene Resource and Genetic Improvement, Beijing 100081, China; 6 Institute of Crop Science and Institute of Bioinformatics, Department of Agronomy, Zhejiang University, Hangzhou 310058, China; 7 Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA; 8 Department of Genetics, University of Wisconsin, Madison, Wisconsin, USA; 9 Department of Plant Sciences, University of California, Davis, California, USA; 10 Institute for Genomic Diversity, Cornell University, Ithaca, NY 14853; 11 US Department of Agriculture-Agricultural Research Service, Ithaca, NY 14853; 12 International Maize and Wheat Improvement Center (CIMMYT), El Batan 56130, Texcoco, Mexico.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/gix134 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100974 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100975

    1. Now published in GigaScience doi: 10.1093/gigascience/gix096 Valerie De Anda [1] Departamento de Ecología Evolutiva, Instituto de Ecología, Universidad Nacional Autónoma de México, 70-275, Coyoacán 04510 México D.F.; Icoquih Zapata-Peñasco [2] Dirección de Investigación en Transformación de Hidrocarburos. Instituto Mexicano del Petróleo, Eje Central Lázaro Cárdenas, Norte 152, Col. San Bartolo Atepehuacan, 07730, México; Augusto Cesar Poot-Hernandez [3] Departamento de Ingeniería de Sistemas Computacionales y Automatización. Sección de Ingeniería de Sistemas Computacionales. Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas; Luis E. Eguiarte [1] Departamento de Ecología Evolutiva, Instituto de Ecología, Universidad Nacional Autónoma de México, 70-275, Coyoacán 04510 México D.F.; Bruno Contreras-Moreira [4] Estación Experimental de Aula Dei, Consejo Superior de Investigaciones Científicas (EEAD-CSIC), Avda. Montañana, 1005, Zaragoza 50059, Spain [5] Fundación ARAID, calle María de Luna 11, 50018 Zaragoza, Spain; Valeria Souza [1] Departamento de Ecología Evolutiva, Instituto de Ecología, Universidad Nacional Autónoma de México, 70-275, Coyoacán 04510 México D.F. For correspondence: valdeanda@ciencias.unam.mx bcontreras@eead.csic.es souza@unam.mx

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/gix096 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100872 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100873

    1. Now published in GigaScience doi: 10.1093/gigascience/gix097 Aleksey V. Zimin [1] Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD [2] Institute for Physical Sciences and Technology, University of Maryland, College Park, MD; Daniela Puiu [1] Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD; Richard Hall [3] Pacific Biosciences, Menlo Park, CA; Sarah Kingan [3] Pacific Biosciences, Menlo Park, CA; Steven L. Salzberg [1] Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD [5] Departments of Biomedical Engineering, Computer Science, and Biostatistics, Johns Hopkins University, Baltimore, MD. For correspondence: salzberg@jhu.edu

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/gix097 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100877 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100879 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.100878 Reviewer 4: http://dx.doi.org/10.5524/REVIEW.100880

    1. Now published in GigaScience doi: 10.1093/gigascience/gix090 Brent S. Pedersen [1] Department of Human Genetics, University of Utah, Salt Lake City, UT [3] USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, UT; Ryan L. Collins [4] Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA [6] Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute, Cambridge, MA [7] Program in Bioinformatics and Integrative Genomics, Division of Medical Sciences, Harvard Medical School, Boston, MA; Michael E. Talkowski [4] Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA [5] Department of Neurology, Harvard Medical School, Boston, MA [6] Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute, Cambridge, MA [7] Program in Bioinformatics and Integrative Genomics, Division of Medical Sciences, Harvard Medical School, Boston, MA; Aaron R. Quinlan [1] Department of Human Genetics, University of Utah, Salt Lake City, UT [2] Department of Biomedical Informatics, University of Utah, Salt Lake City, UT [3] USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, UT

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/gix090 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100844 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100845

    1. Now published in GigaScience doi: 10.1093/gigascience/gix056 Gregory D. Marquart, Bethesda, MD 20892; Kathryn M. Tabor, Bethesda, MD 20892; Mary Brown, Bethesda, MD 20892; Harold A. Burgess, Bethesda, MD 20892

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/gix056 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100800

    1. Now published in GigaScience doi: 10.1093/gigascience/gix078 Ling-Hong Hung [1] Institute of Technology, Box 358426, University of Washington, Tacoma, WA; Kaiyuan Shi [1] Institute of Technology, Box 358426, University of Washington, Tacoma, WA; Migao Wu [1] Institute of Technology, Box 358426, University of Washington, Tacoma, WA; William Chad Young [2] Department of Statistics, Box 354320, University of Washington, Seattle, WA; Adrian E. Raftery [2] Department of Statistics, Box 354320, University of Washington, Seattle, WA; Ka Yee Yeung [1] Institute of Technology, Box 358426, University of Washington, Tacoma, WA

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/gix078 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100807 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100809

    1. Now published in GigaScience doi: 10.1093/gigascience/gix083 Michael P. Pound [1] The School of Computer Science, University of Nottingham, UK; Alexandra J. Burgess [2] The School of Biosciences, University of Nottingham, UK; Michael H. Wilson [3] Centre for Plant Sciences, Faculty of Biological Sciences, University of Leeds, Leeds, UK; Jonathan A. Atkinson [2] The School of Biosciences, University of Nottingham, UK; Marcus Griffiths [2] The School of Biosciences, University of Nottingham, UK; Aaron S. Jackson [1] The School of Computer Science, University of Nottingham, UK; Adrian Bulat [1] The School of Computer Science, University of Nottingham, UK; Yorgos Tzimiropoulos [1] The School of Computer Science, University of Nottingham, UK; Darren M. Wells [2] The School of Biosciences, University of Nottingham, UK; Erik H. Murchie [2] The School of Biosciences, University of Nottingham, UK; Tony P. Pridmore [1] The School of Computer Science, University of Nottingham, UK; Andrew P. French [1] The School of Computer Science, University of Nottingham, UK [2] The School of Biosciences, University of Nottingham, UK. For correspondence: andrew.p.french@nottingham.ac.uk

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/gix083 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100812 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100813

    1. Now published in GigaScience doi: 10.1093/gigascience/gix054

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/gix054 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100763 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100764

    1. Now published in GigaScience doi: 10.1093/gigascience/gix061

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/gix061 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100765 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100766 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.100768

    1. Now published in GigaScience doi: 10.1093/gigascience/gix040 Peter R. Sternes [a] The Australian Wine Research Institute, PO Box 197, Glen Osmond, South Australia, 5064; Danna Lee [a] The Australian Wine Research Institute, PO Box 197, Glen Osmond, South Australia, 5064; Dariusz R. Kutyna [a] The Australian Wine Research Institute, PO Box 197, Glen Osmond, South Australia, 5064; Anthony R. Borneman [a] The Australian Wine Research Institute, PO Box 197, Glen Osmond, South Australia, 5064 [b] Department of Genetics and Evolution, University of Adelaide, South Australia, Australia, 5000. For correspondence: anthony.borneman@awri.com.au

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/gix040 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100725 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100726 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.100727

    1. Abstract

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/gix043 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100730 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100732

    1. Now published in GigaScience doi: 10.1093/gigascience/gix045

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/gix045 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100738 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100737

    1. Now published in GigaScience doi: 10.1093/gigascience/gix042

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/gix042 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100741

    1. Now published in GigaScience doi: 10.1093/gigascience/gix048

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/gix048 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100749 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100750

    1. Now published in GigaScience doi: 10.1093/gigascience/gix032

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/gix032 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100681 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100682 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.100719

    1. Now published in GigaScience doi: 10.1093/gigascience/gix015

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/gix015 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100648 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100649

    1. Now published in GigaScience doi: 10.1093/gigascience/gix010

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/gix010 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100575 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100577

    1. A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giw018 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100546 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100547 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.100544

    1. Background

      Reviewer 2. Simon Eskildsen

      This data note describes a freely available repository of skull-stripped T1 weighted MRI data from adult individuals aged 21-45. In total 125 images and masks are available and may enable researchers to improve skull-stripping and validate algorithms. Sharing data this way is truly the way forward. The paper is well-written and concise. The authors demonstrate the benefit of the data by comparing to commonly used skull-stripping methods. I have only very minor issues/questions: Out of 125 subjects 66 had a psychiatric diagnosis (past or present). This does not seem to be a sample of the general population. Perhaps the authors could explain why more than half of the subjects had a diagnosis? Also, if you scan 125 random subjects, you're bound to find some brain abnormalities. Perhaps 125 was not the initial sample size? Was there a selection process? In the validation it should be mentioned that beast-library-1.1 consists of only 10 MRIs from young individuals.

      Missing references at page 2 lines 33 and 34, page 4 line 12, and page 5 line 62. Page 4 line 17: should this be "Figure 1"? BET is "Brain Extraction Tool" (not "Technique"). Perhaps LPI orientation should be explained?

    2. Abstract

      Reviewer 1. Xin Di

      In the manuscript titled "The Preprocessed Connectomes Project Repository of Manually Corrected Skull-stripped T1-weighted Anatomical MRI Data", the authors present a repository of manually corrected skull-stripped T1 MRI images and demonstrate the use of these data to test different skull-stripping methods. I think the library of manually corrected images is a valuable resource, because manual correction is generally considered the "gold standard" against which other automated methods can be tested. The manuscript is also well written. I have some minor comments on this manuscript:

      1. In the abstract, it was said that "This procedure is necessary for calculating brain volume and for improving the quality of other image processing steps." I don't think that skull-stripping is really a necessary step for image preprocessing. I use SPM, and I don't typically do skull-stripping unless coregistration of functional and anatomical images has failed. I do agree that skull-stripping could help to prevent mis-coregistrations, and is particularly helpful for preprocessing of large-scale datasets. But it is hard to say it is necessary. In addition, if non-brain tissues such as bone and fat have been modeled in the segmentation step (e.g. SPM segmentation includes six tissue types), is it necessary to perform a separate skull-stripping step before segmentation?
      2. I have downloaded the NFBS skull-stripped repository. As described in the manuscript, it contains each single subject's raw T1 image, skull-stripped image, and brain mask. Were any probability maps generated as the "NFBS BEaST library" that was used for BEaST skull-stripping? If so, can the authors also make the probability maps available?
      3. On several occasions, references were missing and marked as [?]: lines 33 and 34, page 2 of 9, "Neurofeedback Study (NFB) [?]"; lines 34 and 35, page 2 of 9, "a deep phenotypic assessment on the first and second visits [?]"; line 12, page 4 of 9, "FreeSurfer software package [?]"; line 62, page 5 of 9, "plots using the ggplot2 package [?]".
    3. Now published in GigaScience doi: 10.1186/s13742-016-0150-5 Benjamin Puccio [1] Computational Neuroimaging Lab, Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute for Psychiatric Research, 140 Old Orangeburg Rd, 10962, Orangeburg, NY, USA; James P Pooley [2] Center for the Developing Brain, Child Mind Institute, 445 Park Ave, 10022, New York, NY, USA; John S Pellman [1] Computational Neuroimaging Lab, Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute for Psychiatric Research, 140 Old Orangeburg Rd, 10962, Orangeburg, NY, USA; Elise C Taverna [1] Computational Neuroimaging Lab, Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute for Psychiatric Research, 140 Old Orangeburg Rd, 10962, Orangeburg, NY, USA; R Cameron Craddock [1] Computational Neuroimaging Lab, Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute for Psychiatric Research, 140 Old Orangeburg Rd, 10962, Orangeburg, NY, USA [2] Center for the Developing Brain, Child Mind Institute, 445 Park Ave, 10022, New York, NY, USA. For correspondence: ccraddock@nki.rfmh.org

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1186/s13742-016-0150-5), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

    1. Now published in GigaScience doi: 10.1186/s13742-016-0152-3

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1186/s13742-016-0152-3 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100511 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100508 Reviewer 3: http://dx.doi.org/10.5524/REVIEW.100507

    1. Now published in GigaScience doi: 10.1186/s13742-016-0149-y Víctor Resco de Dios [1] Department of Crop and Forest Sciences-AGROTECNIO Center, Universitat de Lleida, 25198 Lleida, Spain [2] Hawkesbury Institute for the Environment, University of Western Sydney, Richmond, NSW 2753, Australia; Arthur Gessler [3] Swiss Federal Institute for Forest, Snow and Landscape Research WSL Long-term Forest Ecosystem Research (LWF), 8903 Birmensdorf, Switzerland [4] Institute for Landscape Biogeochemistry, Leibniz-Centre for Agricultural Landscape Research (ZALF), 15374 Müncheberg, Germany; Juan Pedro Ferrio [1] Department of Crop and Forest Sciences-AGROTECNIO Center, Universitat de Lleida, 25198 Lleida, Spain; Josu G Alday [1] Department of Crop and Forest Sciences-AGROTECNIO Center, Universitat de Lleida, 25198 Lleida, Spain [5] School of Environmental Sciences, University of Liverpool, Liverpool, L69 3GP, UK; Michael Bahn [6] Institute of Ecology, University of Innsbruck, 6020 Innsbruck, Austria; Jorge del Castillo [1] Department of Crop and Forest Sciences-AGROTECNIO Center, Universitat de Lleida, 25198 Lleida, Spain; Sébastien Devidal [7] Ecotron Européen de Montpellier, UPS 3248, CNRS, Campus Baillarguet, 34980, Montferrier-sur-Lez, France; Sonia García-Muñoz [8] IMIDRA, Finca “El Encín”, 28800 Alcalá de Henares, Madrid, Spain; Zachary Kayler [4] Institute for Landscape Biogeochemistry, Leibniz-Centre for Agricultural Landscape Research (ZALF), 15374 Müncheberg, Germany; Damien Landais [7] Ecotron Européen de Montpellier, UPS 3248, CNRS, Campus Baillarguet, 34980, Montferrier-sur-Lez, France; Paula Martín [1] Department of Crop and Forest Sciences-AGROTECNIO Center, Universitat de Lleida, 25198 Lleida, Spain; Alexandru Milcu [7] Ecotron Européen de Montpellier, UPS 3248, CNRS, Campus Baillarguet, 34980, Montferrier-sur-Lez, France [9] CNRS, Centre d’Ecologie Fonctionnelle et Evolutive (CEFE UMR 5175), 1919 route de Mende, F-34293 Montpellier, France; Clément Piel [7] Ecotron Européen de Montpellier, UPS 3248, CNRS, Campus Baillarguet, 34980, Montferrier-sur-Lez, France; Karin Pirhofer-Walzl [4] Institute for Landscape Biogeochemistry, Leibniz-Centre for Agricultural Landscape Research (ZALF), 15374 Müncheberg, Germany; Olivier Ravel [7] Ecotron Européen de Montpellier, UPS 3248, CNRS, Campus Baillarguet, 34980, Montferrier-sur-Lez, France; Serajis Salekin [10] Erasmus Mundus Master on Mediterranean Forestry and Natural Resources Management, Universitat de Lleida, 25198 Lleida, Spain; David T Tissue [2] Hawkesbury Institute for the Environment, University of Western Sydney, Richmond, NSW 2753, Australia; Mark G Tjoelker [2] Hawkesbury Institute for the Environment, University of Western Sydney, Richmond, NSW 2753, Australia; Jordi Voltas [1] Department of Crop and Forest Sciences-AGROTECNIO Center, Universitat de Lleida, 25198 Lleida, Spain; Jacques Roy [7] Ecotron Européen de Montpellier, UPS 3248, CNRS, Campus Baillarguet, 34980, Montferrier-sur-Lez, France. For correspondence: v.rescodedios@gmail.com

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1186/s13742-016-0149-y ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100506 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100505

    1. nucleotide

      Reviewer 3. Takeru Nakazato

      I have reviewed this manuscript with integrity, but I'm a little confused about it because I usually use NCBI PubMed/GenBank data. If my points are off the mark, please point them out.

      1. In NCBI PubMed, the nucleotide sequence entries referenced in the article are listed in PubMed data as external DB links (although not perfect), and by extracting these, the relationship between the PubMed and Nucleotide entries can be extracted. The NCBI website also provides these links from Nucleotide in the Related information section (e.g. https://pubmed.ncbi.nlm.nih.gov/19193256/). I found that the ePMC website also has a link in the Data section for nucleotide sequence entries referenced in the paper (e.g., https://europepmc.org/article/MED/19193256). Do you use any of these external links in ePMC data in this work? I think it is very difficult to extract nucleotide IDs by text mining, especially since Nucleotide sequence IDs are not in a fixed format. I think these links will be a great help in doing text mining.
      2. In NCBI PubMed, MeSH keywords are assigned to each article for indexing the literature. MeSH keywords also include country keywords (e.g. https://pubmed.ncbi.nlm.nih.gov/19193256/). Is it possible to use keywords like MeSH in ePMC? Do you have any opinions about using such country keywords?
      3. I found some great statistics and visualizations of this data on the site the authors provide. I would be happy to see these shown in the manuscript as a result of this work, but please follow the journal's policies and precedents.
      4. Do the authors think that users should reuse the created data for this product? Or is it recommended that users create their own data using the creation program? If the former, what is your plan for the frequency of updating the data?
      5. In Figure 1, I felt that it would be easier for the reader to understand if the authors emphasized (by changing the line or fill of the box) whether the data in each step is Nucleotide data, literature data, or ID pairs extracted from those data.
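
      As a rough illustration of why point 1 calls text mining of nucleotide IDs difficult, the sketch below searches free text with a regular expression. It is not the authors' pipeline. The three patterns are assumptions that cover only a few common accession shapes (one letter plus five digits, two letters plus six digits, and RefSeq-style prefixes with an underscore), and `find_accession_candidates` is a hypothetical helper name.

```python
import re

# Assumed, deliberately incomplete patterns for a few common accession shapes.
ACCESSION_RE = re.compile(
    r"\b(?:[A-Z]{2}_\d{6,9}(?:\.\d+)?"   # RefSeq-style, e.g. NM_000546.6
    r"|[A-Z]{2}\d{6}(?:\.\d+)?"          # two letters + six digits
    r"|[A-Z]\d{5}(?:\.\d+)?)\b"          # one letter + five digits
)


def find_accession_candidates(text):
    """Return the sorted, de-duplicated accession-like strings found in text."""
    return sorted(set(ACCESSION_RE.findall(text)))


sample = "Sequences were deposited under AB123456 and U49845.1; see also NM_000546.6."
candidates = find_accession_candidates(sample)
```

      Any string with the same shape, such as a catalogue or grant number, would also match, which is why curated article-to-sequence links of the kind mentioned in point 1 would help validate text-mined pairs.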
    2. Linking

      Reviewer 2. Michael Fire.

      The idea of curating this dataset is both important, and can contribute to the scientific community. Additionally, in most parts, the paper is well written. However, the manuscript has some major issue that needs to solve before it would be ready for publication. The Good:

      • The dataset presented in the paper can be very useful to the scientific community
      • The authors invested many efforts in making the paper reproducible. Both the project's code and dataset are open
      • The project has a friendly and helpful web interface.

      Things that need to improve: Major Issues:

      • Although this paper is not a standard research paper, the article is missing context relative to other works. I believe the context of the manuscript will be more explicit with a Related Work section that provides an overview of other papers that generated similar datasets.
      • Most of the analysis is based on the PubMed datasets, which is a relatively small dataset. There are other open datasets that I think are important to use to get a fuller picture, such as Microsoft Academic, AMiner, Semantic Scholar, bioRxiv, and arXiv. I understand that performing a full-text search on these datasets can be challenging. However, the paper's results need to be validated using some of these datasets.
      • The manuscript's quality needs to be improved (text, figures' resolutions, etc.).

      Minor Issues:
      • In my opinion, the overall structure of the paper can be improved.
      • There is no need to explain the FAIR data principle
      • Using Microsoft Academic dataset can assist in mapping between author to a unique id
      • Mapping between an institute or location to a country can be more accurately done by utilizing geolocation code packages, such as geopy
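
      The geolocation suggestion above could be sketched roughly as follows. This is a minimal illustration, not the authors' method: the function `country_for` and the stand-in `fake_geocode` table are hypothetical, and the geocoder is passed in as a plain callable so the logic can be read and run without network access. The comment at the end shows how geopy's `Nominatim` geocoder might be plugged in, assuming geopy is installed and the service's usage policy is respected.

```python
from typing import Callable, Optional


def country_for(affiliation: str,
                geocode: Callable[[str], Optional[dict]]) -> Optional[str]:
    """Return a country name for an affiliation string, or None if unresolved.

    `geocode` takes a query string and returns an address dict that may
    contain a "country" key (hypothetical interface for this sketch).
    """
    parts = [p.strip() for p in affiliation.split(",") if p.strip()]
    # Try the full string first, then drop leading parts (department,
    # institute) until the geocoder recognises the remaining location.
    for start in range(len(parts)):
        address = geocode(", ".join(parts[start:]))
        if address and address.get("country"):
            return address["country"]
    return None


def fake_geocode(query: str) -> Optional[dict]:
    """Offline stand-in for a real geocoder (illustrative data only)."""
    table = {"Davis, CA, USA": {"country": "United States"}}
    return table.get(query)


# With geopy, the callable could be built like this (requires network access):
#   from geopy.geocoders import Nominatim
#   geolocator = Nominatim(user_agent="affiliation-country-demo")
#   def nominatim_geocode(query):
#       location = geolocator.geocode(query, addressdetails=True, language="en")
#       return location.raw.get("address") if location else None
```

      Retrying with progressively shorter queries is a design choice of this sketch: full affiliation strings often contain department names that a gazetteer does not know, while the trailing city and country usually resolve.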

      Re-review After reading the submitted "policy paper," the goal and contributions of this study and dataset became clear. I believe this dataset and data visualization interface can be beneficial for the academic community. I think the paper will be ready for publication after fixing the following minor issues:

      • It is very challenging to understand Figure 1. I recommend adding figures that explain in more detail how each part of the system works.
      • Even though the quality of the figures was improved, they are still of low quality, and it is hard to read the figures, especially Figure 4.
      • The paper needs to be carefully proofread for punctuation mistakes.
    3. Background

      Reviewer 1. Gianmaria Silvello.

      Figure 1 is not readable. The sampling process lowered the quality of the image and made the text not readable. Please, use vectorial images (e.g., PDF or EPS). Anyhow, I could understand the process from the descriptive text. Figure 2 is readable, but the quality is relatively low. Nevertheless, I do not think this figure is instrumental; it is a simple logical schema of a relational database. Uploading the SQL dump or the SQL schema in an external repository and reference it in the paper would be enough. The sentence "we imported an ORACLE SQL data warehouse that employs state-of-the-art database technologies" is not very clear. What do you mean by "imported a data warehouse"? Could you provide more details about the DBMS you used? To my understanding, you designed a relational model. You then implemented it in SQL using an Oracle DBMS (MySQL? or the native Oracle DBMS?) to store and query the data. Check page 9 description and add some details to avoid confusion. This is not a key passage though, I am sure that you handled the data somehow, and the paper's focus is not on this. "reference integrity between the tables was checked" -> This is a "weird" statement. Reference integrity is a constraint to guarantee the consistency of data. You "check the integrity" when you store the data in the DB, and if it is not validated, the data cannot be stored in the DB. So, I do not understand this sentence that is not explained anymore. Indeed, the paragraph continues by talking about the SQL queries to count the paper identifiers (this is not directly linked to reference integrity, or at least you should explain what you mean). Recent analysis about issues related to ORCID ids and duplication of ids can be found here: http://ceurws.org/Vol-2816/paper10.pdf Table 1 is not that useful; it can be described in the text that you did the experiment and verified the discrepancies between open access publications and paywalled papers. 
It is a well-known problem, and it is not analyzed in depth here. I think you can get rid of it without affecting the quality of the paper. Figure 4, like all the other images, is not readable. I directly accessed the Webapp, which works fine. The paper is well-written, and the data collection is fine. Nevertheless, the article is a bit anticlimactic because it provides few insights. You discuss what we can do with the data, but there is little analysis of the data themselves. We could use some more in-depth analysis and a few insights about the outcomes achievable with the collected data. Also, more about the best practices that should be defined in the field would be a nice addition.
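
      The distinction the reviewer draws about referential integrity can be illustrated with a small sketch. It uses Python's built-in `sqlite3` module as a stand-in for the Oracle database mentioned in the paper, and the table and column names (`paper`, `paper_sequence`, `paper_id`, `accession`) are hypothetical. With a declared foreign key, the database rejects an orphan row at insert time; without enforcement, a post-hoc query is needed to find orphans, which may be what "checking" integrity after an import means.

```python
import sqlite3


def build_db(enforce: bool) -> sqlite3.Connection:
    """Create an in-memory database with a paper -> sequence link table."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = " + ("ON" if enforce else "OFF"))
    conn.execute("CREATE TABLE paper (paper_id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE paper_sequence ("
        " paper_id TEXT NOT NULL REFERENCES paper(paper_id),"
        " accession TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO paper VALUES ('PMC1')")
    conn.execute("INSERT INTO paper_sequence VALUES ('PMC1', 'AB000001')")
    return conn


def try_orphan_insert(conn: sqlite3.Connection) -> bool:
    """Insert a link to a non-existent paper; return True if it was accepted."""
    try:
        conn.execute("INSERT INTO paper_sequence VALUES ('PMC999', 'AB000002')")
        return True
    except sqlite3.IntegrityError:
        return False


def count_orphans(conn: sqlite3.Connection) -> int:
    """Post-hoc check: count link rows whose paper_id has no matching paper."""
    row = conn.execute(
        "SELECT COUNT(*) FROM paper_sequence s"
        " LEFT JOIN paper p ON p.paper_id = s.paper_id"
        " WHERE p.paper_id IS NULL"
    ).fetchone()
    return row[0]
```

      In the enforced case the orphan insert fails and the count stays at zero; in the unenforced case the insert succeeds and only the `LEFT JOIN` query reveals the inconsistency.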

      Re-review:

      The authors comprehensively answered this reviewer's comments. The quality of the paper is improved and the modifications are in line with what was expected. I have no further observations.

    4. Abstract

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giab084), where the paper and peer reviews are published openly under a CC-BY 4.0 license. It also has a companion paper published alongside it here: https://doi.org/10.1093/gigascience/giab085

    1. Now published in GigaScience doi: 10.1186/s13742-016-0146-1 Á.G. Muñoz (1, 2, 3), M. C. Thomson (2, 4, 5), L. Goddard (2), S. Aldighieri (6). Affiliations: 1 Atmospheric and Oceanic Sciences (AOS)/Geophysical Fluid Dynamics Laboratory (GFDL), Princeton University, NJ, USA; 2 International Research Institute for Climate and Society (IRI), Earth Institute, Columbia University, NY, USA; 3 Latin American Observatory for Climate Events, Centro de Modelado Científico (CMC), Universidad del Zulia, Venezuela; 4 Mailman School of Public Health, Department of Environmental Health Sciences, Columbia University, NY, USA; 5 WHO Collaborating Centre (US 306) on Early Warning Systems for Malaria and other Climate Sensitive Diseases, USA; 6 International Health Regulations / Epidemic Alert and Response, and Water Borne Diseases (IR), Communicable Diseases and Health Analysis Department (CHA), PAHO, DC, USA.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1186/s13742-016-0146-1 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100498

    1. Now published in GigaScience doi: 10.1186/s13742-016-0137-2 Minh Duc Cao (1), Devika Ganesamoorthy (1), Alysha G. Elliott (1), Huihui Zhang (1), Matthew A. Cooper (1). Affiliation: 1 Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Road, St Lucia, QLD 4072 Brisbane, Australia.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1186/s13742-016-0137-2 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100473 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100471

    1. Now published in GigaScience doi: 10.1186/s13742-016-0136-3 Simo V. Zhang (1; for correspondence: simozhan@indiana.edu), Luting Zhuo (1), Matthew W. Hahn (1, 2). Affiliations: 1 School of Informatics and Computing, Indiana University, Bloomington, Indiana 47405; 2 Department of Biology, Indiana University, Bloomington, Indiana 47405.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1186/s13742-016-0136-3 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100463 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100462

    1. Nations Convention

      Reviewer 2. Michael Fire. The paper is interesting and easy to follow, and the interactive charts are useful and make it easy to interact with the data. Moreover, the implications and contributions of the study are clear. However, the paper needs to be carefully edited and proofread to make it ready for publication.

      Minor Issues:

      • The paper needs to be carefully proofread and edited
      • The layout of Figure 1 isn't good. I recommend using subplots with subtitles and removing countries with a very low percentage. In addition, the caption of this figure needs to be improved.
      • Figure 2: Besides the countries colored pink, it is hard to distinguish between the values of different countries. Although this figure emphasizes that the ratio is mostly even, I think it would be better to use a different color scheme, or to normalize the values, so that the differences among countries are more noticeable.
      • The paper's layout has unused free space on pages 5 and 6.
      • The quality of the figures throughout the paper needs to be improved.

      Typos (some examples):
      • Abstract: CBD) -> CBD,
      • P. 2: "for user checks [4]. (Countires"
      • P. 3: originated. (Note
      • P. 3: "use" -> "use."
      • Figure 2: graph 3.4 -> Graph 3.4 (why call this Graph?)
      • P. 8 "DSI (/country "
      • P. 8: figure 3 -> Figure 3
      • Figure captions are with different formats.
      • The Figure 4 caption is misplaced
      • P. 12: Figure 2 and 5 -> Figures 2 and 5
    2. The United

      Reviewer 1. Takeru Nakazato

      The authors extract country names from gene sequence data in the "traditional" ENA database and compare them with the country names of submitters, to provide data-driven evidence on the myth about provider-user relationships for digital sequence information. I also implicitly believe in that myth. This verification has important implications for the future handling of DSI. There is much room for debate in this manuscript, but I think it is important to put it before the world as soon as possible. I make some points about this manuscript.

      1. The Nagoya Protocol has prompted a major shift in the use of genetic resources. I think the authors should also discuss the year of registration of digital sequence information (specifically, before and after the Nagoya Protocol). (It is desirable to provide actual data, but this time I will only seek the views of the authors.)

      2. The authors focus only on ENA's "traditional" gene sequence information (data corresponding to NCBI's GenBank). Regarding digital sequence information, NGS is currently being actively used, and data is also accumulating in SRA (Sequence Read Archive) at a tremendous pace. In addition, recent MinION devices can acquire digital sequence information on the spot without taking genetic resources out of the country. The lack of this data is a flaw in this manuscript, but I think the authors can at least discuss it in the manuscript.

      3. In recent years, there has been a movement called "museomics" that extracts DNA from museum specimens and obtains digital sequence information. Museomics contributes to ancient DNA research and taxonomic clarification. ENA/GenBank/DDBJ also has a "specimen_voucher" field, and hundreds of thousands of digital sequence records already carry this data (https://doi.org/10.3897/biss.5.73787). Does this have any effect on this study? Should DSI from museum specimens be excluded from the statistical processing in this study? (The authors don't have to mention this in the manuscript.)

    3. Abstract

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giab085), where the paper and peer reviews are published openly under a CC-BY 4.0 license. It also has a companion paper published alongside it here: https://doi.org/10.1093/gigascience/giab084

    1. Now published in GigaScience doi: 10.1186/s13742-016-0135-4 Stephen R. Piccolo (1; for correspondence: stephen_piccolo@byu.edu), Michael B. Frampton (2). Affiliations: 1 Department of Biology, Brigham Young University, Provo, UT, USA; 2 Department of Computer Science, Brigham Young University, Provo, UT, USA.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1186/s13742-016-0135-4 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100623 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100622

    1. Now published in GigaScience doi: 10.1186/s13742-016-0111-z

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1186/s13742-016-0111-z ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100374 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100375

    1. Findings

      Reviewer 3. Tom Madden; Minor Essential Revisions: The descriptions for TBLASTN and TBLASTX need to be swapped in Table 1 (rows 4 and 5).

      Discretionary Revisions: This is a well-structured and informative article about supporting sequence similarity searching (BLAST) in Galaxy. The Results section contains a number of example applications such as "Assessing a de novo assembly". This section does have references to some of the tools that would be useful in this task (e.g., seq_filter_by_id), but I find these examples somewhat abstract. Additionally, a reader unfamiliar with Galaxy might not be convinced of the advantages of using Galaxy for these tasks as opposed to simply running the searches themselves. I'd suggest that the authors add a concrete example of one of their use-cases.

      For example, the authors could show how their tools could find the globin cluster for some mammal. This should include all accessions and other information so that a reader could reproduce the result. This example could be added as supplementary material if need be.

    2. Background

      Reviewer 2. Gianmauro Cuccuru

      The authors present their effort to integrate the command line NCBI BLAST+ tool suite into the Galaxy platform, providing a full set of wrappers, BLAST related tools and datatype definitions.

      The manuscript is clearly written and includes several useful use-cases and workflows combining the tools within Galaxy. The tools are not available through a public server for testing, but the authors crafted an excellent virtual machine providing a complete Galaxy server with the BLAST+ tools preinstalled. In my opinion the work represents a valuable addition to the software resources available to the Galaxy community, hence my recommendation to its publication.

    3. Abstract

      Reviewer 1. Stian Soiland-Reyes

      This review is also available at Description The article describes a mechanism to add the BLAST+ functionality to the Galaxy workflow system. This is a very useful feature, and so in principle I would want to see this article published. I do however have some concerns with the aspects of reproducibility and documentation, which are detailed below.

      Major Compulsory Revisions

      I am afraid I will have to ask for major compulsory revisions as I was unable to reproduce any of the claims of the paper.

      1: Docker image is not BLAST enabled

      p5.

      "the command docker ... start a BLAST enabled Galaxy instance"

      I tried the Docker image. It starts up fine, and presents a Galaxy that includes a list of BLAST tools, so the BLAST tools have been installed. The Docker instance is however not BLAST enabled, as the BLAST tools require further configuration/download of the external BLAST reference database to align against. This procedure is loosely documented at https://registry.hub.docker.com/u/bgruening/galaxy-blast/ - but I was unable to follow through with this installation as it was quite complicated and seems to require manual downloading and configuration of many GB of reference data spread over more than 300 files. I was assuming that a Docker image would be 'usable out of the box' - but this is far from the truth in this case. Accessing "NCBI BLAST+ database info" gives an empty dropdown list.

      The article mentions that the public Galaxy instance usegalaxy.com does not provide the BLAST tools by default due to concerns over computational load - but I also wonder whether it could be because configuring the BLAST+ tools is quite a complicated job. The article does not mention at all the excessive amount of system administration that is required in order to finalize the BLAST installation, and the Docker image does not provide any helper scripts to assist with this. In fact, the example database configuration files use a totally different path, e.g. /depot/data2/galaxy/blastdb/nt/nt.chunk - while the Docker image would require these under /data/nt/nt.chunk. Neither the article nor the Docker README mentions which subset of the databases would commonly need to be downloaded - or even the fact that all of the numbered fragments need to be downloaded. The datasets referenced from the example configuration, e.g. nt.chunk and wgs.chunk, do not exist on ftp://ftp.ncbi.nlm.nih.gov/blast/db/ - only non-chunk versions exist.
I tried to download a subset of the datasets from ftp://ftp.ncbi.nlm.nih.gov/blast/db/ stain@biggie-utopic:/galaxy_store/data/blast_databases$ ls human_genomic.00.nhd nt.00.nhd refseq_genomic.148.nhr refseq_protein.00.pin refseq_protein.15.pnd wgs.00.nhi human_genomic.00.nhi nt.00.nhi refseq_genomic.148.nin refseq_protein.00.pnd refseq_protein.15.pni wgs.00.nhr human_genomic.00.nhr nt.00.nhr refseq_genomic.148.nnd refseq_protein.00.pni refseq_protein.15.pog wgs.00.nin human_genomic.00.nin nt.00.nin refseq_genomic.148.nni refseq_protein.00.pog refseq_protein.15.ppd wgs.00.nnd human_genomic.00.nnd nt.00.nnd refseq_genomic.148.nog refseq_protein.00.ppd refseq_protein.15.ppi wgs.00.nni human_genomic.00.nni nt.00.nni refseq_genomic.148.nsd refseq_protein.00.ppi refseq_protein.15.psd wgs.00.nog human_genomic.00.nog nt.00.nog refseq_genomic.148.nsi refseq_protein.00.psd refseq_protein.15.psi wgs.00.nsd human_genomic.00.nsd nt.00.nsd refseq_genomic.148.nsq refseq_protein.00.psi refseq_protein.15.psq wgs.00.nsi human_genomic.00.nsi nt.00.nsi refseq_genomic.148.tar.gz refseq_protein.00.psq refseq_protein.15.tar.gz wgs.00.nsq human_genomic.00.nsq nt.00.nsq refseq_genomic.nal refseq_protein.15.phr refseq_protein.pal wgs.nal human_genomic.nal nt.nal refseq_protein.00.phr refseq_protein.15.pin wgs.00.nhd and configured these in blastdb.loc according to the Docker readme. The readme says: you need to add the paths to your blast databases and they need to look like /export/swissprot/swissprot but I have followed the instructions three lines above which mounted the datasets at /data - hence I used /data/ instead of /export. Some consistency would help here. 
      stain@biggie-utopic:/galaxy_store/data/blast_databases$ grep -v ^# /tmp/galaxy/galaxy-central/tool-data/blastdb*loc
      /tmp/galaxy/galaxy-central/tool-data/blastdb.loc:nt_02_Dec_2009 nt 02 Dec 2009 /data/nt
      /tmp/galaxy/galaxy-central/tool-data/blastdb.loc:wgs_30_Nov_2009 wgs 30 Nov 2009 /data/wgs/wgs
      /tmp/galaxy/galaxy-central/tool-data/blastdb.loc:refseq_genomic_148 refseq 148 /data/refseq_genomic
      /tmp/galaxy/galaxy-central/tool-data/blastdb.loc:
      /tmp/galaxy/galaxy-central/tool-data/blastdb.loc:
      /tmp/galaxy/galaxy-central/tool-data/blastdb_p.loc:nt_02_Dec_2009 nt 02 Dec 2009 /data/nt
      /tmp/galaxy/galaxy-central/tool-data/blastdb_p.loc:wgs_30_Nov_2009 wgs 30 Nov 2009 /data/wgs/wgs
      /tmp/galaxy/galaxy-central/tool-data/blastdb_p.loc:refseq_protein refseq protein /data/refseq_genomic
      /tmp/galaxy/galaxy-central/tool-data/blastdb_p.loc:
      /tmp/galaxy/galaxy-central/tool-data/blastdb_p.loc:

      A BLAST Data Manager is available at https://github.com/peterjc/galaxy_blast/ - in theory this can download and populate the blastdb data table. This is mentioned as "Future work" in the article, so presumably it is not yet production ready. This data manager does not appear under Data Libraries in the Docker image, and it is not included in the installation at https://registry.hub.docker.com/u/bgruening/galaxy-blast/dockerfile/
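
      A helper script of the kind the reviewer finds missing might look like the sketch below. It is an illustration under stated assumptions, not part of the Galaxy BLAST tools: it assumes that each non-comment line of a blastdb.loc file holds three tab-separated fields (identifier, description, path prefix), as in the example entries quoted in this review, and it uses the presence of an alias or index file (.nal, .nin, .pal, .pin, extensions that appear in the directory listing above) as a rough heuristic for "a database exists at this prefix". The function names are hypothetical.

```python
import os
from typing import Callable, List, Tuple

# File extensions taken as evidence that a BLAST database exists at a prefix
# (heuristic for this sketch: nucleotide/protein alias and index files).
DB_MARKER_EXTENSIONS = (".nal", ".nin", ".pal", ".pin")


def parse_loc(text: str) -> List[Tuple[str, str, str]]:
    """Parse .loc content into (identifier, description, path prefix) tuples."""
    entries = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # malformed line: not enough tab-separated columns
        entries.append((fields[0], fields[1], fields[2]))
    return entries


def missing_databases(
    text: str, exists: Callable[[str], bool] = os.path.exists
) -> List[Tuple[str, str]]:
    """Return (identifier, prefix) for entries with no database files on disk."""
    missing = []
    for db_id, _description, prefix in parse_loc(text):
        if not any(exists(prefix + ext) for ext in DB_MARKER_EXTENSIONS):
            missing.append((db_id, prefix))
    return missing
```

      Passing `exists` as a parameter keeps the check testable without touching the file system; in real use the default `os.path.exists` would be applied to the mounted database directory.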

      2: Galaxy Tool Shed not working with the docker image

      The article says: "The recently published Galaxy Tool Shed [9] allows anyone hosting a Galaxy instance to install tools and defined dependencies with a few clicks right from the Galaxy web application itself." I am unable to verify this claim using the provided Docker image. I am unable to install any tools from the Galaxy Tool Shed from the web interface of the Docker image. I am logged in as admin@galaxy.org according to the instructions, but if I go to Admin -> Search and browse tool sheds (http://localhost:8080/admin_toolshed/browse_tool_sheds) and click the dropdown list for "Browse valid sheds", this hangs for a while before failing with "Can't find the server". On the console I get many error messages like:

      URLError: <urlopen error [Errno -2] Name or service not known>
      tool_shed.util.shed_util_common ERROR 2015-01-19 10:22:08,698 Error attempting to get tool shed status for installed repository ncbi_blast_plus: <urlopen error [Errno -2] Name or service not known>
      Traceback (most recent call last):
        File "lib/tool_shed/util/shed_util_common.py", line 772, in get_tool_shed_status_for_installed_repository
          encoded_tool_shed_status_dict = common_util.tool_shed_get( app, tool_shed_url, url )
        File "lib/tool_shed/util/common_util.py", line 345, in tool_shed_get
          response = urlopener.open( uri )
        File "/usr/lib/python2.7/urllib2.py", line 404, in open
          response = self._open(req, data)
        File "/usr/lib/python2.7/urllib2.py", line 422, in _open
          '_open', req)
        File "/usr/lib/python2.7/urllib2.py", line 382, in _call_chain
          result = func(*args)
        File "/usr/lib/python2.7/urllib2.py", line 1214, in http_open
          return self.do_open(httplib.HTTPConnection, req)
        File "/usr/lib/python2.7/urllib2.py", line 1184, in do_open
          raise URLError(err)

      Inspecting the internal frame I see the link is http://toolshed.g2.bx.psu.edu/repository/browse_valid_categories?galaxy_url=http://localhost:8080 I somehow feel that toolshed.g2.bx.psu.edu will try to connect to my Galaxy instance at http://localhost:8080 - which is not going to work. The actual toolshed site is unavailable (http://www.downforeveryoneorjustme.com/toolshed.g2.bx.psu.edu reports: "It's not just you! http://toolshed.g2.bx.psu.edu looks down from here."), so this might be a temporary network problem that is not related to the "localhost" bit. I am nevertheless unable to verify the claim of the ease of using the Tool Shed to install the BLAST+ tool because of this. If it is true that the Docker image does not work with the Galaxy Tool Shed, which now hosts most of the tools required in a Galaxy installation, then this should be duly noted in the article and the README of the Docker image.

      3: Supporting data has no usage instructions

      The article links to https://github.com/peterjc/galaxy_blast as the supporting data - but this website has no instructions on how to install/use it with Galaxy or the Galaxy Docker image. I could execute .travis.yml "by hand" - but I do not feel this is sufficient documentation for a supporting data set. I have therefore not been able to verify that the supporting data actually supports the article, beyond inspecting the Travis-CI build logs at https://travis-ci.org/peterjc/galaxy_blast/builds, which except for a single error seem to be verifying the tools. The error is at https://travis-ci.org/peterjc/galaxy_blast/builds/45137901: OperationalError: (OperationalError) unable to open database file None None

      Minor Essential Revisions

      4: Provenance and update issue not addressed

      Workflow systems are commonly praised in bioinformatics because they enable reproducibility and sharing of analytical pipelines. One challenge in this respect is that software tools and reference datasets in bioinformatics are commonly updated. In fact, BLAST+ 2.2.30 was released just 6 weeks ago [ ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/ ] and the latest BLAST reference dataset taxdb.tar.gz was updated today [ ftp://ftp.ncbi.nlm.nih.gov/blast/db/ ]. The blast FTP site does not seem to contain any version number of the dataset, and as datasets are split over multiple files, it would be difficult to know if you have downloaded half an old dataset and half a new dataset. (A new version of the dataset could be released in the middle of your lengthy download). The sanity of the dataset could potentially be verified by downloading the *.md5 files both before and after the large download -- but this should be automated by a script to be done correctly. I would say the main challenges in this respect, assuming a successful workflow run using the described Galaxy BLAST tools, are:

      a) Which version of the BLAST tool was used?
      b) Which reference data set was used?
      c) Which version?
      d) Was the install complete/sane? (ref. MD5 files and updates)
      e) Are there any later versions of tool or reference data set? How do I keep my Galaxy instance up to date? (going through the lengthy database download+config again?)
      f) How can a Galaxy workflow using BLAST+ be shared with another Galaxy instance (supposedly easily started with Docker), when manual download and configuration of databases are required?

      Your article does not mention how the BLAST+ tool for Galaxy addresses any of these concerns. The use of the Galaxy Tool Shed should in theory allow for automatic updating of the tool - and I believe the BLAST tools would output log information that includes at least the version number.
I am worried that the dataset description that is entered manually by the system administrator into /galaxy-central/tool-data/blastdb.loc and friends contains an element of "manual versioning", as the example contains

      nt_02_Dec_2009 nt 02 Dec 2009 /depot/data2/galaxy/blastdb/nt/nt.chunk

      wgs_30_Nov_2009 wgs 30 Nov 2009 /depot/data2/galaxy/blastdb/wgs/wgs.chunk

      This sounds very error-prone and, as older datasets are not available from NCBI, definitely not reproducible. I would expect the article to at least acknowledge these concerns - and ideally for the tooling to support this (e.g. through the BLAST Data Manager and additional provenance output from the BLAST+ tools, e.g. in W3C PROV format).
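
      The before-and-after MD5 check proposed above could be automated along these lines. This is a minimal sketch under assumptions: it presumes each *.md5 file follows the common md5sum layout ("<hex digest>  <file name>"), it takes the two checksum texts as strings so that the download step itself is left out, and the function names are hypothetical.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_md5(md5_text: str) -> str:
    """Extract the digest from md5sum-style text (assumed layout)."""
    return md5_text.split()[0].lower()


def download_is_consistent(path: str, md5_before: str, md5_after: str) -> bool:
    """Check one downloaded file against checksums fetched before and after.

    Returns False if the published checksum changed while downloading
    (a new release appeared mid-download) or if the file does not match.
    """
    before = expected_md5(md5_before)
    after = expected_md5(md5_after)
    if before != after:
        return False
    return md5_of_file(path) == before
```

      Running this check for every numbered fragment of a database, and recording the digests alongside the download date, would also give a crude version record for the provenance questions a) to d).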

      5: Results workflows unavailable

      We now describe some use-cases and workflows combining these tools within Galaxy. The first two examples:

      • Assessing a de novo assembly
      • Finding genes of interest in a de novo assembly

      do not link to any actual Galaxy workflow descriptions, but are only described as bullet point lists. The descriptions do not link to any examples for "Upload ** sequence" or of the expected outputs. "Identifying candidate genes clusters" is described in more detail, but the workflow is only included as a visual Figure, and not in the supporting data or uploaded/linked to an external repository like the mentioned myExperiment. The citation for this workflow, [22] http://dx.doi.org/10.1021/ja501630w is not Open Access, and I was required to use the University of Manchester. The article does not mention the word "workflow" once, and does not seem to contain any data citations for the workflow, only for the sequence. The only supporting information provided at http://pubs.acs.org/doi/suppl/10.1021/ja501630w is a PDF with tables, graphs and sequence views. Again the word "workflow" is not mentioned. A direct link to the workflow definition should be included for all three examples.

      Discretionary Revisions

      Spelling corrections for product/company names p2:

      • MyExperiment -> myExperiment
      • Amazon Inc. -> Amazon AWS
      • "Cloud Computing" -> Cloud Computing
      • Galaxy "CloudMan" -> Galaxy CloudMan
      • "Galaxy Tool Shed" -> Galaxy Tool Shed

      p5:

      • "such FASTA format" -> "such as FASTA format"
      • Docker Inc. -> Docker Inc. (https://www.docker.com/)
      • Galaxy "CloudMan" -> Galaxy CloudMan

      Useful hyperlinks:
      • whose functional tests are then run. -> whose ..then run (https://travis-ci.org/peterjc/galaxy_blast).
      • The Galaxy-P project -> The Galaxy-P project https://usegalaxyp.org/
    4. Now published in GigaScience doi: 10.1186/s13742-015-0080-7 Peter J. A. Cock (1; for correspondence: peter.cock@hutton.ac.uk), James E. Johnson (2), Nicola Soranzo (4). Affiliations: 1 Information and Computational Sciences, James Hutton Institute, Invergowrie, Dundee DD2 5DA, Scotland, UK; 2 Minnesota Supercomputing Institute, University of Minnesota, 599 Walter Library, 117 Pleasant St. SE, 55455, Minneapolis, Minnesota, USA; 4 CRS4, Loc. Piscina Manna, 09010 Pula (CA), Italy.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1186/s13742-015-0080-7), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

    1. Now published in GigaScience doi: 10.1186/s13742-015-0101-6 Judith Risse (1), Marian Thomson (1), Garry Blakely (2), Georgios Koutsovoulos (3), Mark Blaxter (1, 3), Mick Watson (1, 4). Affiliations: 1 Edinburgh Genomics, School of Biological Sciences, The King’s Buildings, The University of Edinburgh, EH9 3FL; 2 Institute of Cell Biology, School of Biological Sciences, The King’s Buildings, The University of Edinburgh, EH9 3BF; 3 Institute of Evolutionary Biology, School of Biological Sciences, The King’s Buildings, The University of Edinburgh, EH9 3FL; 4 The Roslin Institute, University of Edinburgh, Easter Bush, EH25 9RG.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1186/s13742-015-0101-6 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100346 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100344

    1. Now published in GigaScience doi: 10.1093/gigascience/giab060 David Johnson (1, 2), Alejandra Gonzalez-Beltran (1, 5), Kenneth Haug (3, 6), Massimiliano Izzo (1), Martin Larralde (7), Thomas N. Lawson (8), Alice Minotto (4), Pablo Moreno (3), Venkata Chandrasekhar Nainala (3), Claire O’Donovan (3), Luca Pireddu (9), Pierrick Roger (10), Felix Shaw (4), Christoph Steinbeck (11), Ralf J. M. Weber (8, 12), Susanna-Assunta Sansone (1), Philippe Rocca-Serra (1). For correspondence: susanna-assunta.sansone@oerc.ox.ac.uk, philippe.rocca-serra@oerc.ox.ac.uk. Affiliations: 1 Oxford e-Research Centre, Department of Engineering Science, University of Oxford, 7 Keble Road, OX1 3QG, Oxford, United Kingdom; 2 Department of Informatics and Media, Uppsala University, Box 513, 751 20 Uppsala, Sweden; 3 European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom; 4 Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, United Kingdom; 5 Science and Technology Facilities Council, Scientific Computing Department, Rutherford Appleton Laboratory, Harwell Campus, Didcot, OX11 0QX, United Kingdom; 6 Genome Research Limited, Wellcome Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Saffron Walden CB10 1RQ, United Kingdom; 7 Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL), Meyerhofstraße 1, 69117 Heidelberg, Germany; 8 School of Biosciences, University of Birmingham, Edgbaston, Birmingham, B15 2TT, United Kingdom; 9 Distributed Computing Group, CRS4: Center for Advanced Studies, Research & Development in Sardinia, Pula, Italy; 10 CEA, LIST, Laboratory for Data Analysis and Systems’ Intelligence, MetaboHUB, Gif-Sur-Yvette F-91191, France; 11 Cheminformatics and Computational Metabolomics, Institute for Analytical Chemistry, Lessingstr. 8, 07743 Jena, Germany; 12 Phenome Centre Birmingham, University of Birmingham, Edgbaston, Birmingham, B15 2TT, United Kingdom.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giab060), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: Kevin Menden In the paper "ISA API: An open platform for interoperable life science experimental metadata" Johnson et al. present an extensive Python API for reading, writing and handling of metadata in the ISA format. The authors describe the increasing use of the ISA formats and thus indicate the need for better tools to handle such data. The article is well written and easy to understand. The ISA tools package contains extensive functionality and solid documentation. Furthermore, it can be installed with PyPi and Bioconda, which I think should be standard nowadays. The authors furthermore provide a docker image, which is nice. All in all, I think the ISA tools package is a genuinely useful piece of software that is well written, which is why I recommend this manuscript for publication in GigaScience. However, a few minor things should be changed. Personally, I would like to know whether support for the upload to additional databases will be added in the future; this could be noted in the text. The article contains many figures with only little content. I would strongly advise merging some of these figures into a smaller set of figures to improve the readability. The authors spend a considerable amount of text on download statistics, something that in my opinion is not really that relevant for the software package. I would recommend considerably shortening this section. On a similar note, the methods section basically just describes how these download statistics were handled. Considering this article describes a software package, it might be more useful to the reader (and reviewer) to elaborate a bit on how the software is written, maintained, structured and tested, and related things.

      Reviewer 2: Manuel Holtgrewe The authors describe the Python library "isatools" for accessing ISA (investigation study assay) files in ISA-tab and ISA-json format. The authors start by describing their previous work around the ISA data model and file formats in detail. They then describe their implementation and the features of their API. They highlight the extensibility and efficiency of their object-oriented model. They describe in detail how metadata can be curated in ontologies and that extensions are currently underway for the assisted creation of study metadata. They then refer to early adopters and a stable and growing community. They conclude with the statement that their library is "a major step forward in making the ISA framework open and interoperable".

      General Remarks

      Overall, we have found the ISA data model and ISA-tab data format to be very useful in our own work. However, there are some issues with the software, including apparent bugs as described below. In 2018, my colleagues and I considered using ISA-API in our project for ISA-Tab parsing, but the problems and the lack of automated tests made us roll our own (also see below). Overall, the authors make a clear point and the paper is well written. However, the software appears to be unfinished and some work is required to make it suitable for publication.

      Major Issues

      1. The ISA-creator and Bio-GraphIIn are cited as "helped grow the ISA community of users". The authors should offer evidence for this as (a) by our own experience ISA-creator is very hard to use and this is also reflected by the expressed opinion on ISA-creator by anyone I have met so far who has used it and (b) it is not possible to validate how Bio-GraphII has helped grow the community as the website linked to in the cited article is not available anymore and no source code is available, e.g., on Github. The Google groups forum has fewer than 10 threads per year, with 2 in 2020 so far and one in 2019. The authors should balance these counts with their "PyPi" download counts statistics.
      2. The authors should cite other published APIs for ISA file formats, e.g., AltamISA.
         • Kuhring, Mathias, et al. "AltamISA: a Python API for ISA-Tab files." Journal of Open Source Software 4.40 (2019): 1610.
      3. The authors should show proof for "efficiency" of their object-oriented model, e.g., by comparing import efficiency with that of altamISA. I'm raising this point as some users raised questions on efficiency when loading/writing data files in the ISA-API Github Issues.
      4. The authors write that development is in progress but it appears from the Github code frequency graph that development has mostly stalled since 2018.
      5. The authors should explain in more detail how stable their API is and what the limitations and assumptions are. In my opinion, one important point in data import and export is how the data looks after a "round-trip", e.g., import ISA-Tab, followed by export ISA-Tab. I have done this on the official ISA data sets (https://github.com/ISA-tools/ISAdatasets, commit f20be4f83dc5f6f7ec419bfd634efba3177e4ae4). Here are the (to me unexpected) results for the official example data: (a) on BII-I-1, whole columns disappear, such as the first "Material Type" column; (b) all other datasets fail to parse, and parsing crashes with Python exceptions. I think the authors should work on these points. It cannot be judged whether the software can be published at this point. The software appears unfinished and some more work has to go into it to allow for publication.

      Minor Issues

      1. The authors should provide more automated tests for their software. In 2018 when we tried out the package we found some inconsistencies and problems but found it hard to fix bugs in the large body of software because of the lack of comprehensive automated tests.
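The "round-trip" concern in this review (columns such as "Material Type" disappearing between import and export) can be illustrated with a minimal sketch. It does not call the ISA-API or AltamISA; it only compares the header rows of two tab-delimited tables, and the sample tables, the function names `read_header` and `lost_columns`, and the row values are hypothetical. Because ISA-Tab headers can legitimately repeat, the sketch counts occurrences rather than comparing sets.

```python
import csv
import io
from collections import Counter


def read_header(tsv_text):
    """Return the first row (column headers) of a tab-delimited table."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return next(reader, [])


def lost_columns(original_text, exported_text):
    """Count headers present in the original table but missing after export.

    Counter subtraction keeps only positive counts, so a repeated header
    that survives once but originally appeared twice is reported as 1 lost.
    """
    before = Counter(read_header(original_text))
    after = Counter(read_header(exported_text))
    return dict(before - after)


# Hypothetical study table before and after a round-trip (not real ISA data).
original = (
    "Source Name\tMaterial Type\tSample Name\tMaterial Type\n"
    "culture1\tcell\tsample1\textract\n"
)
exported = (
    "Source Name\tSample Name\tMaterial Type\n"
    "culture1\tsample1\textract\n"
)

print(lost_columns(original, exported))
```

In an automated test, the two strings would be the contents of an input file and of the file written back by whichever parser is under test; an empty result means no header was dropped, although cell values would still need a separate comparison.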

      Reviewer 3: Chris Hunter The manuscript is well written and coherent; it provides a nice balance of historical context of ISA-Tools and the current release of the ISA-API. As a biologist and Biocurator I can attest to the importance of simple-to-use tools for curation of datasets, and the ISA-creator has been well used by the community over the years. The addition of the ISA-API should allow for more repositories to incorporate the use of ISA formats as both import and export formats for datasets. I have to admit that my lack of experience as a developer means that I am in no position to actually test the API's functionality, so I cannot comment as to the technical suitability of the implementation or even whether it works or not! I have been requested to review the manuscript with specific reference to the original reviewer 2 concerns:

      Reviewer 2 comment 1. "The ISA-creator and Bio-GraphIIn are cited as "helped grow the ISA community of users". The authors should offer evidence for this as; (a) by our own experience ISA-creator is very hard to use and this is also reflected by the expressed opinion on ISA-creator by anyone I have met so far who has used it and (b) it is not possible to validate how Bio-GraphII has helped grow the community as the website linked to in the cited article is not available anymore and no source code is available, e.g., on GitHub. The Google groups forum has less than 10 threads per year, with 2 in 2020 so far and one in 2019. The authors should balance these counts with their "PyPi" download counts statistics." My comment: I believe the authors have addressed the primary concern about the evidence of continued growth in the ISA user community with the detailed description of the PyPi download statistics. The issue raised by the reviewer about ISA-creator's user experience, and the anecdotal comment about all who have used it, is unfounded and in fact, if true, adds to the argument for the implementation of the ISA-API as a means to allow a wider developer base to improve the ISA-creation experience.

      Reviewer 2 comment 2. "The authors should cite other published APIs for ISA file formats, e.g., AltamISA. Kuhring, Mathias, et al. "AltamISA: a Python API for ISA-Tab files." Journal of Open Source Software 4.40 (2019): 1610." My comment: The authors have made appropriate changes and included the suggested reference.

      Reviewer 2 comment 3. "The authors should show proof for "efficiency" of their object-oriented model, e.g., by comparing import efficiency with that of AltamISA. I'm raising this point as some users raised questions on efficiency when loading/writing data files in the ISA-API GitHub Issues." My comment: The authors have replaced the word efficiency with coherent in the manuscript to clarify the meaning in the relevant paragraph. However, I'm not sure they have addressed the principle of the concern raised by reviewer 2, i.e. how does the ISA-API compare to other existing models in terms of efficiency? As I have no idea how to measure the "efficiency" of a model, I'm not convinced this is a valid request from reviewer 2.

      Reviewer 2 comment 4. "The authors write that development is in progress but it appears from the GitHub code frequency graph that development has mostly stalled since 2018." My comment: I agree with the authors' rebuttal of this point; simply looking at GitHub commits is not a suitable measure.

      Reviewer 2 comment 5. "The authors should explain in more detail how stable their API is and what the limitations and assumptions are. In my opinion, one important point in data import and export is looking how data looks after a "round-trip", e.g., import ISA-Tab, followed by export ISA-Tab. I have done this on the official ISA data sets (https://github.com/ISA-tools/ISAdatasets, commit f20be4f83dc5f6f7ec419bfd634efba3177e4ae4). Here are the (to me unexpected results for official example data): (a) On BII-I-1, whole columns disappear such as the first "Material Type" column, (b) All other datasets fail to parse and parsing crashes with Python exceptions." My comment: Unfortunately, my lack of the required skill set to run any sort of tests myself means I am not in a position to adjudicate on this point! I do agree with the authors' rebuttal that they cannot assess the reviewer's issues based on the minimal information provided in the review. As the authors point out, documentation can always be improved, and one such improvement might be to include a "round-trip" example, as reviewer 2 has attempted, to show that one can take a valid ISA-formatted input, convert it to, say, SRA format and back to ISA format using the API, and that the input and output ISA formats do indeed match.

      Reviewer 2 minor issue 1. "The authors should provide more automated tests for their software. In 2018 when we tried out the package we found some inconsistencies and problems but found it hard to fix bugs in the large body of software because of the lack of comprehensive automated tests." My comment: I think this reviewer's comment is unrelated to the review; they are talking about a version of the tool that is approximately 3 years old, not the current version that they are meant to be reviewing. Despite the irrelevance, the authors have responded by adding text to highlight the Test Driven Development approach taken in the project.

      With the one caveat already mentioned, i.e. I am unable to actually test the code so I am reliant on the other reviewer to have covered that aspect of the review, I believe the manuscript is suitable for publication, as the authors have adequately addressed all of the reviewer 2 comments with the possible exception of improved documentation.
    1. Now published in GigaScience doi: 10.1186/s13742-015-0046-9

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1186/s13742-015-0046-9 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100015 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100017

    1. Now published in GigaScience doi: 10.1186/2047-217X-3-22 Joshua Quick (1, 2), Aaron Quinlan (3), Nicholas J Loman (1; for correspondence: n.j.loman@bham.ac.uk). Affiliations: 1 Institute of Microbiology and Infection, University of Birmingham, Birmingham, UK; 2 NIHR Surgical Reconstruction and Microbiology Research Centre, University of Birmingham, Birmingham, UK; 3 University of Virginia, Virginia, US.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1186/2047-217X-3-22), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100173 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100172

    1. Abstract

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1186/2047-217X-3-3), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100110 Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100109

  2. Dec 2021
    1. Although

      Reviewer 1. Aboozar Soorni

      Have any claims of performance been sufficiently tested and compared to other commonly-used packages? No

      Are there (ideally real world) examples demonstrating use of the software? No

      Is automated testing used or are there manual steps described so that the functionality of the software can be verified? No

      Recommendation: Accept

      Reviewer 2. Weiwen Wang This manuscript is easy to read, with good writing and detailed methods. By comparing long-read, short-read and hybrid assemblies, this manuscript identified the best approach to assemble plastid and mitochondrial genomes. Additionally, the authors considered the multiple structures of plastid and mitochondrial genomes and provided a new and carefully designed method to assemble and assess the complex mitochondrial genome. While this manuscript represents solid work and was interesting to read, I have some minor concerns which should be fixed to improve the quality of the manuscript.

      Line 240-247: Maybe it would be easier to understand if the authors numbered each contig. For example, “The assembly graph suggests the typical quadripartite structure of a LSC (contig 1-7) as the larger circle in the graph…...”. In some figures the authors numbered the contigs, but in others they did not. Also, in Figure 2, why does the SSC region also have almost 2x coverage (1.88x)?

      Line 253-258: Figure 3 shows three contigs (92 kb, 38 kb and 5 kb) rather than the two contigs (81 kb and 92 kb) described in the manuscript. I guess the authors included the wrong figure?

      Line 283-285: It is a smart method to clearly show the assembly details.

      Line 385-387: Does it mean that the black segment (edge 11) in Figure 12 consists of two highly similar (or identical) regions? Have the authors tried a simple BLAST to confirm this?

    2. Abstract

      A version of this preprint has been published in the Open Access journal GigaByte (see paper https://doi.org/10.46471/gigabyte.36), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

    1. Abstract

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giab081), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      These peer reviews were as follows:

      Reviewer 1. Qi Zhou

      Mueller et al. presented a high-quality genome and annotation of tufted duck with combined long-read and short-read techniques here. Tufted duck shows a different susceptibility to avian influenza A viruses (AIV) compared to mallards that share the habitat. So besides adding a new avian genomic resource, tufted duck genome may facilitate the research into the genetic basis of AIV infection. Overall, I think the genome is of high quality, but I do have several comments below:

      The introduction is largely devoted to the great advantage of PacBio over Illumina techniques in elucidating the genome features of non-model species. This is not needed for the readers of GigaScience. I suggest the authors provide more information on the tufted duck. From previous studies, how diverged are the tufted duck and mallard, in terms of millions of years and at the sequence level? What is the phylogenetic position of the tufted duck in Galloanserae? Are there any lab or field studies of the tufted duck's susceptibility to AIV? What is the potential genetic cause? Also, since it is known in mallards that RIG-I is responsible for the AIV response, is this gene intact, and how is it expressed, in the newly presented tufted duck genome?

      The analyses part needs to show the repeat content of the tufted duck and its comparison to other avian genomes, particularly the repeat content of the Z and W chromosomes. Did the authors look into the centromere or telomere sequences?

      Are the Z and W chromosomes assembled into two intact sequences? If so, evidence is needed to show that there is no chimeric assembly between Z and W, or other autosome sequences, as it is mentioned in the paper that 'most of the genome separated into haplotypes'.

      Tissue-specific expression part: Here what does 'supported' gene mean exactly? Just to make sure, do the authors mean 'genes' or 'transcripts'? 'Stringtie2 may discard single-exon transcript model..' Did the authors find that the Stringtie2 results generally have a much lower proportion of single-exon transcripts compared to, say, Iso-Seq data? 'The average number of transcripts in the long-read pipeline often almost matched..': I am confused here, as Figure 5 shows the opposite result: the supported PacBio genes are much lower in number than those produced by Illumina reads. Are there any results to support this claim? 'This distribution is much more balanced in the long-read pipeline..': here the authors may be suggesting that PacBio Iso-Seq recovers more alternative splicing transcripts than Illumina data. But it is unclear why the supported genes of Iso-Seq are so much lower in number than those of Illumina; might this be caused by the relatively lower coverage of Iso-Seq? So I would conclude that, at least in this study, both techniques are complementary to each other, rather than one performing better than the other.

      How are the TEs annotated by these small RNAs? Apart from miRNAs, there should be large portions of piRNAs mapping to TEs.

      Re-Review: For question 3: The authors need to explain more about why they think Figure S1 shows there is no chimeric assembly between the Z and W chromosomes. Figure S1 is just a Hi-C matrix plot, and among the submitted materials I also cannot find the legend explaining the figure.

    2. Background

      Reviewer 2. Joshua Peñalba This data note by Mueller et al. describes the high-quality, chromosome-scale assembly of the tufted duck and the gene annotation using both Illumina and PacBio sequencing. The authors present and compare the resulting annotations from the different sequencing platforms which is useful for researchers intending to do RNAseq for annotation.

      I think the details in the note focus primarily on the gene annotation comparison and the genome assembly has not received adequate attention. Since this will likely be the data note that reports on the genome assembly, it should probably have more details on the chromosomes (in detail below). I understand that one can go into an exhaustive description, but I think these as a minimum will give the reader a good idea about the quality of the genome assembly:

      Are the 34 autosomes + sex chromosomes expected? Was there an a priori expectation based on the karyotype or based on the mallard genome assembly? Was this expectation provided during the scaffolding using HiC? Does the assembly size match the expected genome size based on an independent estimate? Since this is a chromosome-scale assembly, what are the metrics of individual chromosomes? I see in NCBI that the chromosome numbers have been assigned; is this based on size or homology to chicken chromosomes? What are the lengths of each chromosome? GC content? Gene content? Gaps? How many contigs were scaffolded by HiC to build each chromosome? More detail is needed regarding how the manual curation was done so that it can be repeated by other researchers. What is the sequencing effort (# lanes, # SMRT cells, etc.) and resulting coverage of the genome from each technology? This doesn't have to be very detailed if it is reported elsewhere, but some idea for the reader will be helpful. I am aware that the VGP pipeline was used for the assembly; is there a GitHub repository for the pipeline that can be included, with the specific commands and flags for each step? Since the annotation comparison was exhaustive, I don't have as many comments on it. The authors may not be explicitly making a recommendation on which approach to use, but what is a good metric that the readers can use to compare the results considering the orders of magnitude difference in sequencing coverage?

      Regarding the Illumina and PacBio annotation comparisons, since the coverages are substantially different, by what metric are they comparable? Are they similar in sequencing cost? Would PacBio still underperform in terms of recovered genes if it had the same coverage as the Illumina libraries? What was the sequencing effort for the PacBio IsoSeq?

    1. Abstract

      This paper has been published by GigaByte, which openly shares its peer reviews under a CC-BY4.0 licence.

      Reviewer 1. Qiye Li

      Are all data available and do they match the descriptions in the paper? No. The availability of all raw sequencing data generated in this study is not stated. It would be appreciated if the authors could provide a table summarizing all the sequencing data generated in this study.

      Is the data acquisition clear, complete and methodologically sound? Yes. Could you also provide the gender information for woy03?

      Is there sufficient detail in the methods and data-processing steps to allow reproduction? No

      L145-146: It is unclear how the authors determined full-length protein-coding genes by BLAST against the Swiss-Prot non-redundant database. It would be appreciated if the authors could provide more details here.

      L183: The authors indicated that 15,904 of the 24,655 protein-coding genes were supported by mRNA evidence and 1,309 by protein evidence. Does the mRNA evidence come from the RNA-seq data? Where does the protein evidence come from?

      Is there sufficient data validation and statistical analyses of data quality? No

      L233: Contaminating sequences in the reference genome are noteworthy, as the DNA for genome sequencing was extracted from wild animals that were dead before sampling. However, I would say high mapping rates do not necessarily indicate low levels of contaminating DNA, as the contaminating DNA (e.g. from bacteria), if it exists in your dataset, might have been assembled as part of the woylie reference genome. It is unclear if the authors have submitted the genome to NCBI. If so, I think they should have received a report about contamination from NCBI.

      It would be appreciated if the authors could provide some more statistics for protein-coding genes (e.g. Mean gene size, Mean exon number per gene, Mean exon length and Mean intron length) and compare these metrics to other marsupials. This will be helpful to judge the quality of the gene models.

      Is the validation suitable for this type of data? Yes

      Is there sufficient information for others to reuse this dataset or integrate it with other data? Yes

      Recommendation: Minor Revision

      Reviewer 2. Parwinder Kaur Well presented document with good data and analyses practices.

      Recommendation: Accept

      Reviewer 3: Walter Wolfsberger Is there sufficient information for others to reuse this dataset or integrate it with other data? Yes

      The submission body and Table 1 state the following assembly statistics, which seem to indicate some potential issues:

      • Genome size (Gb): 3.39
      • No. scaffolds: 1,116
      • No. contigs: 3,016
      • Scaffold N50 (Mb): 6.94
      • Contig N50 (Mb): 1.99

      The main issue here for me lies in the scaffold N50 in relation to the other parameters, when compared with assemblies that used a similar methodological approach.

      This can either be good or bad, as these numbers might indicate an issue during scaffolding, or the presence of long top assembly scaffolds (which is great). I believe that the submission would benefit significantly if this information were mentioned and discussed.
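For readers less familiar with the statistic discussed here, the following is a minimal sketch of how an N50 is computed from a list of sequence lengths. The function name `n50` and the example lengths are illustrative assumptions, not data from the woylie assembly. It also shows why a few very long scaffolds can dominate the value, which is the ambiguity raised in this comment.

```python
def n50(lengths):
    """Return the N50 of a set of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Made-up lengths: two assemblies with the same total size (32 units).
even_assembly = [4, 4, 4, 4, 4, 4, 4, 4]
top_heavy_assembly = [16, 4, 2, 2, 2, 2, 2, 2]

print(n50(even_assembly), n50(top_heavy_assembly))
```

Because the statistic depends only on the sorted cumulative sum, a single long scaffold that covers half of the assembly sets the N50 regardless of how fragmented the remainder is, so reporting the scaffold count and the longest scaffold alongside it helps interpretation.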

      The approach used to generate the assembly seems to utilize 10x PE sequences to scaffold the assembly. There are hybrid assembly approaches available, that leverage short reads to improve the assembly quality, given the slightly limited coverage of PacBio HiFi reads (approx. 12x).

      Recommendation: Minor Revision

      Re-review: The authors addressed all my assembly-related comments sufficiently and provided updates that will benefit the manuscript and the data released with it.

  3. Nov 2021
    1. The mule deer

      Reviewer 2. Dr. Rebecca Taylor

      Are all data available and do they match the descriptions in the paper? Yes. I checked the two links included in text as well as the NCBI data availability and all data is available for download with good explanations as to what all the files are.

      Are the data and metadata consistent with relevant minimum information or reporting standards? No. On the whole the information included for the data and metadata is good, but perhaps some more information about the sample used would be beneficial. It is stated that the sample came from 'Woodland Hills, Utah'. I assume this is in the United States? Some more information about the environment would be beneficial for those unfamiliar with the area. It is also not stated which subspecies the sample belongs to. A map of the species range and where this sample is from could also help. Additionally, in the 'Background and context' section, you state that 'genetic resources available for Odocoileus spp. are limited to a variety of microsatellite loci' with the exception of Russell et al. I think you need to do a more thorough search. For example, I have used a Sitka deer genome (Odocoileus hemionus sitkensis) as an outgroup for my work, sequenced by the CanSeq150 program and found on the NCBI under Bioproject PRJNA476345.

      Is the data acquisition clear, complete and methodologically sound? No. I was a bit confused by the sentence 'The assembled mule deer genome has a total length of 2.61 Gbp with a GC content of 41.8% and a contig N50 of 28.6 Mbp (Table 1) with a longest contig of roughly 96.5 Mbs' occurring before the 'Chromosome-length Scaffolding' section. Are these assembly statistics for the version of the genome before chromosome scaffolding? It would be better to report the final assembly statistics for the chromosome-scale assembly, or, if these are the final statistics, to move those results until after the 'Chromosome-length Scaffolding' section. For Table 1, it might be nice to include the L/N90. The most standard are the N/L50 and the N/L90, so you don’t necessarily need all of the others. Additionally, I could not find where it is stated how many 'chromosomes' are in the final assembly. Is this a chromosome assembly or (more likely) a chromosome-scale assembly (so there are also other scaffolds not included in the main ‘chromosomes’)? A relevant, newly published paper might be good to reference: Yamaguchi et al. Technical considerations in Hi-C scaffolding and evaluation of chromosome-scale genome assemblies. Molecular Ecology.

      I also think it would be beneficial to know what the coverage of the files you used for the PSMC analysis was, making sure to filter out sites with more than ~double the average depth (I can see that you filtered for a maximum of 90X but I don't know whether this is appropriate or not). Also, a citation for the generation time used would be beneficial, as this strongly influences PSMC results. It would also be good to explain the recent rise in effective population size seen in the white-tailed deer. Is this a real pattern, or does it arise because PSMC can be spurious at more recent times? This is also a reason why it would be good to know the depth of both files used here. Could the different demographic histories be caused by competition between the species? I am not an expert on these species but I was just curious given their contrasting demographic histories.
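      The N50/L50 and N90/L90 statistics requested above can be computed directly from the list of contig or scaffold lengths. The following is a minimal Python sketch under the usual definition (sort lengths in descending order and accumulate until the chosen fraction of the total assembly length is reached); the function name `nx_lx` and the toy lengths are hypothetical and are not taken from the manuscript.

```python
def nx_lx(lengths, fraction):
    """Return (Nx, Lx) for a list of contig or scaffold lengths.

    Nx is the length of the shortest sequence in the smallest set of
    longest sequences that together cover `fraction` of the assembly.
    Lx is the number of sequences in that set.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(lengths, reverse=True)
    target = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= target:
            return length, count
    # Only reachable through floating-point rounding of the target.
    return ordered[-1], len(ordered)


# Hypothetical toy assembly (lengths in Mbp), not the mule deer data.
toy_lengths = [40, 30, 15, 10, 5]
n50, l50 = nx_lx(toy_lengths, 0.5)
n90, l90 = nx_lx(toy_lengths, 0.9)
print(f"N50={n50} L50={l50} N90={n90} L90={l90}")
```

      Reporting the pair at both 50% and 90% shows how much of the assembly sits in a small number of long scaffolds and how fragmented the remainder is.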

      Is there sufficient data validation and statistical analyses of data quality? Not my area of expertise

      Is there sufficient information for others to reuse this dataset or integrate it with other data? No. As I stated above, information about which subspecies this individual is from in text would be beneficial.

      Recommendation: Minor revision

    2. ABSTRACT

      This paper has been published in GigaByte Journal (https://doi.org/10.46471/gigabyte.34), where it and the open peer reviews are published under a CC-BY 4.0 license.

      Reviewer 1. Dr. Endre Barta See comments in additional submitted file. Recommendation: Minor revision

    1. Mycobacterium avium subsp

      Reviewer 2. Dr. Nabeeh A. Hasan Is the language of sufficient quality? Yes. A few minor grammatical edits could be done.

      Are all data available and do they match the descriptions in the paper? No. The data are not currently accessible by the public in NCBI.

      [The curators will be in touch to make sure all the data is live - see GigaDB guidelines on the data they require http://gigadb.org/site/guide]

    2. Abstract

      This article has been published open access in GigaByte Journal (https://doi.org/10.46471/gigabyte.33), which also publishes open peer reviews under a CC-BY 4.0 license.

      Reviewer 1. Dr. Astrid Lewin

      Is there sufficient detail in the methods and data-processing steps to allow reproduction?

      Chapter „Methods, b) Bacterial isolation and DNA extraction”: Lines 139-140: There is a discrepancy between the method of DNA extraction as described in the reference (16) and the manuscript text. While according to the reference the bacterial pellet is dissolved in acetone, the manuscript text describes a treatment with chloroform and methanol. This should be clarified.

      Is there sufficient data validation and statistical analyses of data quality? Not my area of expertise

      Any Additional Overall Comments to the Author: Chapter “Data Validation and quality control, Identification of MAH”, lines 218-220: It is true that the isolates had the highest identity with one of the three MAH strains, but not with all three MAH reference strains. For example, isolate OCU468 has 98.69% identity with MAH TH135 but 98.79% identity with MAP K-10. The degree of identity seems to be highly dependent on the choice of strains. Therefore, this comparison may not be very significant. In my experience, growth at 42°C distinguishes MAH very well from the other M. avium subspecies.

      Recommendation: Accept

    1. Abstract

      This paper has been published by GigaScience (doi:10.1093/gigascience/giab076) which publishes the peer reviews openly under a CC-BY license.

      Reviewer 1. Ayush Dogra An overview of the National COVID-19 Chest Imaging Database: data quality and cohort analysis Comments: 1) The abstract is not very convincing or informative. Please refine it. 2) What is the motivation for this work? Please include it in the manuscript. 3) The authors could provide a more appealing block diagram for Figure 1. 4) The Inclusion Criteria section is a bit ambiguous. How were these criteria decided? Please justify. 5) How does your manuscript differ from other manuscripts? Kindly include this in the manuscript. 6) Refine the discussion section. 7) There are a few linguistic and grammatical errors. Please correct them. 8) The similarity index must be less than 10 percent.

      Reviewer 2. Chris Armit This excellent Data Note provides an overview of the National COVID-19 Chest Imaging Database (NCCID), which is a centralised repository that hosts DICOM format radiological imaging data relating to COVID-19. By the very nature of this resource these data have immense reuse potential. The NCCID is the first national initiative of its kind - led by NHSX, British Society of Thoracic Imaging, and the Royal Surrey NHS Trust and Faculty - and the database hosts approximately 20,000 thoracic imaging studies related to SARS-CoV2 admissions from 20 NHS Hospitals / Trusts across England and Wales. Of note, the NCCID is additionally registered on the Health Data Research UK platform, with a platinum metadata rating which is a commendable achievement.

      As part of this review, I used the NCCID Data Access Agreement, NCCID Data Access Framework Contract, and NCCID Application Form to gain access to the NCCID Project WorkSpace. This WorkSpace utilises the very powerful and highly intuitive faculty.ai platform to run Jupyter Notebooks on a remote server where the NCCID data can be accessed. I was impressed that the faculty.ai platform allows very many different views of the NCCID data, for example one option was to view the data by Scanner Type. This is an important consideration from a deep learning reuse perspective as it is known that different X-ray / CT scanners can introduce different artefacts, and this can confound multisite analysis (for example see Badgeley et al., 2019, https://doi.org/10.1038/s41746-019-0105-1). I find NCCID's organisation of the imaging data in this way particularly helpful for addressing this issue.

      I was additionally impressed that the NHS Analytics Unit was willing to provide an Onboarding Session to help a naïve user navigate the faculty.ai platform more effectively, and to provide one-on-one tuition on how the interface can be used for image analysis. I used this session to explore the functionality of the DICOM viewer that can be used to preview NCCID thoracic images. A Javascript viewer enables a user to open DICOM images and explore the image histogram of intensity values and I see this as a useful means of assessing, for example, contrast stretching in radiological image data that has been submitted to NCCID. As a follow-up to this Onboarding Session, there is now the additional option to launch a static viewer that offers a higher quality preview image of NCCID DICOM data. I find this functionality exceptionally helpful as it enables an end-user to preview image data and to visually inspect, for example, glassy nodules in COVID-19 thoracic image data prior to data download. I thank the NHS Analytics Unit for further developing the image visualisation capabilities of the NCCID Project WorkSpace as part of this review process. On this note I wish to highlight that, of the two viewers, I found the static viewer particularly helpful for assessing image quality of CT scans which was excellent.

      I was further impressed that the thoracic imaging data includes a positive cohort with COVID-19, but also a negative cohort consisting of individuals with a negative swab test, but who may have a different underlying respiratory condition. This is an important consideration and it enables this dataset to be used for machine learning and deep learning approaches that could be used to distinguish between COVID-19 and other respiratory conditions in what remains a clinically relevant challenge.

      Importantly, the code for the NCCID data warehouse and the Data Cleaning pipeline utilised in the paper are Open Source and available on GitHub (https://github.com/nhsx/covid-chest-imaging-database ; https://github.com/nhsx/nccid-cleaning) where they have been ascribed OSI-approved MIT licenses.

      This is an excellent Data Note and I recommend this manuscript for publication in GigaScience.

      Minor comments

      1. The MTA is tailored towards breast cancer screening. For example, there are the following definitions: "Source Database" means the assembled collection of images collated from the research project entitled 'OPTIMAM: Optimisation of breast cancer detection using digital X-ray technology'. "Related Data" means any and all pathological and clinical data associated with the Database Images supplied by or on behalf of CRT or Surrey to Company under this Agreement, in particular but without limitation, this may be identified regions of interest in the Database Images, the age of the woman at the date the relevant Database Image was taken, details about previous screening events, patient history, X-ray, ultrasound assessment, details of biopsy procedures and surgical events - all in a structured format representative in structure, format, quality, content and diversity of the Source Database.

      Can the authors please confirm that this MTA is suitable for thoracic radiology in the mixed sex COVID-19 study outlined in the accompanying preprint?

      2. In support of the manuscript, I further recommend that a copy of the NCCID Data Access Agreement, Data Access Framework Contract, Application Form, and snapshots of the code (GitHub archives) be archived in the GigaScience DataBase (GigaDB).
  4. Oct 2021
    1. Background

      Reviewer 2. Hugo Schnack This manuscript reports on the results of a study that can be split into two parts. For this, it should be noted that the authors consider three categories of quantities. The first category is the input data, or 'predictors': (a) variables derived from MRI scans and (b) rich sociodemographic variables. The second category, or 'target variables', as the authors call them, includes: (a) age, (b) fluid intelligence and (c) neuroticism. In the first part of the study, using machine learning, predictive models are built to predict the target variables from the input variables. The resulting predictions are called 'proxy measures'. For the second stage, a third category of variables is included, the 'real world health behaviours', such as alcohol use and physical activity. The authors now set out to predict these measures of behaviour based on the measures of the second category, either the 'real ones' or the 'proxies'. Thus, the question is, can alcohol use be better predicted by neuroticism determined from a questionnaire, or by the neuroticism proxy derived from MRI and sociodemographics? The main results are presented in Figure 2, and the conclusion made by the authors is that the proxies perform better than the real measures. The authors carry out additional analyses, including the study of the relative importance of MRI and sociodemographics. The authors suggest that these proxies may have clinical use in the future. At first sight it may seem surprising that proxies perform better than the real measure in capturing the associations, but, as the authors mention, the real measures suffer from (measurement) noise and non-objectivity. However, the proxies are biased (in the sense of being too simple) and are thus less capable of modeling the (true) individual variation. I would have expected a more in-depth discussion about this.

      Apart from this, there is an asymmetry in the way age is treated as compared to the other two target variables, intelligence and neuroticism. Age is a very hard measure, without any measurement error, and independent of the brain. The other two targets, intelligence and neuroticism, are softer measures, and directly related to the brain. How does this influence the analyses and the results? Indeed, it is not 'predicted age' that is used as a proxy, but 'brain age delta'. I would have liked to see more explanation and discussion about this. Finally, the suggested clinical use of the proxies is not supported well enough in my opinion. Maybe the authors could add more discussion on this point as well. All in all, this is a scientifically interesting study, but I think the presentation could be improved, by more clearly stating its aims, and by giving more insight into certain aspects of the 'proxy modeling'.

    2. Abstract

      This paper has been published in GigaScience, where the peer reviews are published openly under a CC-BY 4.0 open license.

      Reviewer 1. Bo Cao Reviewer Comments to Author: The manuscript describes an application of Machine Learning (ML) models for the quantification of psychological constructs, e.g. fluid intelligence and neuroticism, using multi-mode MRI data from a large population cohort, the UK biobank data. They show that the proxy measures of these psychological constructs are more useful compared to the original constructs for characterizing health behaviors. Overall, the manuscript is well written. The research questions are clearly stated and are of practical importance. However, the reviewer has the following concerns.

      Major Concerns:

      1) In page 3 (left, lines 3-6 of the main text), the author claims that "Our findings suggested that psychological constructs can be approximated from brain images and sociodemographic variables - inputs not tailored to specifically measure these constructs.". The reviewer has concerns about this claim. Although Figure 3 shows the model's performance in predicting age, fluid intelligence and neuroticism using neuroimaging data and different areas of sociodemographic data, the performance of the models in predicting the psychological constructs, fluid intelligence and neuroticism, may not be good enough to support such a claim.

      2) In Figure 2, the proxy measure and original measure show similar associations with the health phenotypes for fluid intelligence (center plot) and neuroticism (right plot), but not for the brain age delta. The main reason seems to be that, in the association analysis, the measures of the health phenotypes are de-confounded for their dependence on age (in the subsection "Out-of-sample association between proxy measures and health-related habits" of the "statistical analysis" section). However, it seems the same procedure is not applied for the association analysis of fluid intelligence and neuroticism. The estimated brain age or brain age gap depends on age. Thus, we need to either correct the brain age or brain age gap for its dependence on age, or de-confound the health phenotype's dependence on age. If the author wants to derive the proxy measures of the psychological constructs in the same way as the brain age (or biological age), the same procedure should be used to correct the proxy measure's dependence on the original measure.

      3) Based on Figure 2, the author claims that the proxy measures have enhanced association with health behavior compared to the original measures. If we focus only on the central and right parts of Figure 2, the difference is not that obvious. We do not know whether the difference is significant. A better approach may be to correct the predicted fluid intelligence and predicted neuroticism for their dependence on the original measures, or to de-confound the original measures' effects on the health behaviors.
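      The age correction discussed in major concern 2 can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are synthetic, the shrinkage factor of 0.6 (predictions pulled toward the mean age) and the noise level are invented for illustration, and simple linear residualization stands in for whatever de-confounding procedure the authors actually used.

```python
import random


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residualize(values, confound):
    """Remove the linear dependence of `values` on `confound`."""
    intercept, slope = fit_line(confound, values)
    return [v - (intercept + slope * c) for v, c in zip(values, confound)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Synthetic cohort: all numbers below are assumptions for illustration.
rng = random.Random(0)
age = [rng.uniform(45, 80) for _ in range(1000)]
mean_age = sum(age) / len(age)
# Predictions shrunk toward the mean age, plus noise.
predicted_age = [mean_age + 0.6 * (a - mean_age) + rng.gauss(0, 5) for a in age]

delta = [p - a for p, a in zip(predicted_age, age)]  # brain age delta
corrected_delta = residualize(delta, age)            # age regressed out

print("corr(age, delta)           =", round(pearson(age, delta), 3))
print("corr(age, corrected delta) =", round(pearson(age, corrected_delta), 3))
```

      In this toy setting the raw delta is negatively correlated with age purely because the predictions are shrunk toward the mean, while the residualized delta is uncorrelated with age by construction. Applying the same step to the intelligence and neuroticism proxies would put the three comparisons on an equal footing.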

      Minor concerns: 1) In page 1 (two lines before reference 15), it seems that "to learn" is mis-spelled into "tolearn".

      2) The author stated that there are repeated measures for subjects in UK biobank data. How does the author tackle this issue in the data preprocessing? By using the last one, the first one, or something else?

      3) The authors select 5,587 of the 10,975 subjects for modeling, while the remainder is used for the out-of-sample association analysis. The selection seems arbitrary. Can the author also show a learning curve, in which x is the sample size and y is the model's performance, to justify that their choice is enough to train an accurate ML model?

      4) In the first paragraph of the "Methods" section, there are duplications.

      5) In the "Data acquisition" subsection, under the "target measures" paragraph, the age at the baseline recruitment is used as the outcome. However, in general, there is a gap between the age at baseline and the age when the MRI images were acquired. Does this matter for the data analysis in this manuscript?

      6) For the classification analysis (paragraph "Classification analysis" in the subsection of "Comparing predictive models to approximate target measures", and the paragraph above the "Discussion" section), the thresholds selected to discretize the outcome variables are kind of arbitrary.
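      The learning curve requested in minor concern 3 could be produced along the following lines. This is a hedged sketch on synthetic data: a one-variable least-squares model stands in for the authors' models, and the sample sizes, effect size and the helper names `fit_line`, `r_squared` and `learning_curve` are all hypothetical.

```python
import random


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(x, y, intercept, slope):
    """Coefficient of determination of a fitted line on (x, y)."""
    mean_y = sum(y) / len(y)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_res / ss_tot


def learning_curve(x_train, y_train, x_test, y_test, sizes):
    """Fit on the first n training samples and score on a fixed test set."""
    curve = []
    for n in sizes:
        intercept, slope = fit_line(x_train[:n], y_train[:n])
        curve.append((n, r_squared(x_test, y_test, intercept, slope)))
    return curve


# Synthetic data: y = 2x + noise (an assumption, not the UK Biobank data).
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(3000)]
y = [2 * a + rng.gauss(0, 1) for a in x]
x_train, y_train = x[:2000], y[:2000]
x_test, y_test = x[2000:], y[2000:]

sizes = [5, 20, 100, 500, 2000]
curve = learning_curve(x_train, y_train, x_test, y_test, sizes)
for n, score in curve:
    print(f"n={n:5d}  held-out R^2={score:.3f}")
```

      If held-out performance has flattened well before the chosen training size, the split is defensible. If it is still rising, a larger training set would be warranted.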

      Comments on Re-Review: The substantial revision improved the paper and is appreciated by the reviewer. The details have been enhanced. However, the reviewer still has some concerns about the basic logic of the paper and its presentation after reviewing all the comments from other reviewers and the feedback from the author. Figure 1 is helpful (BTW, the font is too small and smaller than in other figures). But if we consider the current approach again, when the machine learning (ML) model performs perfectly in generating the so-called "proxy measures", these measures should exactly match each individual's age, fluid intelligence and neuroticism. What the author claimed about proxy measures providing better assessment of other health-related variables might simply be due to the imperfection, or the "residuals", of the ML predictions relative to the real targets (age, fluid intelligence and neuroticism). The author may need to address this and present the logic of the paper in a clearer way to help readers understand the main point and results of the paper. In this regard, Figure 1 is incomplete in addressing the full flow of the paper, which is necessary for such a seemingly complex paper in the reviewer's opinion.

    1. Abstract

      A revised and updated version adapted from this preprint was published on 6th September 2018 in GigaScience called:

      Sequana coverage: detection and characterization of genomic variations using running median and mixture models https://doi.org/10.1093/gigascience/giy110

      As an open access, open peer review journal the peer reviews of this paper are available here:

      Review 1. http://dx.doi.org/10.5524/REVIEW.101353 Review 2. http://dx.doi.org/10.5524/REVIEW.101350 Review 3. http://dx.doi.org/10.5524/REVIEW.101351

    1. The Bicolor Angelfish

      Reviewer 2. Ole K. Tørresen Is the language of sufficient quality? No. Almost every second sentence in the abstract needs work, and so does the rest of the manuscript.

      General comments: The authors have created a chromosome-level genome assembly of bicolor angelfish using stLFR and HiC libraries.

      The language in this manuscript needs some work. After commenting on every second sentence in the abstract regarding some language matter, I saw that I couldn’t continue commenting all these matters. Please do a good clean-up in the language, so that it is easier to read. I’ll point out some issues during the manuscript, but will not find all and I can’t manage to point out all I do find.

      Specific comments: Line 19: «...special and beautiful two-color body” is a bit subjective. Maybe something like “…remarkable and striking two-color body” instead?

      Line 20: I know this is the abstract, but I don’t understand what “the mechanism of bicolor body” could mean. Maybe rephrase?

      Line 22: I’ve seen this many places, but it should be a lower-case k in kb, not upper case like Kb. The k stands for kilo which is a metric prefix meaning thousand (https://en.wikipedia.org/wiki/Metric_prefix).

      Line 25: “As we are known,” should be “as far as we know”.

      Line 27: “Future research” instead of “future researches”.

      Line 46: Which protocol are you talking about?

      Table 1: How can you end up with more “valid data” than “raw data”? Did you mix up something here? It looks consistent with the text, but there’s likely something wrong.

      Recommendation: Minor Revision

    2. Abstract

      This paper has been reviewed by GigaByte Journal and all peer reviews are shared CC-BY 4.0.

      Reviewer 1. Claudius F Kratochwil. Is the language of sufficient quality? No. The text is understandable, but has many grammatical errors. The manuscript would greatly improve through language editing.

      Are all data available and do they match the descriptions in the paper? Yes. I did not check every single file, but all data I looked for I found to be publicly available. It would help if the "Availability of supporting data and materials" statement were a bit more comprehensive: Data A is deposited under X, Data B is deposited under Y-Z, instead of just providing the project ID.

      Are the data and metadata consistent with relevant minimum information or reporting standards? Yes. To the best of my knowledge. Not my area of expertise.

      Is the data acquisition clear, complete and methodologically sound? No. I was missing information about the transcriptomic data that were used for the annotation (line 44 says that RNA was extracted). Was RNA only extracted from the muscle? Maybe the caveats that go along with that should be discussed. How were the data processed? How many reads were obtained, etc.? I think the manuscript lacks information about this, unless I misunderstood where the data for the "transcript-based prediction" came from. In that case, this should be indicated more clearly.

      Is there sufficient detail in the methods and data-processing steps to allow reproduction? Yes. To the best of my knowledge. I am not an expert on this. Minor comments: l 46: Which protocol?

      Is there sufficient data validation and statistical analyses of data quality? Not my area of expertise. One thing that could additionally be done is to provide dot plots against 1-2 more closely related species with chromosome-level assemblies (probably Tilapia or Medaka).

      Is the validation suitable for this type of data? Yes. As far as I can judge the analysis is fine.

      Is there sufficient information for others to reuse this dataset or integrate it with other data? Yes. Genome and annotation are available, which is the most important for reuse and integration with other data sets. So as far as I can judge there is sufficient information for others.

      Any Additional Overall Comments to the Author: From my viewpoint, this is a useful chromosome-level genome, so I support its publication. Beyond its being a useful resource, I was however a bit disappointed by the 'scientific part' regarding the bi-color body formation. While the pigmentation of the bicolor angelfish is certainly a very exciting phenotype, the analysis performed is far too superficial to give any solid insights into the phenotype. I would suggest that the authors tone this down in the title, abstract and main text. It is fine to mention this as a future research direction and to state that the initial analysis performed (fig. 6 and 7) might aid these investigations, but the data do not permit further conclusions. Especially as GigaByte does not focus on analyses for biological findings, this should be completely sufficient.

      Recommendation: Major Revision

      Re-Review: I am happy with the changes made and thank the authors for addressing them. The manuscript is in my opinion acceptable for publication. Congratulations to the authors for providing a reference genome for this exciting fish species to the community.

    1. Horsegram

      Reviewer 2. Penghao Wang The authors present a paper describing new pseudo-chromosome draft genome sequences of the legume plant horsegram, together with some bioinformatics analyses based on the data. The presented assembly is of good quality and the bioinformatics analysis performed is sound. The resources made available by the study should prove valuable to researchers working on the plant and to the legume community as a whole. The paper is generally well written and I personally found it quite easy to follow. A few grammatical errors can be found. The bioinformatics methodology that has been utilised in the study is sound and the software used fits the goals of the study. However, the authors need to present more details on some analysis components, e.g. the parameter set used for the software, the version of the software, the OS, etc., so that the analysis can be better reproduced. For example, in the Methods section, line 76, the Jellyfish program was used to estimate the genome size, but the parameters, version and OS used to run the software were not mentioned. Line 78: for SOAPdenovo2, apart from Kmer, the most important parameter, what about the rest? SSPACE 2.0 was used for scaffolding; what were the insert sizes? Platanus, MaSuRCA, truSPAdes, RepeatMasker and AUGUSTUS all involve a number of parameters, and the details on how they were used need to be provided, because the results can be sharply different with different parameters. Some figures appear to have been created using tools, and these tools need to be acknowledged and referenced. For example, was Circos used to generate the circular plot in Fig 5? In addition, I could not find captions for any of the main figures.
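      To illustrate why the reviewer asks for parameters to be reported, the sketch below shows how a k-mer based genome size estimate can change with a single cutoff. It assumes a depth-to-count histogram of the kind k-mer counters produce and uses the common approximation of total k-mers divided by peak depth; the function name, the `min_depth` cutoff and the toy histogram are hypothetical, and this is not a description of how the authors ran Jellyfish.

```python
def estimate_genome_size(histogram, min_depth):
    """Estimate genome size from a k-mer depth histogram.

    `histogram` maps k-mer depth to the number of distinct k-mers seen at
    that depth. K-mers below `min_depth` are treated as sequencing errors
    and ignored. The estimate is total k-mers divided by the peak depth.
    """
    kept = {depth: count for depth, count in histogram.items()
            if depth >= min_depth}
    if not kept:
        raise ValueError("no k-mers left after applying min_depth")
    peak_depth = max(kept, key=kept.get)
    total_kmers = sum(depth * count for depth, count in kept.items())
    return total_kmers / peak_depth


# Hypothetical toy histogram: an error peak at depth 1-2 and a
# genomic peak around depth 20.
toy_histogram = {1: 1000, 2: 200, 19: 50, 20: 100, 21: 50}

print(estimate_genome_size(toy_histogram, min_depth=5))  # error k-mers removed
print(estimate_genome_size(toy_histogram, min_depth=1))  # error k-mers kept
```

      The two calls give very different answers from the same histogram, which is why the cutoff, the k-mer size and the software version need to be stated for the estimate to be reproducible.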

      Recommendation: Minor Revision

    2. Summary

      Reviewer 1. Tianzuo Wang Is the language of sufficient quality? It could be improved.

      Shirasawa et al. reported a chromosome-scale draft genome sequence of horsegram, and performed a comparative genomics analysis. 1. If PacBio data were used, the quality of the genome could be improved considerably. 2. Only the genomes of P. vulgaris, V. angularis and L. japonicus among the legumes were used for phylogenetic analysis. Soybean and Medicago, as the model legume plants, should be added at least. 3. The section "Whole genome structure in horsegram" should be introduced before "Diversity analysis in genetic resources", "Genes related to drought tolerance", and "Transcript sequencing, gene prediction and annotation", because genome information is the foundation of the other analyses.

      Recommendation: Major Revision

    1. Background

      Reviewer 2. Alun Li.

      Is the language of sufficient quality? Yes

      Is there a clear statement of need explaining what problems the software is designed to solve and who the target audience is? Yes

      Is the source code available, and has an appropriate Open Source Initiative license (https://opensource.org/licenses) been assigned to the code? No. Additional Comments: There is no license in the GitHub repository.

      As Open Source Software are there guidelines on how to contribute, report issues or seek support on the code? Yes. Github can be used to report issues or seek support on the code

      Is the code executable? Yes

      Is installation/deployment sufficiently outlined in the paper and documentation, and does it proceed as outlined? Yes

      Is the documentation provided clear and user friendly? Yes

      Is there a clearly-stated list of dependencies, and is the core functionality of the software documented to a satisfactory level? Yes

      Have any claims of performance been sufficiently tested and compared to other commonly-used packages? Yes

      Are there (ideally real world) examples demonstrating use of the software? Yes

      Is automated testing used or are there manual steps described so that the functionality of the software can be verified? Yes

      Additional Comments. Any Additional Overall Comments to the Author: The paper describes Atria, an ultra-fast and accurate trimmer for adapter and quality trimming, and compares it to several published tools. The tool is demonstrated to work on sequencing data with competitive accuracy and efficiency compared with existing tools.

      There are concerns that should be addressed:

      1. The performance comparisons listed in Table 2 show that, with quality trimming, Atria is not especially impressive compared with existing tools in terms of the percentage of properly paired reads and the number of unmapped reads. Also, it offers no more features than existing tools like Fastp, which may limit the widespread use of this software.

      2. I/O could be the main bottleneck on most hard-disk drives when performing adapter trimming on compressed input/output files. So, the wall time taken to run the different tools is also a good measurement. I wonder whether there is a significant advantage in performance if the runtime benchmark is measured by wall time.

      3. Can the algorithm deal with different lengths of adapter sequences? It would be good to test the performance of the tools with increasing lengths of adapter sequence.

      4. L79 states that Atria is compatible with single-end data from PacBio and Nanopore platforms, but there is no corresponding data in the paper to support the statement. Besides, the limitations of the byte-based matching algorithm make it difficult to deal with PacBio and Nanopore sequences with high insertion and deletion rates. If these limitations have been overcome, it is necessary to describe how in sufficient detail.

      5. It may be better if the description of this algorithm is presented in pseudocode, especially in the sections “Matching and scoring” and “Decision rules”.

      6. L165-L168: I don't quite understand why an adapter is considered an ideal adapter when the matching score is greater than 10. Also, why is the read pair not trimmed when the matching score is less than 19? Are there any reasons for the authors to set these two parameters to 10 and 19 respectively? In addition, it is necessary for the authors to demonstrate that the program is robust enough for different lengths of adapter sequences.

      7. All symbols in the paper should be clearly identified, e.g., L115 a1, L121.

      8. L135, “Because the matching algorithm requires much less time, we implement four pairs of matching to utilize properties of paired-end reads thoroughly”. The causation here does not hold.
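      The distinction drawn in concern 2 between CPU time and wall time can be shown with Python's standard `time` module. This is a minimal sketch, not the benchmark used in the paper: `measure`, `io_like_task` and `cpu_bound_task` are hypothetical names, and a short sleep stands in for waiting on disk I/O.

```python
import time


def measure(task):
    """Run task() once and return (wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()  # elapsed real time
    cpu_start = time.process_time()   # CPU time of this process only
    task()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu


def io_like_task():
    """Stand-in for a step that mostly waits on disk I/O."""
    time.sleep(0.05)


def cpu_bound_task():
    """Stand-in for a step that mostly computes."""
    sum(i * i for i in range(200_000))


for name, task in [("io-like", io_like_task), ("cpu-bound", cpu_bound_task)]:
    wall, cpu = measure(task)
    print(f"{name:9s} wall={wall:.3f}s cpu={cpu:.3f}s")
```

      For the I/O-like task the wall time is much larger than the CPU time, which is the gap a CPU-time-only benchmark would hide. Note that `time.process_time` does not include child processes, so timing an external trimmer would need a different mechanism, such as the operating system's own timing utilities.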

      Recommendation: Minor Revisions

    2. Abstract

      Reviewer 1. Xingyu Liao

      Is the language of sufficient quality? Yes

      Is there a clear statement of need explaining what problems the software is designed to solve and who the target audience is? Yes

      Is the source code available, and has an appropriate Open Source Initiative license (https://opensource.org/licenses) been assigned to the code? Yes

      As Open Source Software are there guidelines on how to contribute, report issues or seek support on the code? Yes

      Is the code executable? Unable to test

      Is installation/deployment sufficiently outlined in the paper and documentation, and does it proceed as outlined? Yes

      Is the documentation provided clear and user friendly? Yes

      Is there a clearly-stated list of dependencies, and is the core functionality of the software documented to a satisfactory level? Yes

      Have any claims of performance been sufficiently tested and compared to other commonly-used packages? No

      Are there (ideally real world) examples demonstrating use of the software? No

      Is automated testing used or are there manual steps described so that the functionality of the software can be verified? Yes

      Additional Comments

      Opinion: Author Should Prepare a Major Revision.

      In this paper, the authors proposed a trimming algorithm called Atria, which matches the adapters in paired-end reads and finds possible overlapped regions with a super-fast and carefully designed byte-based matching algorithm. Furthermore, Atria implements multi-threading in both sequence processing and file compression and supports single-end reads. The proposed algorithm has some significance in both theory and practical application. However, I still have some questions to discuss with the authors. The comments on the paper are as follows.

      (1) Major Comments:

      1) The authors highlight the fast and accurate characteristics of the proposed trimming algorithm in the title of the manuscript. However, most of the content in the manuscript and supplementary material serves to prove the advantages of the proposed algorithm in terms of speed, processing efficiency, and utilization of CPU and RAM. The assessment of trimming accuracy is very limited, and it seems that only general statistics are given in Table 2 of the manuscript. I personally think that the alignment rate of reads (or the number of paired-end reads) before and after trimming is not good proof of the accuracy of the trimming algorithm. What's more, judging from the experimental results in Table 2, the Atria algorithm does not have much advantage in accuracy compared to other methods. As the authors state in the abstract, sequence trimming is of great significance for SNP detection and sequence assembly. I very much hope to see how Atria improves these applications.

      2) The datasets used in this study seem to be unrepresentative, and most of them can be trimmed within a few to ten seconds. I think most users will not care about the difference between a few seconds and a dozen seconds. To prove the significant advantages of the proposed algorithm in terms of efficiency, some large-scale datasets (such as several samples sequenced in the 1000 Genomes Project) should be used.

      (2) Minor Comments:

      1) The display of Table 2 at line 562 is incomplete.

  5. Sep 2021
    1. Abstract

      Reviewer 1. Joon-Ho Yu Thank you for the opportunity to review this manuscript. Overall, I appreciate this argument for and description of Open Humans.  Broadly, the manuscript would benefit from greater attention to writing and organization. As my comments describe below, the "ethical analysis" offered is narrowly focused and appears to serve as a justification for the resource; yet, in its current state, I think the ethical analysis either should be removed or expanded. Ideally, the manuscript would be strengthened by a deepening and broadening of ethical considerations. Note that I use P(page)C(column)L(lines) to locate my comments for the authors.

      1. Abstract P1L36-37.  I am struck by the framing of this ethical problem as the responsibility of data subjects.  I assume this is intentional and would appreciate a little more, perhaps in the introduction, as to what is entailed in this responsibility?
      2. Abstract P1L42-43. I am not sure if the framing of the ethical problem is resolved by the description of the utility of Open Humans.  While overall, I suggest deepening the ethical problems presented, another alternative is to leave it out altogether.
      3. P2C2L6-9. It would help me if parties were more clearly stated.  I think you mean researchers not research and it isn't clear to me that commercial data sources have interests but rather the companies that hold these resources do, right?
      4. P2C2 Participant Involvement.  It is unclear to me what the purpose of this section is.
      5. P2C1 Data Silos. Most of the descriptive language is written in the passive voice which I understand may be the norm but in my opinion, it unintentionally highlights how interests and responsibilities are dissociated or dis-located from stakeholders.  For instance, in the section on Data Silos, it remains unclear for whom Data Silos are a problem and whose interests have created and maintained these silos.  Again, this sort of analysis might help identify or locate solutions rather than only set up a problem that Open Humans solves.  My point here is that the developers of Open Humans need not rely on a somewhat limited ethical analysis to justify its existence and argue for its utility.
      6. P2C1L44-49. While I agree this is an accurate reflection of the scope of literature, the issues raised by "big data" research now extend far beyond the common risks relayed in a consent process.
      7. P2C1L49-51. I agree that this is an important issue but this single statement citing Barbara Evans sounds a little like a strawman.  My sense is that through the efforts of many patient-driven organizations, patient and participant-driven research has increased a great deal in the past decade or so.  Perhaps this ought to be recognized especially given that many of the authors have been critical to the development of this movement.  Also, the next section on participant involvement seems at odds with the argument so some clarification might help readers understand the nuances.
      8. P2C2L53-61.  While I totally agree and appreciate these key points to the participant-centered approach to research, in all honesty, I did not come to these conclusions based on the above exposition.  I suggest moving this up as the scaffold for the introduction and reorganize based on these points.
      9. P3C1L30-36. These are the main points I think readers need in the introduction to help us understand the need for Open Humans.  I suggest you spend more time explaining these points and characterizing the evidence of these important assertions.
      10. P3C2L46-50. Could you explain the rationale behind this feature and briefly describe if more detailed information is conveyed about the IRB approval or review/determination?
      11. P4C2L25-27. This is an important statement, at least to me, but it would be helpful to reiterate how privacy is maintained, I'm assuming because it's pseudonymous?
      12. P4C2L27-30. Again, what are the simple requirements?
      13. P5C1L58-C2L59. So what are the ethical implications of this use case?  I think an important point to highlight is that privacy may be a nominal issue with members of efforts like Open Humans as they often have a greater than average interest in research benefits than maintaining individual privacy. Further, I'm under the impression that personal privacy is less of a concern for many or rather our sense of what is private is changing.  Assuming I'm understanding the argument, what I'm confused about is that the ethical analysis presented in the background assumes that privacy is of central perhaps even sole concern.  Also, there are many other ethical issues that Open Humans both addresses possibly in a positive way and potentially raises as risks to members and even society.  So, I would welcome that analysis alongside this nice introduction to the platform or I would not rest the argument for the platform on a relatively narrow ethical frame.
      14. P6C2L16-21. Do you mean the public data are being used as training sets for the algorithms?  Are there any risks of bias based on these sorts of uses?
      15. P6C1L44-45. So are there any ethical issues related to the application of OAuth2 to these particular use cases or overall?  This isn't a trick question, I have no idea but would encourage the authors to consider based on their expertise.
      16. P7C2L9-11. Agreed, but does it also make it harder for bad actors to use these data?  It would be great if the authors could help us think about this potential trade off.
      17. P7C1 Discussion. I would like the authors to consider the following in the discussion and possibly the introduction. (1) Given that most people who engage in citizen science in the biomedical research space are likely to subscribe to the value of openness and sharing of samples, data, tools, etc., I wonder if focusing on privacy as key ethical barrier is on target and sufficient.  For instance, many of the challenges to genomic research articulated by historically vulnerable populations have to do with offensive data uses, lack of control, lack of direct benefit, differential benefit based on SES, risks to groups, etc.  Again, a critical analysis of how this resource might increase or decrease such risks involved in citizen science would contribute to the larger project of extending citizen science or patient-led research to community-led research.  Of course, I understand this might be outside the bounds of this manuscript, but that need not preclude some consideration. (2) I very much appreciate Open Humans as a tool that addresses the practical problem of bridging/linking/aggregating.  I have no problems with this argument yet I wonder if it is somewhat naive to assume that bridging as a practical benefit does not also risk other ethical challenges.  For example, the ease of bridging to pre-selected resources blurs the line between simply linking resources and advancing particular interpretations of the data, in fact, one's own data.  If I understand Open Humans, it is a tool that automates protocols for linking and sharing intended to facilitate citizen science and patient-led research.  The practical benefits are clear. But what are the risks associated with more automated linking and sharing?
      18. P7C2 Enabling individual-centric research and citizen science. This section is very helpful and references a number of mechanisms that begin to address, at least on an individual level, issues such as "to what uses", "control", "governance", etc.  I would love to either see this description expanded and moved up into the initial description of the resource (maybe before or around P2C2L57) and/or these functional benefits better incorporated and explicated in the use cases.
      19. P8C1L13-16. It is unclear to me how it is "an ethical way" especially as it isn't clear to me what an "unethical way" would entail.   I think some pieces are presented but this argument could be much stronger and clearer.  I get that the benefits are assumed here to some extent, I've been in the same place when engaging in resource development, but perhaps a greater consideration of potential benefits and harms might help balance the focus on privacy and individual control.  Generally when we conduct ethical analysis we consider autonomy (where privacy sits), risks (as potential harms as well as increasingly benefits), and justice.  Notably, others might argue for other principles and values.  While such a comprehensive analysis isn't the focus of this manuscript, incorporating the insights of such an analysis would, in my opinion, strengthen the argument for Open Humans and signal/evidence robust consideration by its designers and authors.
    2. Background

      Reviewer 2. Birgit Wouters.

      In this paper, the authors have presented an innovative solution to the complex and multi-faceted problem of sharing personal (health) data. Open Humans, a community-based platform, serves multiple aims: (1) to be ethically justifiable: a. by focusing upon granular, individual consent for each single project, thereby avoiding the issue of compatible purposes for secondary/tertiary/... processing; b. by putting individuals in control of their personal dataset; and c. by involving them in the governance of the ecosystem; (2) to enable both academic and citizen-led research; and (3) to break open existing data silos and allow for the merging of datasets. Serving these aims simultaneously is undoubtedly ambitious. Yet, the authors have demonstrated how Open Humans is designed to do just that. The community-based platform has clearly been carefully designed, and the presentation of the design and the use cases is clear, well-written and easy to follow.

      Whilst Open Humans is an interesting and promising project, my comments center around the ethical justifiability of this community-based platform. Further clarification and/or elaboration on these comments is strongly recommended.

      One important goal of Open Humans is for research to be driven by the individuals the data come from by putting them into control of their data. The level of control is described as 'full control'. In addition, putting the participant into control of their data is regarded as important taking into account the more sensitive context of precision medicine. Under "Data Silos", the authors also mention that, next to other legislation, the General Data Protection Regulation is applicable and that the right of data portability has the potential to break open these silos. My main critique is that the article insufficiently takes into account the particularities of the General Data Protection Regulation.

      WHAT CONSTITUTES CONTROL?
Firstly, under the General Data Protection Regulation, the individual has the following rights: right to be informed, right of access, right to rectification, right to be forgotten, right to restriction of processing, right to data portability, the right to object and, albeit less relevant in this context, rights in relation to automated decision-making. Yet, in relation to scientific research, most Member States of the European Union allow for the right of access, the right to rectification, and the right to restriction of processing to be denied. The article very briefly mentions data access, within the context of human subjects research, to be recommended but not legally required. However, it does not make mention of the other two deniable rights (right to rectification + right to restrict processing). It leads to the first main question: what exactly constitutes control? How does Open Humans define control? The article mentions and describes a granular consent and privacy model. However, consent is important, but merely a legal basis for processing. How does Open Humans guarantee the other individual rights as granted by the General Data Protection Regulation? The right to information is shortly described on page 7, and so is the right of data portability, but, if full control is the desirable route, it means guaranteeing all rights granted. However, in the context of reproducibility of scientific research, granting all rights does not seem feasible. In particular, the right of rectification and the right to restrict processing seem problematic. Further clarification/elaboration on this issue is required. Is full control the route Open Humans wants to take, or is Open Humans implementing a limited control for the individual? Apart from granular consent, what other forms of control does Open Humans offer? 

      GRANULAR CONSENT IS DIFFERENT FROM SPECIFIC CONSENT

      The GDPR requires consent to be freely given, specific, informed and unambiguous (see article 7 and recital 32). Granular consent is needed when one service is involved with multiple processing operations for multiple purposes. In such a case, consent is required for every purpose of processing. This is referred to as granular consent. Whilst closely related, granular consent is therefore different from specific consent. However, in the context of Open Humans, it is doubtful that a situation will arise where one research project will process data for more than one purpose, and thus require granular consent. Research projects work on the basis of a specific research question and/or purpose.

      RIGHT TO DATA PORTABILITY IS LIMITED TO DATA PROVIDED BY THE INDIVIDUAL

      The right to data portability is regarded to have the potential to boost the adoption of a system where individuals can recollect and integrate their personal data from different sources, 'as it guarantees individuals in the European Union a right to export their personal data in electronic and other useful formats'. However, Article 20 of the GDPR limits the right to data portability to the personal data that the data subject himself/herself has provided to the controller. Data provided by the data controller do not fall under the scope of the right to data portability. The argument that the right to data portability can lead to the breaking up of the different data silos is therefore less convincing.

    1. This article is a preprint and has not been certified by peer review. Authors: John M. Sutton (1), Joshua D. Millwood (1), A. Case McCormack (1), Janna L. Fierst (1). Affiliation: (1) Department of Biological Sciences, The University of Alabama, Tuscaloosa, AL 35487-0344. For correspondence: janna.l.fierst@ua.edu

      This work has been peer reviewed in GigaByte (https://doi.org/10.46471/gigabyte.27), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer 1. Zhao Chen

      The authors should clarify why only Canu and Flye were selected instead of other long-read assemblers such as Raven, Redbean, Shasta, and Miniasm. Rationales should be given for why these two assemblers were selected. The same thing for MaSuRCA. It looks like you used MaSuRCA for hybrid assembly. Unicycler also contains a commonly used hybrid assembly pipeline. Therefore, you should also explain why MaSuRCA was selected for your study.

      A flow chart with all bioinformatics tools included should be provided to show more clearly how this entire study was carried out, including assembly, error correction, and analysis after assembly.

      More information about the quality of long reads should be provided, such as Phred quality scores, percentage of reads with Q30 or above, and average read lengths. QUAST should suffice for these quality analyses.

      Only testing simulated reads is not sufficient for making a solid conclusion since simulated reads cannot be treated as being equal to real reads or reflect basecalling errors in real reads. Since real reads are readily available on NCBI, real reads should also be tested. As your title didn't mention anything about the fact that this study was solely based on testing simulated reads and your objective was to optimize the bioinformatic pipeline for processing Oxford Nanopore long reads, the experiments should be performed by including all conditions. Accordingly, real reads should also be tested, which could significantly improve the scientific quality of this study.

      Line 12-13: This may not be true, since many studies have been published on how to assemble and error-correct Oxford Nanopore long reads to produce accurate genomes. The authors should describe why the present study is novel and what new findings were reported.
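      The read-quality summary requested above (Phred scores, share of reads at Q30 or above, average read length) could be computed along the following lines. This is a minimal sketch, not the authors' pipeline and not a description of how QUAST works. It assumes an uncompressed FASTQ file with four lines per record and Phred+33 quality encoding; the function names `read_fastq` and `summarize` are hypothetical. It takes the arithmetic mean of the per-base Phred scores as each read's quality, which is only one of several possible conventions.

```python
import io
from statistics import mean


def read_fastq(handle):
    """Yield (sequence, quality_string) from a 4-line-per-record FASTQ handle."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # skip stray blank lines between records
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line
        qual = handle.readline().strip()
        yield seq, qual


def summarize(handle, threshold=30, offset=33):
    """Summarize read lengths and per-read mean Phred scores (Phred+33 assumed)."""
    lengths, read_means = [], []
    for seq, qual in read_fastq(handle):
        scores = [ord(ch) - offset for ch in qual]
        lengths.append(len(seq))
        read_means.append(mean(scores))
    if not lengths:
        raise ValueError("no reads found")
    passing = sum(1 for m in read_means if m >= threshold)
    return {
        "reads": len(lengths),
        "mean_length": mean(lengths),
        "mean_phred": mean(read_means),
        "pct_reads_ge_threshold": 100.0 * passing / len(lengths),
    }


# Two toy reads: one with all bases at Q40 ('I'), one with all bases at Q10 ('+').
example = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGTAC\n+\n++++++\n")
print(summarize(example))
```

      Reporting these figures for both simulated and real reads would make it easier to judge how closely the simulated data resemble real Oxford Nanopore output.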

      Recommendation: Major Revisions

      Reviewer 2. Shanlin Liu

      De novo genome assembly based on third-generation sequencing (ONT in the current work) has been widely applied to many organisms, including bacteria, plants and animals with various genome sizes, e.g. the two recently published lungfish genomes (genome size of > 30 Gb) in Nature and Cell, and genomes of a broad range of species published in GigaScience, Scientific Data, Molecular Ecology Resources, etc. It is pretty easy to find the analysis pipelines or datasets that were used to obtain a high-quality genome assembly in those published works. The authors generated multiple genome assemblies for four model species using different simulated datasets with varied sequencing depths and different assembly tools, and tried to provide useful guidance for those who are new to genome assembly. However, I am afraid that the current study contains some limitations in the results and conclusions that may mislead readers, and I do recommend that the authors restructure the manuscript and address those issues before its publication.

      First of all, a routine practice of genome assembly with long reads (either ONT or PacBio) includes a polishing step based on the long reads themselves, using tools like Nanopolish, Medaka, Racon, etc. The authors skipped this step in all of their analyses and directly evaluated the assembly errors based on the outputs generated from different combinations of datasets and software. Whatever the results show, such an evaluation has little practical value.

      Secondly, the four model species included in the current work can hardly represent a broad range of organisms: all have a genome size < 200 Mb and a low level of repetitive elements (< 30%). Hence, the results of the current work offer scant guidance to those who work on organisms like plants, fishes, insects, mammals, etc. For example, computing resources become the first hurdle for genome assembly when working on > 100X ONT reads for species with large genomes, even if you can afford the sequencing. So, researchers would generate less data or prefer assemblers like WTDBG, NextDenovo, or Falcon to obtain their genome assembly.

      In addition, the authors deem the Caenorhabditis species to have a highly heterozygous genome (0.7% according to their calculation), which is also open to question. The genomes of many insects and plants have a much higher level of heterozygosity.

      What's more, the authors may want to pay attention to the news regarding the Sequel II sequencing platform recently released by PacBio. As far as I know, it can provide inexpensive long-read sequencing thanks to its huge improvement in sequencing throughput. It also has a new release of a library preparation kit that can work on low amounts of input DNA. If so, what you stated in the introduction section may be incorrect.

      Besides the major issues mentioned above, there are some other minor ones, listed as follows:

      Line 89. The authors may want to provide common names of those model species to improve readability of the manuscript.

      Line 119. Genome references and ONT reads were derived from different individuals or strains, and there are very low coverage ONT reads for E. coli. I am not sure whether those factors will influence the quality of simulation or not. The authors may add a caution to clarify these concerns.

      Line 24. A combination of experimental techniques? It is better to specify which experimental techniques.

      Line 128. Incorrect word format, and C. latens is missing.

      Line 141. How do you define the best performance: the most contiguous assembly?

      Line 137. When you say "failed to produce an assembly", does it mean that the software failed to generate outputs, or that it produced unexpected assembly results?

      Line 287. Supplement the BUSCO value of the reference TAIR10.

      Line 287. What do you mean by "combined approach"? Do you mean the method that corrects reads using Canu and assembles them using FLYE?

      Line 233 – 241. The "corrected" and "selected" datasets used in the Nematoda test were not applied to other organisms.

      Line 241. Canu correction could truncate some low-quality reads or cut long reads into multiple pieces for suspected chimeric reads. I don't think you can reach a conclusion that read length influences assembly quality using the current results.

      Line 242. Please rephrase this sentence and put Figure 5 and reference #36 in better positions to avoid misunderstanding.

      Line 337 – 341. This duplicates the content of lines 308 – 312, and the two passages conflict with each other.

      Line 355. All the tested organisms have genome sizes < 200 Mb; please specify this limitation instead of saying a broad range of organisms.

      Line 368. "Low coverage" may mislead readers; the authors cannot reach such a conclusion based on merely one single test.

      Line 461. Which model was used: high accuracy or flip-flop?

      Table 1. The header is too long; some of the content could be moved to table notes.

      Recommendation: Major Revisions

    1. This article is a preprint and has not been certified by peer review. Authors: Sherry Miller (1, 2), Teresa D. Shippy (1), Prashant S Hosmani (3), Mirella Flores-Gonzalez (3), Wayne B Hunter (4), Susan J Brown (1), Tom D’elia (5), Surya Saha (3, 6). Affiliations: (1) Division of Biology, Kansas State University, Manhattan, KS 66506; (2) Allen County Community College, Burlingame, KS 66413; (3) Boyce Thompson Institute, Ithaca, NY 14853; (4) USDA-ARS, U.S. Horticultural Research Laboratory, Fort Pierce, FL 34945; (5) Indian River State College, Fort Pierce, FL 34981; (6) Animal and Comparative Biomedical Sciences, University of Arizona, Tucson, AZ 85721. For correspondence: ss2489@cornell.edu

      This work has been peer reviewed in GigaByte (https://doi.org/10.46471/gigabyte.26), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer 1. Hailin Liu The manuscript does not appear to offer much sound biological insight; further validation or comparative study is suggested in order to draw more meaningful conclusions.

      Reviewer 2. Mary Ann Tuli Please add additional comments on language quality to clarify if needed : The manuscript reads very well.

      Are all data available and do they match the descriptions in the paper? No. Additional Comments: 1) Line 206. "Multiple alignments were performed with MUSCLE or MEGA7" (figure 1). We need the output of MUSCLE (FASTA). We need the output of MEGA7 (FASTA).

      2) I note that MEGA7 has been used. I wonder why the newer release (MEGAX, March '21) was not used. Furthermore, the annotation protocol (dx.doi.org/10.17504/protocols.io.bniimcce) suggests using Mega7 or MegaX.

      3) Line 207. "phylogenetic analysis was done in MEGA7 or MEGA X" (figure 2). We need the files underlying the phylogenetic tree (newick) (figure 2).

      Are the data and metadata consistent with relevant minimum information or reporting standards? Yes. Nomenclature standards have been met. All cited INSDC accession numbers are publicly available.

      Is the data acquisition clear, complete and methodologically sound? Yes. The curation workflow used for community annotation is available via protocols.io; nonetheless, the manuscript includes a comprehensive summary, which is appropriate.

      Is there sufficient detail in the methods and data-processing steps to allow reproduction? No. See "Are all data available and do they match the descriptions in the paper?" above. Once the additional files are made available I believe reproduction will be possible.

      Is there sufficient data validation and statistical analyses of data quality? Yes

      Is the validation suitable for this type of data? Yes

      Is there sufficient information for others to reuse this dataset or integrate it with other data? Yes

      Any Additional Overall Comments to the Author: Some of my comments/recommendations are pertinent to the other D. citri manuscripts currently under review.

    1. Now published in Gigabyte doi: 10.46471/gigabyte.25

      This work has been peer reviewed in GigaByte (https://doi.org/10.46471/gigabyte.25), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer 1. Xingtan Zhang and Yingying Gao

      The manuscript entitled “Characterization of chitin deacetylase genes in the Diaphorina citri genome” has an interesting subject. However, the general organization and content of the manuscript are not well developed and need extensive revision. It seems to me that the introduction does not offer enough useful information or background knowledge for its readers. Some sentences are confusing (for example, see line 30).

      In the "Materials and Methods" section, lines 137-139: which software was used, or were both used, given that there is only one tree plot? Why were these species chosen?

      Some useful experiments targeting the genes need to be conducted. The RNA-seq data should be confirmed by other experimental methods to support your results or conclusions. The results and conclusions are not very informative, and a discussion is missing. The manuscript feels more like an informal work report than a research article.

      Figures and tables also need extensive revision. The figures and tables are important, but their titles could be more descriptive and clear. The notes should be separated from the titles to make them more readable. In particular, a note should explain useful information in the figure instead of repeating the method. Make sure the formats of all tables are consistent.

      The references could be improved (increase the number of references). There are many formatting mistakes; please check. It is essential to follow the basic rules of referencing; please review all references accordingly.

      Reviewer 2. Mary Ann Tuli Is the language of sufficient quality? Yes. 1) Line 27. "Genomic" should be "genomic"

      Are all data available and do they match the descriptions in the paper? No. Additional Comments: 1) Line 137. "Multiple alignments were performed using MUSCLE" We need the output of MUSCLE (FASTA).

      2) Line 137. "phylogenetic trees were constructed (figures 1 and 3) in MEGA7 or MEGAX." We need the files underlying the phylogenetic tree (newick). Please indicate which version of MEGA was used for each tree.

      3) Line 139. "Expression data from CGEN was visualized using the pheatmap package of R or Microscoft Excel" Please can you provide a file of the raw data used to produce the heatmap (figures 2a ) and the TMP graph (figure 2b).

      Are the data and metadata consistent with relevant minimum information or reporting standards? Yes. Nomenclature standards have been met. All cited INSDC accession numbers are publicly available.

      Is the data acquisition clear, complete and methodologically sound? Yes. The curation workflow used for community annotation is available via protocols.io; nonetheless, the manuscript includes a comprehensive summary, which is appropriate.

      Is there sufficient detail in the methods and data-processing steps to allow reproduction? No. See "Are all data available and do they match the descriptions in the paper?" above. Once the additional files are made available I believe reproduction will be possible.

      Is there sufficient data validation and statistical analyses of data quality? Yes

      Is the validation suitable for this type of data? Yes

      Is there sufficient information for others to reuse this dataset or integrate it with other data? Yes

      Any Additional Overall Comments to the Author: 1) Citation [23] MUSCLE. https://www.ebi.ac.uk/Tools/msa/muscle/.

      • the website suggests users of MUSCLE cite DOI:10.1093/nar/gkz268

      Some of my comments/recommendations are pertinent to the other D. citri manuscripts currently under review.

    1. This article is a preprint and has not been certified by peer review. Authors: Sherry Miller (1, 2), Teresa D. Shippy (1), Blessy Tamayo (3), Prashant S Hosmani (4), Mirella Flores-Gonzalez (4), Lukas A Mueller (4), Wayne B Hunter (5), Susan J Brown (1), Tom D’elia (3), Surya Saha (4, 6). Affiliations: 1 Division of Biology, Kansas State University, Manhattan, KS 66506; 2 Allen County Community College, Burlingame, KS 66413; 3 Indian River State College, Fort Pierce, FL 34981; 4 Boyce Thompson Institute, Ithaca, NY 14853; 5 USDA-ARS, U.S. Horticultural Research Laboratory, Fort Pierce, FL 34945; 6 Animal and Comparative Biomedical Sciences, University of Arizona, Tucson, AZ 85721. For correspondence: ss2489@cornell.edu

      This work has been peer reviewed in GigaByte (https://doi.org/10.46471/gigabyte.23), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer 1. Hailin Liu. Is there sufficient data validation and statistical analyses of data quality? No. The validation work is not revealed in the manuscript, such as the qRT-PCR experiment.

      Is the validation suitable for this type of data? No. More validation work should be added instead of the RNA-seq data from the public database.

      Is there sufficient information for others to reuse this dataset or integrate it with other data? Yes

      Any Additional Overall Comments to the Author: Formatting errors should be corrected, including in the tables and in the text alignment. The introduction and methods seem too simple for readers. More of the biological meaning should be explained in the manuscript. A basic assessment of the genome used should be added.

      Recommendation: Major Revision

      Reviewer 2. Mary Ann Tuli. Is the language of sufficient quality? Yes. The manuscript reads very well.

      Are all data available and do they match the descriptions in the paper? No. Additional Comments: 1) Line 149. "Multiple alignments of the predicted D. citri proteins and their insect homologs were performed using MUSCLE" We need the output of MUSCLE (FASTA).

      2) Line 151. "Phylogenetic trees were constructed (figures 1 and 4) using full-length protein sequences in MEGA7 or MEGAX." We need the files underlying the phylogenetic tree (newick). Please indicate which version of MEGA was used for each tree.

      3) Line 152. Gene expression levels were obtained from the Citrus greening Expression Network and visualized using Excel and the pheatmap package in R. Please can you provide a file of the raw data used to produce the heatmap (figure 2) and the Expression levels of UAP2 in male and female tissues (figure 5).

      Are the data and metadata consistent with relevant minimum information or reporting standards? Yes. Nomenclature standards have been met.

      All cited INSDC accession numbers are publicly available.

      Is the data acquisition clear, complete and methodologically sound? Yes. The curation workflow used for community annotation is available via protocols.io; nonetheless, the manuscript includes a comprehensive summary, which is appropriate.

      Is there sufficient detail in the methods and data-processing steps to allow reproduction? No. See "Are all data available and do they match the descriptions in the paper?" above. Once the additional files are made available I believe reproduction will be possible.

      Is there sufficient data validation and statistical analyses of data quality? Yes

      Is the validation suitable for this type of data? Yes

      Is there sufficient information for others to reuse this dataset or integrate it with other data? Yes

      Any Additional Overall Comments to the Author: 1) Line 147. Apollo version should be included in the other D. citri manuscripts.

      2) Citation [26] MUSCLE. https://www.ebi.ac.uk/Tools/msa/muscle/.

      • the website suggests users of MUSCLE cite DOI:10.1093/nar/gkz268

      Some of my comments/recommendations are pertinent to the other D. citri manuscripts currently under review.

    1. This article is a preprint and has not been certified by peer review. Authors: Chad Vosburg (1, 2), Max Reynolds (1), Rita Noel (1), Teresa Shippy (3), Prashant S Hosmani (4), Mirella Flores-Gonzalez (4), Lukas A Mueller (4), Wayne B Hunter (5), Susan J Brown (3), Tom D’Elia (1), Surya Saha (4, 6). Affiliations: 1 Indian River State College, Fort Pierce, FL 34981; 2 Department of Plant Pathology and Environmental Microbiology, The Pennsylvania State University, University Park, PA 16802; 3 KSU Bioinformatics Center, Division of Biology, Kansas State University, Manhattan, KS; 4 Boyce Thompson Institute, Ithaca, NY 14853; 5 USDA-ARS, U.S. Horticultural Research Laboratory, Fort Pierce, FL 34945; 6 Animal and Comparative Biomedical Sciences, University of Arizona, Tucson, AZ 85721. For correspondence: ss2489@cornell.edu

      This work has been peer reviewed in GigaByte (https://doi.org/10.46471/gigabyte.21), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer 1. Xingtan Zhang and Dongna Ma

      The manuscript by Vosburg et al. systematically analyzes the characteristics of the Wnt signaling genes in Diaphorina citri, focusing on evolutionary history, expression patterns and potential function. The authors also performed manual annotation of the Wnt signaling pathway. This work adds an important resource for the study of the evolutionary history of D. citri and of Wnt signaling in this important hemipteran vector. The writing is acceptable. Nevertheless, I have some suggestions that may improve the manuscript.

      1. In the methods, the authors describe the process of identifying Wnt genes, but the abstract describes it as curation. I am confused as to whether the Wnt signaling genes in D. citri were identified by the authors, or whether the authors only further analyzed genes already identified by others.
      2. The paper covers only the identification of the Wnt genes, their evolution, and expression analysis using RNA-seq. It is recommended to also look at chromosomal localization and mode of origin (e.g., tandem repeats).
      3. The Wnt signaling genes studied by the authors in this hemipteran vector could be further verified by qPCR, and their expression and function then compared with those of related genes published for other insects, for discussion.

      Major Revision.

      Reviewer 2. Mary Ann Tuli. Is the language of sufficient quality? Yes. The manuscript reads very well.

      Are all data available and do they match the descriptions in the paper? No. Additional Comments: 1) Line 176. "High scoring MCOT models were then searched on the NCBI protein database...." We need the list of Wnt pathway genes with high scoring MCOT models.

      2) Line 178. "The high scoring MCOT models that had promising NCBI search results were used to search the D. citri assembled genome." We need the list of high scoring MCOT models which had promising NCBI search results.

      3) Line 179. "Genome regions of high sequence identity to the query sequence were investigated within JBrowse" We need the list of models with high sequence identity with the assembled genome.

      4) Line 184. "MUSCLE multiple sequence alignments of the D. citri gene model sequences and orthologous sequences were created through MEGA7" We need the output of MUSCLE (FASTA). We need the files underlying the phylogenetic tree (newick).

      5) I note that MEGA7 has been used. I wonder why the newer release (MEGAX, March '21) was not used. Furthermore, the annotation protocol (dx.doi.org/10.17504/protocols.io.bniimcce) suggests using Mega7 or MegaX.

      Instructions on how to upload these files are given under "Any Additional Overall Comments to the Author".

      Are the data and metadata consistent with relevant minimum information or reporting standards? Yes. Nomenclature standards have been met. All cited INSDC accession numbers are publicly available.

      Is the data acquisition clear, complete and methodologically sound? Yes. The curation workflow used for community annotation is available via protocols.io; nonetheless, the manuscript includes a comprehensive summary, which is appropriate.

      Is there sufficient detail in the methods and data-processing steps to allow reproduction? No. See "Are all data available and do they match the descriptions in the paper?" above. Once the additional files are made available I believe reproduction will be possible.

      Is there sufficient data validation and statistical analyses of data quality? Yes

      Is the validation suitable for this type of data? Yes

      Is there sufficient information for others to reuse this dataset or integrate it with other data? Yes

      Any Additional Overall Comments to the Author: Some of my comments/recommendations are pertinent to the other D. citri manuscripts currently under review.

      Minor Revision.

    1. This article is a preprint and has not been certified by peer review.

      This work has been peer reviewed in GigaByte (https://doi.org/10.46471/gigabyte.20), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer 1. Feng Cheng. Crissy and co-authors annotated yellow genes in the genome of Diaphorina citri, the vector of Huanglongbing disease in citrus plants. The results are useful for closely related research areas. I have some comments to help the authors improve the manuscript.

      1. The sections of introduction and background can be merged into one introduction section.

      2. Many sentences in the results section can be moved to the methods section.

      3. The methods section should be rewritten and re-organized as each analysis per paragraph.

      4. Some domain analysis and figures may be helpful for illustrating the evolution of important yellow genes in different insect species.

      Reviewer 2. Mary Ann Tuli. Is the language of sufficient quality? Yes. Line 18: 'in-planta' should be 'in planta' (in italics).

      Are all data available and do they match the descriptions in the paper? No. Additional Comments: 1) Line 224. "The MCOT protein sequences were used to search the D. citri genomes" We need the list of MCOT protein sequences that were used.

      2) Line 228. "A neighbor-joining phylogenetic tree of D. citri yellow protein sequences along with was created in MEGA version 7 using the MUSCLE multiple sequence alignment" a) Along with what? There are some words missing. b) We need the output of MUSCLE (FASTA). c) We need the files underlying the phylogenetic tree (newick).

      3) I note that MEGA7 has been used. I wonder why the newer release (MEGAX, March '21) was not used. Furthermore, the annotation protocol (dx.doi.org/10.17504/protocols.io.bniimcce) suggests using Mega7 or MegaX.

      4) Line 233. "Comparative expression levels of yellow proteins throughout different life stages (egg, nymph, and adult) in Candidatus Liberibacter asiaticus (Clas) exposed vs. healthy D. citri insects was determined using RNA-seq data and the Citrus Greening Expression Network (http://cgen.citrusgreening.org)." Results are presented in Fig 3(a) and Fig 3(b). We need the raw data underlying these figures.

      Are the data and metadata consistent with relevant minimum information or reporting standards? Yes. Nomenclature standards have been met. All cited INSDC accession numbers are publicly available.

      Is the data acquisition clear, complete and methodologically sound? Yes. The curation workflow used for community annotation is available via protocols.io; nonetheless, the manuscript includes a comprehensive summary, which is appropriate.

      Is there sufficient detail in the methods and data-processing steps to allow reproduction? No. See "Are all data available and do they match the descriptions in the paper?" above. Once the additional files are made available I believe reproduction will be possible.

      Is there sufficient data validation and statistical analyses of data quality? Yes

      Is the validation suitable for this type of data? Yes

      Is there sufficient information for others to reuse this dataset or integrate it with other data? Yes

      Any Additional Overall Comments to the Author: Citation [39] is not complete. It should be MCOT protein database.

      Some of my comments/recommendations are pertinent to the other D. citri manuscripts currently under review.

    1. Abstract

      This work has been peer reviewed in GigaByte, which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer 1: Aaron Shafer Is there sufficient detail in the methods and data-processing steps to allow reproduction?

      No. I would include all flags for assemblies even if default; unclear how the 10x + Illumina data were integrated (if at all) - see comments below.

      Is there sufficient data validation and statistical analyses of data quality? Yes. I suppose BUSCO and gene number is a form of validation.

      Is there sufficient information for others to reuse this dataset or integrate it with other data? No. See comment below; while the short-read data are great, I would likely reassemble the genomic resource for a variety of reasons outlined in Additional Comments.

      Any Additional Overall Comments to the Author: The paper is well written, and I have no comments about the content; well done here. My main concern lies with the genome resources, and in this case I would likely use the raw data rather than the assemblies provided. I offer my rationale and suggestions:

      My lab was heavily pushed by a colleague towards the use of Meraculous in our short-read assembly of mammal genomes ( https://jgi.doe.gov/data-and-tools/meraculous/ ); this is because it is really designed for short-read assemblies of big genomes (i.e. no addition of mate-pair) AND it performs very well in the Assemblathon metrics https://academic.oup.com/gigascience/article/2/1/2047-217X-2-10/2656129 (notably in Figures 16-18 you start to see clear differences between Meraculous and, say, SOAPdenovo). Thus, for just the Illumina data, I would very much like to see a more appropriate assembly explored, as stats like N50 and number of scaffolds will likely improve considerably with the appropriate methods.

      Likewise, it is very unclear in the methods how M. r. arvicoloides was assembled: I see SUPERNOVA for the 10X data (great), and probably SOAPdenovo for the Illumina data (see above). But how were they combined? This sequencing strategy is really designed for a hybrid assembly (see for example DBG2OLC https://github.com/yechengxi/DBG2OLC); this is appropriate for 10X data and really does work! But there are others.

      Note that M. agrestis, which has an identical sequencing strategy to M. r. arvicoloides, has only ~3% of the total scaffolds; follow whatever they did! And I will say, while the authors state their genome is on par with other Microtus, and this appears true by Table 3, only M. agrestis currently has an assembly that I think is at current standards. The level of fragmentation and low BUSCO scores really support revisiting the assembly suggestions, as I think the current .fasta will be of limited utility in a population or comparative genomics study.

      The gene number is pretty high for a mammal and I worry that is due to fragmentation. It would be reasonable to annotate only scaffolds >10 kb or >50 kb, but then there is not much of a genome left. Ideally the bulk of your genome (>>90%) would fall on these scaffolds. There is really no sense in annotating your small fragments (have you tested for contamination? Note NCBI will do this before allowing the assembly to be deposited, so I suggest it).

      You also align your data to the mt genome; this is different from assembling it. You could assemble it (e.g. https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-017-1927-y), and it might be interesting to see if there are any differences.

      I wish I could be more positive; an assembly with a tool like Meraculous would take a week or so, and so would the hybrid approach, but it would be worth it based on my experience with these data.

      Recommendation: Major Revision

      Reviewer 2. Joana Damas.

      Any Additional Overall Comments to the Author: The genomes presented in this work will be extremely valuable tools for Microtus-related research. The manuscript is very clear and easy to follow. I have, however, a couple of comments that I hope will further improve it.

      (1) Line 123: I believe more details are needed on the measures used for the selection of the best M. r. macropus assembly. Even though the contiguity of the Discovar genome assembly is higher than that of the assemblies generated with SOAPdenovo, the BUSCO score is relatively low (54.5% versus 84% in M. r. arvicoloides, e.g.). Were the BUSCO scores for the other assemblies even lower? Is the Discovar assembly size closer to the estimated genome size?

      (2) Line 131/251: Was there any genome structure verification step for the M. montanus genome assembly? For instance, which percentage of the Illumina reads could be mapped back to the finished genome assembly?

      (3) Line 131/251: Was there a reason not to use a published reference-guided assembly method (e.g. RaGOO and those listed therein) for the assembly of the M. montanus genome? These could perhaps further improve the assembly or help identify misassemblies.

      (4) Line 180: The large difference between BUSCO scores for the two M. richardsoni subspecies makes me believe that the completeness of the genomes is quite different, that the fraction of the genome within repeats might be underrepresented in M. r. macropus, and that the subspecies values might be closer than noted here. It is, however, difficult for non-experts such as myself to infer phylogenetic relatedness from Fig. 1 for the other species. It would be helpful to have a phylogeny next to the graph showing species relationships.

      (5) Please verify Tables 1 and 2. The statistics presented for M. r. macropus do not match for N50 and longest scaffold size.

      Recommendation: Minor Revision

    1. This article is a preprint and has not been certified by peer review. Authors: Jaclyn Smith (1), Yao Shi (1), Michael Benedikt (1), Milos Nikolic (2). Affiliations: 1 University of Oxford; 2 University of Edinburgh. For correspondence: jaclyn.smith@cs.ox.ac.uk

      This work has been peer reviewed in GigaScience, which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer 1: JianJiong Gao

      In this manuscript, the authors introduced a tool named TraNCE for distributed processing and multimodal data analysis. While the topic and tool are interesting, the writing can be improved. The current manuscript reads more like a technical manual than a scientific paper.

      For example, in the background, the discussion on data modeling in the contexts of multi-omics analysis and distributed systems is extensive, but the writing can be better organized. The examples are helpful, but they are very technical and can be hard to follow. It would be good if the main challenges can be summarized on a high level. It might also be useful to have an example analysis use case to lead the technical discussion on data modeling.

      It is also unclear who the targeted users of the tool are and why distributed computing is needed. For example, in applications 1 and 2, it is unclear why distributed computing is necessary.

      Reviewer 2. Umberto Ferraro Petrillo First review:

      The authors propose a new framework, called TraNCE, for automating the design of distributed analysis pipelines over complex biomedical data types. They focus on the problem of unrolling references between different datasets (which can be very large), assuming that these datasets contain complex data types consisting of structured objects containing collections of other objects. With TraNCE, it is possible to formulate queries over collections of nested data using a very high-level declarative language. TraNCE then translates these queries into Apache Spark applications that implement them in an efficient and scalable way. Apart from a quick description of the TraNCE framework and of the declarative language it supports, the paper also includes a vast collection of examples of multi-omics analyses conducted using TraNCE on real-world data.

      I found the contribution proposed by this paper to be very timely. Indeed, public multi-omics databases are flourishing, but their huge volumes make their analysis difficult and very expensive if not approached with the right methodologies. Distributed analysis frameworks like Spark can help, but they are often not easy to master, especially for those without deep distributed programming skills. So, TraNCE looks like a much-needed contribution on this topic.

      However, I have some remarks. The high-level querying language supported by TraNCE is not original because, as far as I understand, it was presented in a previous paper [1] (which was written by almost the same authors and is correctly referenced in this submission). Even the TraNCE framework is not completely original, because its name appears as the name of the project containing the code presented in [1]. Finally, at least one of the experiments presented in [1] seems to have been run on the same Hadoop installation used for the experiments presented in the current submission, and involved the same datasets from the International Cancer Genome Consortium. So, I am a bit confused about what is original in this new submission and what has been borrowed from [1]. My advice is to definitely clarify this point.

      Another issue that I think should be addressed is the scalability of the proposed framework. The authors state that the framework supports scalable processing of complex data types; however, no evidence is provided for this claim. The several experiments reported seem to focus more on the expressiveness of the proposed language, while no experiment is provided on the scalability of the generated code when run on a computing architecture of increasing size. I think we may agree that using Spark does not mean that your code is scalable, nor do I think it is enough to say that the scalability of TraNCE has been proved in [1]. So, I would suggest elaborating on this as well.

      To be honest, I am a bit skeptical about the practical performance of the standard compilation route. I think that when applied to very large datasets it is likely to return huge RDDs that could require very long processing times. The shredded compilation route, by contrast, looks much cleverer to me. Could you elaborate further on this difference, especially in light of the results of your experiments? I also disagree with your choice not to describe how data skewness is dealt with in your framework. It is indeed one of the main causes of bad performance in many distributed applications, so it would be interesting to know how you managed this problem in your particular case.

      On the bright side, I really appreciated the flexibility of the proposed framework, as witnessed by the vast number of examples provided, as well as its positive implications for the analysis of multi-omics databases.

      Finally, the English of the manuscript is very good and I have not been able to find any typos so far.

      [1] Jaclyn Smith, Michael Benedikt, Milos Nikolic, and Amir Shaikhha. 2020. Scalable querying of nested data. Proc. VLDB Endow. 14, 3 (November 2020), 445-457.

      Re-review: I appreciated the robust revision done by the authors and think the paper is now ready to be published.

  6. Aug 2021
    1. ABSTRACT

      This work has been peer reviewed in GigaByte (https://doi.org/10.46471/gigabyte.16), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer 1. Inge Seim. Is the language of sufficient quality? No. The authors need to polish their English further. This is particularly obvious in the Abstract and is likely to result in an unwarranted lower readership of the work.

      Are all data available and do they match the descriptions in the paper? Yes. I want to commend the authors for sharing data and associated code.

      Is there sufficient data validation and statistical analyses of data quality? Not my area of expertise.

      Any Additional Overall Comments to the Author:

      • R2 should be R^2 (that is, please superscript the '2').

      • The sentence 'Further comparison between sequencing platforms would be useful for for exploration using as similar amplification conditions as possible. This data being provided as one such benchmark' at the end of Results is vague and needs to be rewritten.

      • You need to state more clearly that you do not recommend combining MGI and Illumina data sets for metabarcoding, unlike e.g. BGISEQ-500 and Illumina RNA-seq/short-insert WGS data, which can be readily combined.

      Recommendation: Minor Revision

      Reviewer 2. Petr Baldrian. Are all data available and do they match the descriptions in the paper? No. I was not able to locate the items listed as references (26) and (27). Due to this, I was not able to fully evaluate the paper.

      Are the data and metadata consistent with relevant minimum information or reporting standards? No. I was not able to locate the data, see above.

      Is the data acquisition clear, complete and methodologically sound? No. More details on sampling (mode of sampling, area sampled, depth sampled, sample size, sample handling) are needed. Information on the number of repeated DNA extractions and the size of the sample used for extraction is missing. Protocols of amplification and barcoding are referenced as (27), but I was not able to locate this reference. These details have to be provided in the text for both types of sequencers.

      Is there sufficient detail in the methods and data-processing steps to allow reproduction? Yes. For fungal ITS, the ITS region should be extracted before annotation.

      Is there sufficient data validation and statistical analyses of data quality? No. The authors do not report how they deal with sequences of fungi that produce amplicons longer than 350 bases, which cannot be joined as paired-end reads in the 2x200 base runs. Even the MiSeq 2x250 runs miss some fungal taxa (though not very many), and here the situation is worse still. For the length distribution of fungal ITS, please consult the UNITE database.

      Is the validation suitable for this type of data? No. There should be additional validations including the analysis of those OTUs that are abundant in one setup but missing in another one (if any).

      Is there sufficient information for others to reuse this dataset or integrate it with other data? No. The metadata, supposedly in reference (26), are impossible to locate.

      Any Additional Overall Comments to the Author: I believe that this is a very good attempt to test the novel platform with fungal metabarcoding. If all required information is provided, I believe that this can be both an interesting paper and a valuable dataset.

      Recommendation: Reject (Unsound or Unusable)

      Reviewer 2. Re-review. I have now carefully read the revised version of this manuscript and I am happy with the changes that the authors implemented in response to my comments and the comments of the other reviewer. The paper is now much clearer, especially in the methodological section, and the limitations of the novel sequencing platforms/formats are sufficiently discussed.

      Minor comments that should be made in the present paper:

      • L58: change "bacteria" to "bacterial"
      • L65-66: the last part of this long sentence is difficult to comprehend and should be rephrased. I suggest dividing the long sentence into two
      • L68-69: change "produces" to "produced"
      • L84: delete "in"
      • L98: please explain the abbreviation "ONT", likely "Oxford Nanopore Technologies"
      • L162: the details of the amplification methods should be expanded, at least stating the primer pairs (names and sequences) used and the targeted molecular markers; from the text it appears as if ITS2 was the marker selected, yet lines 361 and 366 discuss length differences in ITS1
      • L246: replace "common fungi several species" with "common fungal species"
      • L248-251: the misclassification of fungal taxa was not due to poor performance of the sequencing platform; it was because of the low variability of the ITS2 marker. I suggest changing the text to state that only genus-level assignment was reached for these taxa, since multiple species had the same ITS2 sequence
      • L264-265: the main reason is that PCR bias (preferential PCR amplification of certain templates) skews the representation of taxa if the DNA is mixed prior to amplification
      • L331-346: this section is unclear; it should specify which primers (primer names and sequences) with which barcodes were used for each condition. If different primer pairs were used for different sequencing platforms, it is unclear what the use of this comparison is. This should either be clarified and explained, or the whole section may be removed.
      • L381: delete "so"
      • L387-392: I suggest that this part is either removed or that it is clearly described why the authors are sure that PCR replicates are not necessary (which is against all present recommendations). While the increasing fidelity of polymerases may be a fact, the main problem with parallel PCR is not errors (due to low fidelity) but random effects, where primers anneal to templates with random frequencies. This statistical effect is impossible to handle by increasing polymerase fidelity, while it is easily handled by PCR replication.
      • L424-426: This statement is rather obvious; I suggest deleting it.
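
      The reviewer's point about random template sampling (comment on L387-392) can be illustrated with a toy simulation. The sketch below is an assumption-laden illustration, not the authors' or the reviewer's method: the community composition, the number of starting templates and the function names are all hypothetical, and the model contains no polymerase errors at all, so any scatter it shows comes from sampling alone.

```python
import random
from collections import Counter


def simulate_pcr(true_freqs, n_templates, rng):
    """Draw n_templates starting molecules at random; return observed taxon frequencies."""
    taxa = list(true_freqs)
    weights = [true_freqs[t] for t in taxa]
    draws = Counter(rng.choices(taxa, weights=weights, k=n_templates))
    return {t: draws[t] / n_templates for t in taxa}


def pool_replicates(replicates):
    """Average the observed frequencies across several PCR replicates."""
    taxa = replicates[0].keys()
    return {t: sum(r[t] for r in replicates) / len(replicates) for t in taxa}


rng = random.Random(0)  # fixed seed so the illustration is repeatable
true_freqs = {"taxon_A": 0.90, "taxon_B": 0.09, "taxon_C": 0.01}  # hypothetical community

single = simulate_pcr(true_freqs, n_templates=100, rng=rng)
pooled = pool_replicates([simulate_pcr(true_freqs, 100, rng) for _ in range(3)])
print("single reaction:", single)
print("three pooled   :", pooled)
```

      Because the simulated polymerase is error-free, improving fidelity cannot change the scatter in this model; only drawing more templates, for example by pooling replicates, narrows it.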

    1. Now published in Gigabyte doi: 10.46471/gigabyte.14 Tianlin Pei (1, 2), Mengxiao Yan (1), Jie Liu (1), Mengying Cui (1), Yumin Fang (1), Binjie Ge (1), Jun Yang (1, 2). Affiliations: (1) Shanghai Key Laboratory of Plant Functional Genomics and Resources, Shanghai Chenshan Botanical Garden, Shanghai Chenshan Plant Science Research Center, Chinese Academy of Sciences, Shanghai, 201602, China; (2) State Key Laboratory of Plant Molecular Genetics, CAS Center for Excellence in Molecular Plant Sciences, Shanghai Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai, 200032, China. For correspondence: yangjun@csnbgsh.cn zhaoqing@cemps.ac.cn

      Reviewer 1. C Robin Buell Is the language of sufficient quality? No. The manuscript could be improved with a round of editing for grammar.

      Is there sufficient detail in the methods and data-processing steps to allow reproduction? No. The sequencing, assembly and annotation methods need more details.

      Any Additional Overall Comments to the Author: This manuscript describes the sequencing, assembly, annotation, and analysis of the Tripterygium wilfordii genome. T. wilfordii is a medicinal plant that has long been used in traditional medicine due to its production of alkaloids and triterpenoids; the focus of this study was to identify cytochrome P450s involved in biosynthesis of the triterpenoid celastrol.

      Based on the genome assembly metrics, the authors generated a robust representation of the genome sequence. Improvements in the analyses of the genome and in the manuscript would greatly strengthen confidence in the assembly. The authors should add the following metrics and information to the manuscript:

      More details on the error correction of the assembly. Based on the methods, both nanopore and Illumina WGS reads were used, however, this is not explicit nor are any metrics of the error correction provided.

      Specifically, it is not discussed how the nanopore reads were assembled. A company is cited for the genome assembly. Information on which assembly software was used must be provided.

      Every software program used, its version, and the parameters used should be provided in the methods. This is often missing.

      The quality of the genome should be confirmed using alignment of both the whole genome shotgun reads and the mRNAseq data. Specific metrics that should be provided include: the total number and percentage of reads that mapped, and the read pairs that mapped in the correct orientation.
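
      The mapping metrics requested here can be derived from the FLAG field of a SAM/BAM alignment file. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the function name and the example flag values are hypothetical, only the bit meanings come from the SAM specification, and the "proper pair" bit is set by the aligner according to its own orientation and insert-size criteria.

```python
# FLAG bits as defined in the SAM specification.
PAIRED = 0x1
PROPER_PAIR = 0x2
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def mapping_summary(flags):
    """Summarise mapping from an iterable of SAM FLAG integers (one per alignment record)."""
    # Count each read once: skip secondary and supplementary records.
    primary = [f for f in flags if not f & (SECONDARY | SUPPLEMENTARY)]
    total = len(primary)
    mapped = sum(1 for f in primary if not f & UNMAPPED)
    proper = sum(
        1 for f in primary
        if (f & PAIRED) and (f & PROPER_PAIR) and not f & UNMAPPED
    )

    def pct(n):
        return 100.0 * n / total if total else 0.0

    return {
        "total_reads": total,
        "mapped_reads": mapped,
        "mapped_pct": pct(mapped),
        "properly_paired_reads": proper,
        "properly_paired_pct": pct(proper),
    }


# Hypothetical flags: two proper pairs' worth of reads, one unmapped pair,
# one improperly paired pair, and one supplementary record.
example_flags = [99, 147, 77, 141, 65, 129, 2147]
print(mapping_summary(example_flags))
```

      Reporting these numbers separately for the whole genome shotgun reads and the mRNAseq reads would give the two lines of evidence the reviewer asks for.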

      No details on read quality assessment or trimming are provided.

      The CEGMA results should be omitted; this program has been deprecated.

      Line 337: The DNA was sheared, not interrupted, into fragments. Line 343: More details on the library preparation and sequencing for the nanopore reads are needed.

      Do the authors know the genome size of the species based on flow cytometry? Do you know the number of chromosomes that this species has? This should be stated and discussed in context of the assembly size and number of pseudochromosomes

      The genome wide identification of the CYP450 candidates was difficult to follow. This section should be revised so that it is clear how the authors identified their candidate genes. Potentially adding a supplemental figure would be helpful. I found the coexpression pattern extremely difficult to follow. I would not call coexpression patterns coexpression profiles. Specifically I did not understand the sentence on line 202 “However, no….”. Essentially this is just sub-functionalization at the expression level, not that there are two independent pathways.

      The evolution section should be expanded. How divergent is T. wilfordii from P. trichocarpa and R. communis?

      Table 1: "Index" should be replaced with "metric".

      Figure S1: What k-mer size was used in the analysis? Figure S5: It is unclear what is on the x or y axis. Expand the figure legend.

      The manuscript should be proofed for grammar as there are numerous sentences that need editing.

      Recommendation: Major Revision

    2. Tripterygium wilfordii

      Reviewer 2. Xupo Ding Is the language of sufficient quality? Only about one third of the paragraphs are of sufficient language quality.

      Comments This manuscript provides a reference genome assembly of T. wilfordii obtained with a combined sequencing strategy (Nanopore, Bionano, Illumina, HiSeq, and PacBio), and the functions of two CYP450 genes were identified with enzyme assays in vivo and in vitro. This research also provides valuable information to aid the conservation of resources and helps reveal the evolution of Celastrales and the key genes involved in celastrol biosynthesis. However, the text needs substantial improvement.

      1. I suggest removing the comma in the title.

      2. Nothing in biology makes sense except in the light of evolution (T. Dobzhansky). The abstract does not present vital results from the manuscript, such as gene numbers, repeat percentage, and the comparative evolutionary analysis. The contribution of the T. wilfordii genome is not limited to celastrol biosynthesis (lines 38-39); it also provides valuable information to aid the conservation of resources and helps reveal the evolution of Celastrales and the key genes involved in celastrol biosynthesis.

      3. Nanopore is not an appropriate keyword; the equivalent platforms, Illumina, Bionano, PacBio and Hi-C, were also used in the manuscript.

      4. Mentioning legendary tales (lines 59-61) in a scientific paper might be controversial.

      5. Lines 61-63 are written colloquially. Please consider replacing them with: The extract of T. wilfordii bark has been used as a pesticide since ancient times in China, as first recorded in the Illustrated Catalogues of Plants published in 1848.

      6. Lines 103-104 are not coherent with the preceding sentence.

      7. Line 112: is the N content really 0%?

      8. Lines 117-118: Both results indicated that the presented genome is relatively complete. This is an unusual claim and is open to debate. Even if it is credible, this sentence belongs in the Discussion section.

      9. Line 145: the full name should be given at first mention.

      10. Lines 150-155: Copia and Gypsy are missing.

      11. Were the gene families containing TwCYP712K1 and TwCYP712K2 expanded or contracted in the CAFE analysis?

      12. WGCNA might provide much more reliable evidence for the candidates TwCYP712K1 and TwCYP712K2, even though Pearson's correlation coefficients are a simplified version of WGCNA.

      13. The full peaks should be shown in Figures 5A and 5B. Uploading the NMR and MS data as an additional file would enhance the credibility of the enzyme function.

      14. Lines 269-272: the evolutionary analysis in Figure 2B indicates that T. wilfordii originated earlier than P. trichocarpa and R. communis. Does this suggest that the functions of TwCYP712K1 and TwCYP712K2 were fused during the evolution of Malpighiales and Celastrales in Figure 6? If the authors maintain that these two P450s came from a common ancestor, a syntenic analysis of TwCYP712K1 and TwCYP712K2 between T. wilfordii and A. trichopoda, O. sativa or V. vinifera might make this more credible.

      15. Latin names should be given as complete species names in all figures; for example, T.wil should be replaced with T. wilfordii.

      16. Line 322: "transcriptom" should be "transcriptome".

      17. Line 330: please add the longitude and latitude.

      18. Please revise the English throughout, except for lines 327-509 and 526-599. Lines 327-509 might come from the final report of the sequencing project.

      19. Line 606: should "LAST" be "BLAST"?

      20. I noticed that a T. wilfordii genome was published in Nature Communications in Feb. 2020. I therefore suggest adding some comparison with that assembly or with triptolide synthesis, and citing that paper. Mentioning this would be fair and would also highlight the celastrol synthesis focus of the work presented here.

      Major Revision

    1. Now published in Gigabyte doi: 10.46471/gigabyte.13

      This work has been peer reviewed in GigaByte, which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer 1. Yoonjoo Choi Is the language of sufficient quality? Yes. There are some minor typos. This may not be an issue in other systems or viewers, but none of the "fi" ligatures appear on my computer (Mac OS Preview), e.g. "affinity" -> "a inity", "artificial" -> "arti cial".

      Is there a clear statement of need explaining what problems the software is designed to solve and who the target audience is? Yes. The purpose of this software is clearly stated and it will be very useful for researchers in relevant research fields.

      Yes. The author recommended running this package on Linux machines, though it is written in Python. It would be great if non-Linux users could run TEPITOPE and BasicMHC1 (for a quick epitope screen). I pip-installed it on both Ubuntu and Mac OS (just to see whether I could run TEPITOPE and BasicMHC1). The installation on Ubuntu was very easy and ran fine. The Mac OS installation failed, though perhaps not because of epitopepredict (brew-installed Python 3.9.0).

      Have any claims of performance been sufficiently tested and compared to other commonly-used packages? Yes. (Definitely not mandatory at all, but) it would be great if this package also provided a wrapper for the IEDB tools.

      Recommendation: Minor Revisions.

      Reviewer 2. Jayaraman Valadi. Is the language of sufficient quality? Yes. There are a lot of spelling mistakes, which must be corrected before acceptance.

      Is there a clear statement of need explaining what problems the software is designed to solve and who the target audience is? Yes. This is clearly explained in the manuscript.

      Is the source code available, and has an appropriate Open Source Initiative license (https://opensource.org/licenses) been assigned to the code? Yes. The source code is available on GitHub and it works as expected.

      Is installation/deployment sufficiently outlined in the paper and documentation, and does it proceed as outlined? No. The software depends on a number of external software packages. Their installation needs to be explained clearly in the manuscript.

      Is the documentation provided clear and user friendly? Yes. Overall the documentation is good. The docstrings need minor improvements to make them more comprehensive.

      Is there a clearly-stated list of dependencies, and is the core functionality of the software documented to a satisfactory level? Yes. This is well explained in manuscript.

      Have any claims of performance been sufficiently tested and compared to other commonly-used packages? Yes. Adding a note on comparing the performance of different methods would be useful.

      Additional Comments: The software developed is a Python wrapper for a number of available epitope prediction methods. The unified architecture gives users easy access to all methods and lets them compare the results of each method. Some of these methods/models have to be installed manually before the user can access them through the Python wrapper. A new model trained by the authors has also been added; users can use this prediction model without having to install any additional dependencies.

      Salient features:
      • The software supports visual comparison of predictions from different predictive models.
      • Users can select a target protein for epitope scanning.
      • Users can predict putative MHC1 and MHC2 epitopes with various predictive models through the Python wrapper.
      • Selection of the best predictions is possible.
      • The positions of putative epitopes are highlighted on the target protein sequence.

      Overall the manuscript and software are quite comprehensive and can be accepted after minor revisions.

      Recommendation: Minor Revisions

    1. doi: 10.1093/gigascience/giaa146

      Reviewer 2. Mile Šikić Reviewer Comments to Author: In their paper, Murigneux et al. compare three long-read sequencing technologies applied to the de novo assembly of a plant genome, Macadamia jansenii. They generated sequencing data using Pacific Biosciences (Sequel I), Oxford Nanopore Technologies (PromethION), and BGI (single-tube Long Fragment Read) technologies. The sequenced data are assembled using a range of state-of-the-art long-read assemblers and the hybrid MaSuRCA assembler. Although the paper is easy to follow, and this kind of analysis is more than welcome, I have several major and minor concerns.

      Major concerns

      1) The authors use 780 Mbps as the estimated size of the genome. Yet this is not supported by the data. In the chapter "Genome size estimation", they present the genome size estimation using k-mer counting, but these sizes are 650 Mbps or less.

      2) Since the real size of the genome is unknown, it would be worthwhile if the authors provided analyses such as those enabled by KAT (Mapleson et al., 2017), which compares the k-mer spectrum of the assembly to the k-mer spectrum of the reads (preferably Illumina). To check for misassembled contigs, the authors might also align larger contigs obtained using different tools to compare the similarity among them (e.g., using tools such as Gepard or similar).

      3) The authors compare assemblies with an "Illumina assembly", but it is not clear what that means and why they consider this a valid comparison.

      4) Although they started the ONT data analysis with four tools, they perform further analysis with just two tools (Flye and Canu). In addition, for PacBio data, they use three tools (Redbean, Flye and Canu). It is not clear why the authors chose these tools. Canu and Flye have larger N50, larger total length, and the longest contigs. However, this does not take into account possible misassemblies. Assemblers might have problems with uncollapsed haplotypes, which can result in assemblies larger than expected. In their recent manuscript, Guiglielmoni et al (https://doi.org/10.1101/2020.03.16.993428) showed that Canu is prone to uncollapsed haplotypes. Also, this manuscript shows that, with PacBio data, Canu produces a much longer assembly than the other tools (1.2 Gbps). Therefore, a longer total assembly size cannot guarantee a better genome. Furthermore, on ONT data Raven has the second-best initial BUSCO score (before polishing), and its assembled genome consists of the smallest number of contigs. Therefore, I deem that the full analysis needs to be performed using all tools for both Nanopore and PacBio data.

      5) It would be of interest to a broad community if the authors added the computational costs to the total cost per genome for each sequencing technology. They might compare their machines with AWS or other cloud-specified configurations. Besides, it is not clear which types of machines they used. Information from the supplementary materials such as GPU, large memory, HPC is not descriptive enough.

      Minor comments:

      1) The authors use the published reference genome of Macadamia integrifolia v2 for comparison. It would be interesting if they could provide information about the sequencing read technology used for this assembly.

      2) The authors mention the newer generation of PacBio sequencing technology (Sequel II), which provides higher accuracy and lower costs. It would also be worth mentioning the newer versions of assembly tools such as Canu 2.0, Raven v1.1.5 or Flye version 2.7.1. It is worth considering Racon for polishing with Illumina reads too. Yet this is not a requirement, because the authors already use state-of-the-art tools.
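
      The first major concern turns on k-mer-based genome size estimation. As a rough illustration of the idea, the sketch below divides the total number of k-mer occurrences by the depth at the main coverage peak. It is a simplified textbook estimate under stated assumptions, not the method used in the manuscript: the function name, the `min_depth` error cut-off and the toy histogram are all hypothetical, and dedicated tools additionally model heterozygosity and repeats.

```python
def estimate_genome_size(histogram, min_depth=5):
    """Estimate genome size from a k-mer depth histogram.

    histogram maps depth -> number of distinct k-mers observed at that depth.
    k-mers below min_depth are treated as sequencing errors (assumed cut-off).
    Returns (estimated_size_in_bases, peak_depth).
    """
    solid = {depth: count for depth, count in histogram.items() if depth >= min_depth}
    peak_depth = max(solid, key=solid.get)           # depth of the main coverage peak
    total_kmers = sum(depth * count for depth, count in solid.items())
    return total_kmers / peak_depth, peak_depth


# Toy histogram, invented for illustration (not Macadamia data).
toy_histogram = {
    1: 5_000_000,    # low-depth k-mers, mostly sequencing errors
    2: 800_000,
    28: 2_000_000,
    29: 6_000_000,
    30: 10_000_000,  # main peak
    31: 6_000_000,
    32: 2_000_000,
    60: 1_000_000,   # two-copy repeats
}
size, peak = estimate_genome_size(toy_histogram)
print(f"peak depth {peak}, estimated size {size / 1e6:.1f} Mb")
```

      Comparing an estimate of this kind with the total length of each assembly is one way to flag assemblies inflated by uncollapsed haplotypes, which is the risk raised in major concern 4.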

    2. Now published in GigaScience

      Review 1. Cecile Monat. Reviewer Comments to Author: Introduction part:

      • It would be nice to give the genome size and to indicate the reference genome that has already been sequenced and assembled for Macadamia, just to provide context for people who are not familiar with Macadamia.
      Methods part:
      • ONT library preparation and sequencing part:
        • What was the reason to use both MinION and PromethION and not only PromethION?
        • For what reason didn't you use the same version of MinKNOW to assemble the MinION (MinKNOW (v1.15.4)) and PromethION (MinKNOW (v3.1.23)) data?
      • Assembly of genomes part:
        • Is there a reason for doing 4 iterations of Racon? And not 3 or 5?
        • Maybe you should specify that Racon is used as an error-correction module and Medaka to create the consensus sequence.
        • "Hybrid assembly was generated with MaSuRCA v3.3.3 (MaSuRCA, RRID:SCR_010691) [32] using the Illumina and the ONT or PacBio reads and using Flye v2.5 to perform the final assembly of corrected mega-reads" This sentence is not very clear to me. Does it mean that you first used ONT/PacBio data + Illumina in MaSuRCA to generate what they call "super-reads", and then used Flye on this data to get the final assemblies?
        • As I understand it, stLFR is similar to 10x Genomics; why not compare data from that technology too?
      • Assembly comparison part:
        • "We compared the assemblies with the published reference genome of Macadamia integrifolia v2 (Genbank accession: GCA_900631585.1)." First, I think it is important to add the reference paper. Secondly, I cannot see where you compared your assemblies with the published one. As far as I can tell, you compared all your assemblies with each other, but I cannot find any other assembly.
        • When you say "Illumina assembly", do you refer to the Macadamia integrifolia assembly? If so, please clarify it in the rest of the paper, and add the data for this reference genome to your figures.
      Results part:
      • ONT genome assembly part:
        • Is there any interest in combining MinION and PromethION data? Are there any advantages to combining them?
        • "The genome completeness was slightly better after two iterations of NextPolish (95.5%) than after two iterations of Pilon (95.2%) (Sup Table 1)." Here I would specify that this is the case for the Flye assembly, but surprisingly (at least for me?) after two iterations of NextPolish on the Canu assembly, the results were slightly worse than with one iteration. So, depending on the assembler you use, the number of iterations needed might be different.
        • "As an estimation of the base accuracy, we computed the number of mismatches and indels as compared to the Illumina assembly." Here I am not sure which assembly you refer to when you use the term "Illumina assembly". Do you refer to the Macadamia integrifolia assembly or to the MaSuRCA hybrid assembly? If you refer to the latter, I would suggest using the term "hybrid assembly" instead of "Illumina assembly", which might be confusing.
        • Why not use the Pilon and NextPolish steps on the ONT+Illumina (MaSuRCA) assembly, since they are tools dedicated to polishing with long and short reads?
      • PacBio genome assembly part:
        • Why did you use FALCON as the assembler for PacBio but not for ONT? If I am correct, it is not built exclusively to work on PacBio data but is suitable for all long-read technologies.
        • "Two subsets of reads corresponding to 4 SMRT cells and equivalent to a 43× and 39× coverage were assembled using Flye." Why choose Flye for this analysis? I'm also wondering if this part is necessary since, afterward, you do the ONT equivalent coverage, which is more interesting for the comparison of the technologies.
        • Comment on the structure: for this paragraph, I would prefer to have first the result with the same assemblers as with the ONT data, and then an explanation of why you choose to perform also a test with FALCON and then the FALCON results.
      • stLFR genome assembly part:
        • Supernova might have been used on PacBio data as well; why not?
        • Why not try to complement PacBio data with stLFR as you did with ONT? Are there any incompatibilities?
      Discussion part:
      • "The amount of sequencing data produced by each platform corresponds to approximately 84× (PacBio Sequel), 32× (ONT) and 96× (BGI stLFR) coverage of the macadamia genome" I would have put this information into the Results part, but it's only my preference.
      • "For both ONT and PacBio data, the highest assembly contiguity was obtained with a long-read only assembler as compared to an hybrid assembler incorporating both the short and long reads." I would suggest using the term "long-read polished" instead of "long-read only", since the assembly with the best contiguity integrates the Illumina data for the polishing.
      Tables and figures:
      • Table 2:
        • For this table, if I understood properly, you have chosen the best assembly of each technology. If I am right, then please state this in the title of the table.
      • Figure 1:
        • If I understood properly and here when you write "Base accuracy of assemblies as compared to Illumina assembly" you refer to the Macadamia integrifolia assembly, then I would add the Macadamia integrifolia assembly in this figure, and maybe put a dotted line at the limit of it for each category (InDels and mismatches) so it is easier for the reader to compare with it.
      • Figure 2:
        • Here I would put all the assemblies you had in Figure 1
    3. Comparison of long read methods for sequencing and assembly of a plant genome

      This preprint was published in GigaScience in December, 2020 and has an Update article in GigaByte - https://doi.org/10.46471/gigabyte.24

    1. An efficient and robust laboratory workflow and tetrapod database for larger scale eDNA studies

      This work has been peer reviewed in GigaScience, which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer 1: Taylor Wilcox http://dx.doi.org/10.5524/REVIEW.101629

      Reviewer 2: Han Ming Gan http://dx.doi.org/10.5524/REVIEW.101630

    1. Now published in Gigabyte doi: 10.46471/gigabyte.11 Bruno C. Genevcius, Tatiana T. Torres. Department of Genetics and Evolutionary Biology, University of São Paulo, São Paulo, SP, Brazil. For correspondence: bgenevcius@gmail.com

      This work has been peer reviewed in GigaByte, which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer 1. Peter Thorpe. Are all data available and do they match the descriptions in the paper?

      No. They are submitted but still private. These need to be released.

      Final Comments: SRA datasets need to be released.

      Recommendation: Minor Revision

      Reviewer 2. Guillem Ylla.

      Is the language of sufficient quality? While the text is mostly clear, I detected a few spelling mistakes (listed below) and there might be more that escaped my attention. I recommend that the authors check the MS exhaustively. Line 53: “Stink bug” missing “bug”. Lines 39, 58, 69, and figures: Mixed usage of “Chinavia impicticornis” and “C. impicticornis”. After the first appearance of the full name, the authors should consistently use either the full name or the abbreviation, not a mix of both.

      Are all data available and do they match the descriptions in the paper? No. The authors report multiple accession numbers from NCBI, including a BioProject ID. But they are not open and I was unable to check whether the data match the descriptions in the paper. The TSA accession does not seem to have been created yet, and the MS displays a placeholder (GIVF00000000) in its place.

      Are the data and metadata consistent with relevant minimum information or reporting standards? No. Missing items from the checklist. 1) "Any perl/python scripts created for analysis process ". In Line 94 “using a custom Perl script [16]”, the authors provide citation but not the code. 2) "Full (not summary) BUSCO results output files (text) ".

      Is the data acquisition clear, complete and methodologically sound? Yes. The end of the fifth nymphal instar dataset was obtained at “seven days after molting from fourth to fifth instar”. Could the authors specify how many days the 5th nymphal instar lasts, to give a better idea of how much longer the 5th nymphal stage continues?

      Could the authors briefly describe the rationale behind choosing the 5th nymphal instar instead of other nymphal stages? They explain why nymphal stages were used instead of adults, but not why the 5th nymphal instar.

      Is there sufficient detail in the methods and data-processing steps to allow reproduction? No. I would appreciate it if the authors could share the code/commands for removing redundant reads and performing the assembly as supplementary materials or on GitHub (recommended).

      In the abstract, the authors describe 38,478 transcripts, of which 12,665 had GO terms assigned. It is not clear where this number comes from. In line 120 it is mentioned that “ 39,478 had successful matches in the NCBI”. Is there a typo in one of these two numbers (38,478 vs 39,478)? However, the MS says “we only kept contigs that matched to Arthropod species”, and this number is reported to be 33,871. I urge the authors to better explain the steps they followed and clarify where all these numbers come from.

      Is there sufficient data validation and statistical analyses of data quality? Yes. Using the whole insect body often includes contaminant RNAs from the gut microbiome, endosymbionts, viruses, and other microbiological specimens from the cuticles and environment. Since the authors do not filter out reads from possible contaminants before the assembly, I would appreciate it if they could perform a BUSCO analysis using the prokaryote database before and after the selection based on similarity to databases. This would allow estimating the number of contaminants in the original assembly and whether they were successfully discarded after the selection.

      Lines 126-127 are not clear. There are 12,665 contigs that have 5,087 GO terms. I deduce that there are 12,665 contigs that have at least 1 GO term, and that they contain 5,087 distinct GO terms. Could the authors make this clearer in the text?
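
      The distinction drawn here, between contigs carrying at least one GO term and distinct GO terms across all contigs, is easy to show on a toy annotation table. The sketch below is illustrative only: the contig names, the mapping and the function name are hypothetical and have nothing to do with the actual C. impicticornis annotation.

```python
def summarize_go(annotations):
    """Return (contigs with at least one GO term, distinct GO terms overall).

    annotations maps a contig name to the set of GO terms assigned to it.
    """
    annotated_contigs = [contig for contig, terms in annotations.items() if terms]
    distinct_terms = set().union(*annotations.values())
    return len(annotated_contigs), len(distinct_terms)


# Hypothetical annotation table, for illustration only.
toy_annotations = {
    "contig_1": {"GO:0008150", "GO:0003674"},
    "contig_2": {"GO:0008150"},
    "contig_3": set(),                          # no GO term assigned
    "contig_4": {"GO:0005575", "GO:0003674"},
}
n_contigs, n_terms = summarize_go(toy_annotations)
print(f"{n_contigs} contigs have at least one GO term; {n_terms} distinct GO terms in total")
```

      Reporting the two counts in this explicit form would remove the ambiguity the reviewer points out in lines 126-127.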

      Is there sufficient information for others to reuse this dataset or integrate it with other data? Yes.

      1- I don’t think that a dataset consisting of 2 time points (early and late) of the same stage (nymph 5) can be considered a “developmental transcriptome”. I would urge the authors to change the terminology and title.

      2- In the abstract, the authors claim that this is the “ first genome-scale study with”. Since the study is only transcriptomic, I find it misleading to define it as a “genome-scale study”.

      3- In table 1 and line 117 the authors claim that they generated the highest amount of RNA-seq reads for pentatomids to date. However, for Halyomorpha halys there are multiple available RNA-seq datasets not mentioned, which taken together I suspect would exceed the data generated for C. impicticornis. I would suggest toning down this statement in L117.

      4- Additionally, there are at least 3 available genomes for pentatomid species. I think that this information should at least be mentioned in the introduction.

      5- In line 61, could the authors define “almost nonexistent”? How many are there?

      Recommendation: Minor Revision

    1. Improvements in the Sequencing and Assembly of Plant Genomes

      This manuscript is an Update to a paper published in GigaScience in December 2020. See https://doi.org/10.1093/gigascience/giaa146