  1. Jul 2018
    1. On 2016 Oct 12, Stephen Ceci commented:

      Below, Hilda Bastian criticizes our 2011 article in the Proceedings of the National Academy of Sciences. The criticisms reflect a simplistic rendering of the rich data landscape on women in academic science. Our conclusion was valid in 2011, and new scholarship has continued to support it since then. Below is an abbreviated response to Bastian’s claims; a somewhat longer account can be found at: http://www.human.cornell.edu/hd/ciws/publications.cfm

      Claim 1: Our work failed to represent all research on the topic. This criticism does not take into account the quality of the research and the need to use judgment about study inclusion. Rather than calculate mean effect sizes based on all published studies, it is important to down-weight those that have been refuted or supplanted, which is what we did in our 2011 narrative review. Nothing we wrote has changed, and the intervening research has reinforced our conclusion of gender neutrality in journal reviews, grant reviews, and tenure-track hiring. For example, Marcia McNutt, editor of Science, wrote that “there was some good news from a panel representing major journals…such as the American Chemical Society (ACS) and the American Geophysical Union (AGU)…female authors are published either at a rate proportional to that at which they submit to those journals, or at proportionally higher rates, as compared with their male colleagues” (McNutt, 2016, p. 1035). This may surprise those who read claims that women were selected as reviewers less often than their fraction of the submission pool, but it is true: women’s acceptance rates were, if anything, in excess of men’s. This is not cherry-picking, nor can it be erased by aberrations. These are large-scale analyses of acceptance rates at major journals, and they show that the landscape is either gender-fair or that women actually have an advantage, in contrast to what Dr. Bastian alleges.

      The same is true of funding. To illustrate why it is important to move beyond factoring all the studies into a mean effect size, we offer three examples at the URL above. One is Bornmann et al.’s finding of gender bias in funding, based on a large sample of grant applications. Marsh et al. reanalyzed these findings using a multilevel measurement model and arrived at a different conclusion; Bornmann himself was a coauthor on the Marsh et al. publication and agreed that the new finding of gender neutrality supplanted his earlier one of gender bias. Marsh et al. found that the mean of the weighted effect sizes, based on the 353,725 applicants, was actually +.02, in favor of women (see p. 1301): “The most important result of our study is that for grant applications that include disciplines across the higher education community, there is no evidence for any gender effects in favor of men, and even some evidence in favor of women…This lack of gender difference for grant proposals is very robust, as indicated by the lack of study-to-study variation in the results (nonsignificant tests of heterogeneity) and the lack of interaction effects. This non effect of gender generalized across discipline, the different countries (and funding agencies) considered here, and the publication year” (p. 1311; Marsh, Bornmann, et al., 2009, DOI: 10.3102/0034654309334143).
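      [Editorial note: to make the meta-analytic terms in the quotation above concrete, here is a minimal sketch of how an inverse-variance weighted mean effect size and a Cochran’s Q test of heterogeneity are computed. The numbers are hypothetical and purely illustrative; they are not Marsh et al.’s data, and this simple fixed-effect pooling is not their multilevel specification.]

      ```python
      # Hypothetical per-study effect sizes (d) and sampling variances (v).
      # Illustrative values only -- not drawn from Marsh et al. (2009).
      effects = [0.03, -0.01, 0.05, 0.00]
      variances = [0.0004, 0.0009, 0.0025, 0.0016]

      # Inverse-variance weights: more precise studies count more.
      weights = [1.0 / v for v in variances]

      # Weighted mean effect size (the "mean of the weighted effect sizes").
      mean_effect = sum(w * d for w, d in zip(weights, effects)) / sum(weights)

      # Cochran's Q: weighted squared deviations from the pooled mean.
      # Under homogeneity, Q ~ chi-square on k - 1 degrees of freedom, so a
      # Q near (or below) k - 1 suggests little study-to-study variation,
      # i.e., a "nonsignificant test of heterogeneity".
      q_stat = sum(w * (d - mean_effect) ** 2 for w, d in zip(weights, effects))
      df = len(effects) - 1

      print(f"weighted mean effect size: {mean_effect:+.3f}")
      print(f"Q = {q_stat:.2f} on {df} df")
      ```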

      The rest of our paper concerned hiring and journal publishing. We stand by our conclusion in these two domains as well, as the scientific literature since then has supported us. We do not have the time or space here to describe the evidence for this assertion in detail, but the interested reader can find much of it in our over 200 analyses (http://psi.sagepub.com/content/15/3/75.abstract?patientinform-links=yes&legid=sppsi;15/3/75; DOI: 10.1177/1529100614541236). Unsurprisingly, the PNAS reviewers were knowledgeable about these domains and agreed with our conclusion. It is incumbent on anyone arguing otherwise to subject their evidence to peer review and show how it overturns our conclusion. Does our claim that gender bias in hiring and publishing lacks support mean there are no gender barriers? Of course not; we have written about them frequently. We have discussed an article, one that Bastian appears to believe we are unaware of, showing differences in letters of recommendation written for women and men. And we have written about other barriers facing women scientists, such as the downgrading of their teaching ratings and their lower tenure rates in biology and psychology. However, we stand by our claim that the domains of hiring, funding, and publications are largely gender-neutral. Unless peer reviewers who are experts in this area agree there is compelling counter-evidence, we believe our conclusion reflects the best scientific evidence.

      Claim 2: We failed to specify what we meant by “women.” Bastian points out differences of color, class, etc. among women. We agree these are potentially important moderating factors, and we applaud researchers who report their data broken down this way. But the literature on peer review, funding, and hiring rarely reports differences by ethnicity, class, or sexual orientation, and most of the few studies to do so emerged after our study was published.

      Claim 3: Bastian criticized us for not taking into consideration the size and trajectory of fields, suggesting that those with large numbers of scholars may overwhelm smaller ones, or that the temporal trajectory of some fields is ahead of others. Field-specific gender differences are a valid consideration, but in funding they have been small or nonexistent according to several large-scale analyses. Jayasinghe et al.’s (2004) comprehensive analysis of gender effects in reviews of grant proposals (10,023 reviews by 6,233 external assessors of 2,331 proposals from 9 different disciplines) found no gender unfairness in any discipline, nor any discipline-by-gender interaction. If anyone has compelling evidence of disciplinary bias against women authors and PIs, they should submit it and allow the peer-review process to judge how compelling it is. As for differences among fields in their trajectories, we have done extensive analyses on this, which can be found at the same site above. In these analyses we examined temporal changes in 8 disciplines in salary, tenure, promotion, satisfaction, productivity, impact, etc. With some exceptions we alluded to above, the picture was mainly gender-fair.

      Finally, Bastian raises analytic issues. We agree these are central. This is why we minimized small-scale, poorly analyzed reports and gave more attention to large journals and grant agencies whose data allowed multilevel models, instead of or in addition to fixed-effects and random-effects analyses that sometimes violated fundamental statistical assumptions. Both fixed-effect and random-effects models have limitations. (The latter assumes that features of the studies themselves contribute to variability in effect sizes independent of random sampling error, whereas multilevel models permit multiple outcomes to be included without violating statistical assumptions such as the independence of effect sizes, an assumption breached when effect sizes come from the same study, the same funding agency, or multiple disciplines within the same funding agency.) Mean effect sizes are not the analytic endpoint when there is systematic variation among studies beyond that accounted for by sampling variability, which is omnipresent in these studies; it is important to determine which study characteristics account for study-to-study variation. In the past, some have cherry-picked aberrations to support claims of bias; our 2011 report went beyond this by situating claims amid large-scale, well-analyzed studies and minimizing problematic ones. Although women scientists continue to face challenges that we have written about elsewhere, these challenges are not in the three domains of tenure-track hiring, funding, and publishing.
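      [Editorial note: for readers unfamiliar with the model families contrasted in the parenthetical above, here is a minimal sketch in generic textbook meta-analytic notation; it is not the exact specification used by Marsh et al. or Jayasinghe et al.]

      ```latex
      % Fixed-effect model: one true effect \theta; the observed effect y_i of
      % study i differs from it only by sampling error with known variance v_i.
      y_i = \theta + \varepsilon_i, \qquad \varepsilon_i \sim N(0, v_i)

      % Random-effects model: true effects vary across studies, with
      % between-study variance \tau^2 estimated from the observed heterogeneity.
      y_i = \theta + u_i + \varepsilon_i, \qquad u_i \sim N(0, \tau^2)

      % Three-level (multilevel) model: effect size i is nested within cluster j
      % (e.g., several disciplines within one funding agency), so effect sizes
      % sharing a cluster may be correlated rather than assumed independent.
      y_{ij} = \theta + u_j + u_{ij} + \varepsilon_{ij}, \qquad
        u_j \sim N(0, \tau^2_{\mathrm{between}}), \quad
        u_{ij} \sim N(0, \tau^2_{\mathrm{within}})
      ```

      The extra nesting level is what allows a single analysis to absorb multiple outcomes from the same study or agency without treating each effect size as an independent data point.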

      Steve Ceci and Wendy M. Williams


      This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.

    2. On 2016 Sep 06, Hilda Bastian commented:

      The conclusions of this review are not supported by the findings of the studies included in it, and much of the evidence cited contradicts the authors’ conclusions. The review suffers from extensive methodological weaknesses, particularly study selection bias and selective reporting. Out of hundreds of studies that were likely to be eligible in the 3 main areas they address (Dehdarirad, 2015), they include only 35. It is not a review of 20 years of data: it is a review based on selected data from the last 20 years. The basis for that selection is not reported.

      Their description of the results of these studies includes, in my opinion, severe levels of 2 key types of review spin (Yavchitz A, 2016): misleading reporting and misleading interpretation. The review contains numerous errors on key points, such as the numbers reported and the methodology of studies. Conclusions about the quality of some evidence are drawn by the authors, but the basis for these judgments is unclear, and no methodical process for assessing quality is reported or evident.

      The 3 main areas covered by the review – journal publications, grant applications, and hiring – are also at high risk of publication bias, which is not addressed by the review. Discrimination against women is the subject of legislation in most, if not all, the countries in which these studies were done. Journals, funding agencies, and academic institutions may not be enthusiastic about broadcasting evidence of gender bias.

      For example, of the many thousands of science journals being published in 2011, only 6 studies are cited, conducted across 8 to 13 journals in 2 areas of science. In one of those studies, the author approached 24 journals: only 5 agreed to participate (Tregenza, 2002).

      Ceci and Williams conclude that only 4 of the 35 unique studies they cited suggest the possibility of some gender bias. However, in my opinion an additional 7 studies clearly concluded gender bias remained a problem needing consideration, and others found signs suggesting bias may have been present. Altogether, in 19 studies (54%), there is either selective reporting and descriptions that spin study results in the direction of this review’s conclusions, or inaccurate reporting that could affect the weight placed on the evidence by a knowledgeable reader.

      I identified no instance of spin that did not favor the authors’ conclusions. Some of the studies referenced did not address the questions for which they were cited. Several are short reports in letters, 1 relies on a press release, and another is a news report of a talk.

      Variations in disciplines are not adequately addressed. The authors concentrate on time periods as critical, but the evidence shows that not all disciplines have reached the same level of development in relation to gender participation. Issues related to international differences, and different experiences for groups of women who may experience additional discrimination are not addressed. Although the conclusions are universally framed, they do not address women in science outside academia.

      The authors address only 3 possible explanations for women’s underrepresentation in science: discrimination, women’s choices and preferences (especially relating to motherhood), and gender differences in mathematics ability. They argue that only women’s choices, particularly in relation to family, are a big enough factor to explain women’s underrepresentation. What is arguably the dominant hypothesis in the field is not addressed: that men are overrepresented in science because of cumulative advantage. Advantages do not have to be individually large to contribute to the end result of underrepresentation in elite institutions and positions. (I have also commented on another paper in which they advance their hypothesis about motherhood and women scientists (Williams WM, 2012).)

      In addition, they do not address the full range of issues within the 3 areas they consider. For example, in grants and hiring, they do not address analyses of potential bias in letters of recommendation (e.g. Van Den Brink, 2006, Schmader T, 2007).

      In my opinion, this review is irredeemably flawed and should be retracted.

      My methodological critique and individual notes on the studies are available on my blog.

      Disclosures: I work at the National Institutes of Health (NIH), but not in the granting or women in science policy spheres. The views I express are personal, and do not necessarily reflect those of the NIH. I am an academic editor at PLOS Medicine and on the human ethics advisory group for PLOS One. I am undertaking research in various aspects of publication ethics.


      This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.
