On 2016 Oct 12, Stephen Ceci commented:
Below, Hilda Bastian criticizes our 2011 article in the Proceedings of the National Academy of Sciences. Her criticisms reflect a simplistic rendering of the rich data landscape on women in academic science. Our conclusion was valid in 2011, and new scholarship since then has continued to support it. Below is an abbreviated response to Bastian’s claims; a somewhat longer account can be found at: http://www.human.cornell.edu/hd/ciws/publications.cfm

Claim 1: Our work failed to represent all research on the topic. This criticism does not take into account the quality of the research and the need to use judgment about study inclusion. Rather than calculate mean effect sizes based on all published studies, it is important to down-weight those that have been refuted or supplanted. We did this in our 2011 narrative review. Nothing we wrote has changed, and the intervening research has reinforced our conclusion of gender neutrality in journal reviews, grant reviews, and tenure-track hiring. For example, Marcia McNutt, editor of Science, wrote that "there was some good news from a panel representing major journals…such as the American Chemical Society (ACS) and the American Geophysical Union (AGU)…female authors are published either at a rate proportional to that at which they submit to those journals, or at proportionally higher rates, as compared with their male colleagues" (McNutt, 2016, p. 1035). This may surprise those who have read claims that women were selected as reviewers less often than their fraction of the submission pool, but it is true: women’s acceptance rates were, if anything, in excess of men’s. This is not cherry-picking, nor can it be erased by aberrations. These are large-scale analyses of acceptance rates at major journals, and they show that the landscape is either gender-fair or that women actually have an advantage, in contrast to what Dr. Bastian alleges. The same is true of funding.
To illustrate why it is important to move beyond factoring all the studies into a mean effect size, we offer three examples at http://www.human.cornell.edu/hd/ciws/publications.cfm One is Bornmann et al.’s finding of gender bias in funding, based on a large sample of grant applications. Marsh et al. reanalyzed these data using a multilevel measurement model and arrived at a different conclusion. Bornmann himself was a coauthor on the Marsh et al. publication and agreed that the new finding of gender neutrality supplanted his earlier one of gender bias. Marsh et al. found that the mean of the weighted effect sizes based on the 353,725 applicants was actually +.02, in favor of women (see p. 1301): "The most important result of our study is that for grant applications that include disciplines across the higher education community, there is no evidence for any gender effects in favor of men, and even some evidence in favor of women…This lack of gender difference for grant proposals is very robust, as indicated by the lack of study-to-study variation in the results (nonsignificant tests of heterogeneity) and the lack of interaction effects. This non effect of gender generalized across discipline, the different countries (and funding agencies) considered here, and the publication year.” (p. 1311) Marsh, Bornmann, et al. (2009) (DOI: 10.3102/0034654309334143)
The rest of our paper concerned hiring and journal publishing. We stand by our conclusions in these two domains as well, as the scientific literature since then has supported us. We do not have the time or space here to describe the evidence for this assertion in detail, but the interested reader can find much of it in our more than 200 analyses (http://psi.sagepub.com/content/15/3/75.abstract?patientinform-links=yes&legid=sppsi;15/3/75 DOI: 10.1177/1529100614541236). Unsurprisingly, the PNAS reviewers were knowledgeable about these domains and agreed with our conclusion. It is incumbent on anyone arguing otherwise to subject their evidence to peer review and show how it overturns our conclusion. Does our claim that allegations of gender bias in hiring and publishing lack support mean there are no gender barriers? Of course not; we have written frequently about them. We have discussed an article that Bastian appears to believe we are unaware of, showing differences in letters of recommendation written for women and men. And we have written about other barriers facing women scientists, such as downgraded teaching ratings and lower tenure rates in biology and psychology. However, we stand by our claim that the domains of hiring, funding, and publication are largely gender-neutral. Unless peer reviewers who are experts in this area agree there is compelling counter-evidence, we believe our conclusion reflects the best scientific evidence.

Claim 2: We failed to specify what we meant by “women.” Bastian points out differences of race, class, etc., among women. We agree these are potentially important moderating factors, and we applaud researchers who report their data broken down this way. But the literature on peer review, funding, and hiring rarely reports differences by ethnicity, class, or sexual orientation, and most of the few studies that do so emerged after our study was published.
Claim 3: Bastian criticized us for not taking into consideration the size and trajectory of fields, suggesting that those with large numbers of scholars may overwhelm smaller ones, or that the temporal trajectory of some fields is ahead of others. Field-specific gender differences are a valid consideration, but in funding they have been small or nonexistent according to several large-scale analyses. Jayasinghe et al.’s (2004) comprehensive analysis of gender effects in reviews of grant proposals (10,023 reviews by 6,233 external assessors of 2,331 proposals from 9 different disciplines) found no gender unfairness in any discipline, nor any discipline-by-gender interaction. If anyone has compelling evidence of disciplinary bias against women authors and PIs, they should submit it and let the peer-review process judge how compelling it is. As for differences among fields in their trajectories, we have done extensive analyses of this, which can be found at the same site above. In these analyses we examined temporal changes in 8 disciplines in salary, tenure, promotion, satisfaction, productivity, impact, etc. With some exceptions we alluded to above, the picture was mainly gender-fair.

Finally, Bastian raises analytic issues. We agree these are central. This is why we minimized small-scale, poorly analyzed reports and gave more attention to large journals and grant agencies whose data allowed multilevel models, instead of or in addition to fixed-effect and random-effects analyses that sometimes violated fundamental statistical assumptions. Both fixed-effect and random-effects models have limitations: the latter assumes that features of the studies themselves contribute to variability in effect sizes over and above random sampling error, whereas multilevel models permit multiple outcomes to be included without violating statistical assumptions such as the independence of effect sizes drawn from the same study, the same funding agency, or multiple disciplines within the same funding agency.
Mean effect sizes are not the analytic endpoint when there is systematic variation among studies beyond that accounted for by sampling variability, as is omnipresent in these studies; it is then important to determine which study characteristics account for the study-to-study variation. In the past, some have cherry-picked aberrations to support claims of bias; our 2011 report went beyond this, situating such claims amidst large-scale, well-analyzed studies and minimizing problematic ones. Although women scientists continue to face challenges that we have written about elsewhere, these challenges are not in the three domains of tenure-track hiring, funding, and publishing.
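The heterogeneity point above can be made concrete with a small sketch. Using entirely hypothetical effect sizes and variances (none drawn from the studies discussed here), the following computes an inverse-variance weighted mean effect size and Cochran's Q, the standard test of study-to-study heterogeneity referred to in the Marsh et al. quotation:

```python
# Minimal sketch with hypothetical data: inverse-variance weighted mean
# effect size plus Cochran's Q test of heterogeneity. A nonsignificant Q
# (relative to a chi-square with k-1 df) is what "lack of study-to-study
# variation" means in meta-analytic terms.

# Hypothetical per-study effect sizes (e.g., standardized gender gaps in
# acceptance rates) and their sampling variances.
effects = [0.03, -0.01, 0.05, 0.00]
variances = [0.004, 0.006, 0.003, 0.005]

weights = [1.0 / v for v in variances]  # inverse-variance weights

# Fixed-effect pooled estimate: weighted mean of the study effects.
mean_es = sum(w * e for w, e in zip(weights, effects)) / sum(weights)

# Cochran's Q: weighted squared deviations of study effects from the
# pooled mean; it measures variation beyond sampling error alone.
q = sum(w * (e - mean_es) ** 2 for w, e in zip(weights, effects))
df = len(effects) - 1  # Q ~ chi-square with k-1 df under homogeneity

print(mean_es, q, df)
```

When Q is large relative to its degrees of freedom, the pooled mean is not an adequate summary and moderators (discipline, agency, year) must be modeled, which is the role the multilevel analyses described above play.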
Steve Ceci and Wendy M. Williams
This comment, imported by Hypothesis from PubMed Commons, is licensed under CC BY.