This article provides a brief history and review of peer review. It evaluates peer review models against the goals of scientific communication, expressing a preference for publish, review, curate (PRC) models. The review and history are useful. However, the article’s progression and arguments, along with what it seeks to contribute to the literature, need refinement and clarification. The argument for PRC is under-developed due to a lack of clarity about what the article means by scientific communication. Clarity here might make the endorsement of PRC seem like less of a foregone conclusion.
As an important caveat, and in the interest of transparency, I declare that I am a founding managing editor of MetaROR, which is a PRC platform. It may be advisable for the author to make a similar declaration because I understand that they are affiliated with one of the universities involved in the founding of MetaROR.
Recommendations from the editor
I strongly endorse the main theme of most of the reviews, which is that the progression and underlying justifications for this article’s arguments need a great deal of work. In my view, this article’s main contribution seems to be the evaluation of the three peer review models against the functions of scientific communication. I say ‘seems to be’ because the article is not very clear on that, and I hope you will consider clarifying what your manuscript seeks to add to the existing work in this field.
In any case, if that assessment of the three models is your main contribution, that part is somewhat underdeveloped. Moreover, I never got the sense that there is clear agreement in the literature about what the tenets of scientific communication are. Note that scientific communication is a field in its own right.
I also agree that the paper is too strongly worded at times, with limitations and assumptions in the analysis minimised or not stated. For example, all of the typologies and categories drawn could easily be reorganised, and there is a high degree of subjectivity in this entire exercise. Subjective choices should be highlighted and made salient for the reader.
Note that greater clarity, rigour, and humility may also help with any alleged or actual bias.
Some more minor points are:
- I agree with Reviewer 3 that the ‘we’ perspective is distracting.
- The paragraph starting with ‘Nevertheless’ on page 2 is very long.
- There are many points where language could be shortened for readability, for example:
  - Page 3: ‘decision on publication’ could be ‘publication decision’.
  - Page 5: ‘efficiency of its utilization’ could be ‘its efficiency’.
  - Page 7: ‘It should be noted…’ could be ‘Note that…’.
- Page 7: ‘It should be noted that…’ – this needs a reference.
- I’m not sure that registered reports reflect a hypothetico-deductive approach (page 6). For instance, systematic reviews (even non-quantitative ones) are often published as registered reports, and Cochrane required this even before the move towards registered reports in quantitative psychology.
- I agree that modular publishing sits uneasily as its own chapter.
- Page 14: ‘The "Publish-Review-Curate" model is universal that we expect to be the future of scientific publishing. The transition will not happen today or tomorrow, but in the next 5-10 years, the number of projects such as eLife, F1000Research, Peer Community in, or MetaROR will rapidly increase’. This seems overly strong (an example of my larger critique and that of the reviewers).
-
In "Evolution of Peer Review in Scientific Communication", Kochetkov provides a point-of-view discussion of the current state of play of peer review for scientific literature, focussing on the major models in contemporary use and recent innovations in reform. In particular, they present a typology of three main forms of peer review: traditional pre-publication review; registered reports; and post-publication review, their preferred model. The main contribution it could make would be to help consolidate typologies and terminologies, to consolidate major lines of argument and to present some useful visualisations of these. On the other hand, the overall discussion is not strongly original in character.
The major strength of this article is that the discussion is well-informed by contemporary developments in peer-review reform. The typology presented is modest and, for that, readily comprehensible and intuitive. This is to some extent a weakness as well as a strength; a typology that is too straightforward may not be useful enough. As suggested at the end, it might be worth considering how to complexify the typology, at least at subordinate levels, without sacrificing this strength. The diagrams of workflows are particularly clear.
The primary weakness of this article is that it presents itself as an 'analysis' from which they 'conclude' certain results such as their typology, when this appears clearly to be an opinion piece. In my view, this results in a false claim of objectivity which detracts from what would otherwise be an interesting and informative, albeit subjective, discussion, and means that the limitations of this approach go undiscussed. A secondary weakness is that the discussion is not well structured and there are some imprecisions of expression that have the potential to confuse, at least at first.
This primary weakness is manifested in several ways. The evidence and reasoning for claims made is patchy or absent. One instance of the former is the discussion of bias in peer review. There are a multitude of studies of such bias and indeed quite a few meta-analyses of these studies. A systematic search could have been done here but there is no attempt to discuss the totality of this literature. Instead, only a few specific studies are cited. Why are these ones chosen? We have no idea. To this extent I am not convinced that the references used here are the most appropriate. Instances of the latter are the claim that "The most well-known initiatives at the moment are ResearchEquals and Octopus" for which no evidence is provided, the claim that "we believe that journal-independent peer review is a special case of Model 3" for which no further argument is provided, and the claim that "the function of being the "supreme judge" in deciding what is "good" and "bad" science is taken on by peer review" for which neither is provided.
A particular example of this weakness, which is perhaps of marginal importance to the overall paper but of strong interest to this reviewer is the rather odd engagement with history within the paper. It is titled "Evolution of Peer Review" but is really focussed on the contemporary state-of-play. Section 2 starts with a short history of peer review in scientific publishing, but that seems intended only to establish what is described as the 'traditional' model of peer review. Given that that short history had just shown how peer review had been continually changing in character over centuries - and indeed Kochetkov goes on to describe further changes - it is a little difficult to work out what 'traditional' might mean here; what was 'traditional' in 2010 was not the same as what was 'traditional' in 1970. It is not clear how seriously this history is being taken. Kochetkov has earlier written that "as early as the beginning of the 21st century, it was argued that the system of peer review is 'broken'" but of course criticisms - including fundamental criticisms - of peer review are much older than this. Overall, this use of history seems designed to privilege the experience of a particular moment in time, that coincides with the start of the metascience reform movement.
Section 2 also demonstrates some of the second weakness described, a rather loose structure. Having moved from a discussion of the history of peer review to detail the first model, 'traditional' peer review, it then also goes on to describe the problems of this model. This part of the paper is one of the best – and best-evidenced. Given its importance to the main thrust of the discussion, it should probably have been given more space as a section of its own.
Another example is Section 4 on Modular Publishing, in which Kochetkov notes "Strictly speaking, modular publishing is primarily an innovative approach for the publishing workflow in general rather than specifically for peer review." Kochetkov says "This is why we have placed this innovation in a separate category" but if it is not an innovation in peer review, the bigger question is 'Why was it included in this article at all?'.
One example of the imprecision of language is as follows. The author shifts between the terms 'scientific communication' and 'science communication', but, at least in many contexts familiar to this reviewer, these are not the same thing: the former denotes science-internal dissemination of results through publication (which the author considers), conferences and the like (which the author specifically excludes), while the latter denotes the science-external public dissemination of scientific findings to non-technical audiences, which is entirely out of scope for this article.
A final note is that Section 3, while an interesting discussion, seems largely derivative from a typology of Waltman, with the addition of a consideration of whether a reform is 'radical' or 'incremental', based on how 'disruptive' the reform is. Given that this is inherently a subjective decision, I wonder if it might not have been more informative to consider 'disruptiveness' on a scale and plot it accordingly. This would allow for some range to be imagined for each reform as well; surely reforms might be more or less disruptive depending on how they are implemented. Given that each reform is considered against each model, it is somewhat surprising that this is not presented in a tabular or graphical form.
Beyond the specific suggestions in the preceding paragraphs, my suggestions to improve this article are as follows:
- Reconceptualize this as an opinion piece. Where systematic evidence can be drawn upon to make points, use that, but don't be afraid to just present a discussion from what is clearly a well-informed author.
- Reconsider the focus on history and 'evolution' if the point is about the current state of play and evaluation of reforms (much as I would always want to see more studies on the history and evolution of peer review).
- Consider ways in which the typology might be expanded, even if at subordinate level.
I have no competing interests in the compilation of this review, although I do have specific interests as noted above.
-
The work ‘Evolution of Peer Review in Scientific Communication’ provides a concise and readable summary of the historical role of peer review in modern science. The paper categorises the peer review practices into three models: (1) traditional pre-publication peer review; (2) registered reports; (3) post-publication peer review. The author compares the three models and draws the conclusion that the “third model offers the best way to implement the main function of scientific communication”.
I would contest this conclusion. In my eyes the three models serve different aims - with more or less drawbacks. For example, although Model 3 is less likely to introduce bias to readers, it also weakens the filtering function of the review system. Let’s just think about the dangers of machine-generated articles, paper mills, p-hacked research reports and so on. Although the editors do some pre-screening of submissions, in a world with only Model 3 peer review the literature could easily get loaded with even more ‘garbage’ than in a model where additional peers help with the screening.
Compared to registered reports, other aspects come into focus that Model 3 cannot cover, such as the efficiency of researchers’ work. In the case of registered reports, Stage 1 review can still help researchers to modify or improve their research design or data collection method. Empirical work can be costly and time-consuming, and post-publication review can only say that “you should have done it differently, then it would make sense”.
Finally, the author puts openness as a strength of Model 3. In my eyes, openness is a separate question. All models can work very openly and transparently in the right circumstances. This dimension is not an inherent part of the models.
In conclusion, I would not pass a verdict on the models, but instead emphasise the different functions they can play in scientific communication.
A minor comment: I found that a number of statements lack references in the Introduction. I would have found them useful for statements such as “There is a point of view that peer review is included in the implicit contract of the researcher.”
-
In this manuscript, the author provides a historical review of the place of peer review in the scientific ecosystem, including a discussion of the so-called current crisis and a presentation of three important peer review models. I believe this is a non-comprehensive yet useful overview. My main contention is that the structure of the paper could be improved. More specifically, the author could expand on the different goals of peer review and discuss these goals earlier in the paper. This would allow readers to better interpret the different issues plaguing peer review and help put the costs and benefits of the three models into context. Other than that, I found some claims made in the paper a little too strong. Presenting some empirical evidence or downplaying these claims would improve the manuscript in my opinion. Below, you can find my comments:
- In my view, the biggest issue with the current peer review system is the low quality of reviews, but the manuscript only mentions this fleetingly. The current system facilitates publication bias, confirmation bias, and is generally very inconsistent. I think this is partly due to reviewers’ lack of accountability in such a closed peer review system, but I would be curious to hear the author’s ideas about this, more elaborately than they provide them as part of issue 2.
- I’m missing a section in the introduction on what the goals of peer review are or should be. You mention issues with peer review, and these are mostly fair, but their importance is only made salient if you link them to the goals of peer review. The author does mention some functions of peer review later in the paper, but I think it would be good to expand that discussion and move it to a place earlier in the manuscript.
- Table 1 is intuitive but some background on how the author arrived at these categorizations would be welcome. When is something incremental and when is something radical? Why are some innovations included but not others (e.g., collaborative peer review, see https://content.prereview.org/how-collaborative-peer-review-can-transform-scientific-research/)?
- “Training of reviewers through seminars and online courses is part of the strategies of many publishers. At the same time, we have not been able to find statistical data or research to assess the effectiveness of such training.” (p. 5) There is some literature on this, although not recent; see work by Sara Schroter, for example (Schroter et al., 2004; Schroter et al., 2008).
- “It should be noted that most initiatives aimed at improving the quality of peer review simultaneously increase the costs.” (p. 7) This claim needs some support. Please explicate why this typically is the case and how it should impact our evaluations of these initiatives.
- I would rephrase “Idea of the study” in Figure 2 since the other models start with a tangible output (the manuscript). This is the same for registered reports, where authors submit a tangible report including hypotheses, study design, and analysis plan. In the same vein, I think “study design” in the rest of the figure might also not be the best phrasing. Maybe the author could use the terminology used by COS (Stage 1 manuscript and Stage 2 manuscript, see the Details & Workflow tab of https://www.cos.io/initiatives/registered-reports). Relatedly, “Author submits the first version of the manuscript” in the first box after the ‘Manuscript (report)’ node may be a confusing phrase because I think many researchers see the first version of the manuscript as the Stage 1 report sent out for Stage 1 review.
- One pathway that is not included in Figure 2 is that authors can decide not to conduct the study when improvements are required. Relatedly, in the publish-review-curate model, is revising the manuscript based on the reviews not optional as well? Especially in the case of 3a, authors can hardly be forced to make changes even though the reviews are posted on the platform.
- I think the author should discuss the importance of ‘open identities’ more. This factor is now not explicitly included in any of the models, while it has been found to be one of the main characteristics of peer review systems (Ross-Hellauer, 2017). More generally, I was wondering why the author chose these three models and not others. What were the criteria for inclusion in the manuscript? Some information on the underlying process would be welcome, especially when claims like “However, we believe that journal-independent peer review is a special case of Model 3 (“Publish-Review-Curate”).” are made without substantiation.
- Maybe it helps to outline the goals of the paper a bit more clearly in the introduction. This helps the reader know what to expect.
- The Modular Publishing section is not inherently related to peer review models, as you mention in the first sentence of that paragraph. As such, I think it would be best to omit this section entirely to maintain the flow of the paper. Alternatively, you could briefly discuss it in the discussion section, but a separate paragraph seems too much from my point of view.
- Labeling Model 3 as post-publication review might be confusing to some readers. I believe many researchers see post-publication review as researchers making comments on preprints, or submitting commentaries to journals. Those activities are substantially different from the publish-review-curate model, so I think it is important to distinguish between these types.
- I do not think the conclusions drawn below Table 3 logically follow from the earlier text. For example, why are “all functions of scientific communication implemented most quickly and transparently in Model 3”? It could be that the entire process takes longer in Model 3 (e.g. because reviewers need more time), so that Model 1 and Model 2 lead to outputs quicker. The same holds for the following claim: “The additional costs arising from the independent assessment of information based on open reviews are more than compensated by the emerging opportunities for scientific pluralism.” What is the empirical evidence for this? While I personally do think that Model 3 improves on Model 1, emphatic statements like this require empirical evidence. Maybe the author could provide some suggestions on how we can attain this evidence. Model 2 does have some empirical evidence underpinning its validity (see Scheel, Schijen, & Lakens, 2021; Soderberg et al., 2021; Sarafoglou et al., 2022), but more meta-research inquiries into the effectiveness and cost-benefit ratio of registered reports would still be welcome in general.
- What is the underlying source for the claim that openness requires three conditions?
- “If we do not change our approach, science will either stagnate or transition into other forms of communication.” (p. 2) I don’t think this claim is supported sufficiently strongly. While I agree there are important problems in peer review, I think there would need to be a more in-depth and evidence-based analysis before claims like this can be made.
- On some occasions, the author uses “we” while the study is single-authored.
- Figure 1: The top-left arrow from revision to (re-)submission is hidden.
- “The low level of peer review also contributes to the crisis of reproducibility in scientific research (Stoddart, 2016).” (p. 4) I assume the author means the low quality of peer review.
- “Although this crisis is due to a multitude of factors, the peer review system bears a significant responsibility for it.” (p. 4) This is also a big claim that is not substantiated.
- “Software for automatic evaluation of scientific papers based on artificial intelligence (AI) has emerged relatively recently” (p. 5) The author could add RegCheck (https://regcheck.app/) here, even though it is still in development. This tool is especially salient in light of the finding that preregistration-paper checks are rarely done as part of reviews (see Syed, 2023).
- There is a typo in the last box of Figure 1 (“decicion” instead of “decision”). I also found typos in the second box of Figure 2, where “screns” should be “screens”, and in the author decision box, where “desicion” should be “decision”.
- Maybe it would be good to mention results-blind review in the first paragraph of 3.2. This is a form of peer review where the study is already carried out but reviewers are blinded to the results. See work by Locascio (2017), Grand et al. (2018), and Woznyj et al. (2018).
- Is “Not considered for peer review” in Figure 3b not the same as rejected? I feel that it is rejected in the sense that neither the manuscript nor the reviews will be posted on the platform.
- “In addition to the projects mentioned, there are other platforms, for example, PREreview12, which departs even more radically from the traditional review format due to the decentralized structure of work.” (p. 11) For completeness, I think it would be helpful to add some more information here, for example why exactly decentralization is a radical departure from the traditional model.
- “However, anonymity is very conditional - there are still many “keys” left in the manuscript, by which one can determine, if not the identity of the author, then his country, research group, or affiliated organization.” (p. 11) I would opt for the neutral “their” here instead of “his”, especially given that this is a paragraph about equity and inclusion.
- “Thus, “closeness” is not a good way to address biases.” (p. 11) This might be a straw man argument because I don’t believe researchers have argued that it is a good method to combat biases. If they did, it would be good to cite them here. Alternatively, the sentence could be omitted entirely.
- I would start the Modular Publishing section with the definition, as that allows readers to interpret the other statements better.
- It would be helpful if the models were labeled (instead of using Model 1, Model 2, and Model 3) so that readers don’t have to think back to what each model involved.
- Table 2: “Decision making” for the editor’s role is quite broad; I recommend specifying what kinds of decisions need to be made.
- Table 2: “Aim of review” – I believe the aim of peer review also differs within these models (see the “schools of thought” the author mentions earlier), so maybe a statement on what the review entails would be a better way to phrase this.
- Table 2: One could argue that the ‘object of the review’ in Registered Reports is also the manuscript as a whole, just in different stages. As such, I would phrase this differently.
Good luck with any revision!
Olmo van den Akker (ovdakker@gmail.com)
References
Grand, J. A., Rogelberg, S. G., Banks, G. C., Landis, R. S., & Tonidandel, S. (2018). From outcome to process focus: Fostering a more robust psychological science through registered reports and results-blind reviewing. Perspectives on Psychological Science, 13(4), 448-456.
Locascio, J. J. (2017). Results blind science publishing. Basic and Applied Social Psychology, 39(5), 239-246.
Ross-Hellauer, T. (2017). What is open peer review? A systematic review. F1000Research, 6.
Sarafoglou, A., Kovacs, M., Bakos, B., Wagenmakers, E. J., & Aczel, B. (2022). A survey on how preregistration affects the research workflow: Better science but more work. Royal Society Open Science, 9(7), 211997.
Scheel, A. M., Schijen, M. R., & Lakens, D. (2021). An excess of positive results: Comparing the standard psychology literature with registered reports. Advances in Methods and Practices in Psychological Science, 4(2), 25152459211007467.
Schroter, S., Black, N., Evans, S., Carpenter, J., Godlee, F., & Smith, R. (2004). Effects of training on quality of peer review: randomised controlled trial. BMJ, 328(7441), 673.
Schroter, S., Black, N., Evans, S., Godlee, F., Osorio, L., & Smith, R. (2008). What errors do peer reviewers detect, and does training improve their ability to detect them? Journal of the Royal Society of Medicine, 101(10), 507-514.
Soderberg, C. K., Errington, T. M., Schiavone, S. R., Bottesini, J., Thorn, F. S., Vazire, S., ... & Nosek, B. A. (2021). Initial evidence of research quality of registered reports compared with the standard publishing model. Nature Human Behaviour, 5(8), 990-997.
Syed, M. (2023). Some data indicating that editors and reviewers do not check preregistrations during the review process. PsyArXiv Preprints.
Woznyj, H. M., Grenier, K., Ross, R., Banks, G. C., & Rogelberg, S. G. (2018). Results-blind review: A masked crusader for science. European Journal of Work and Organizational Psychology, 27(5), 561-576.
-
Overall thoughts: This is an interesting history piece regarding peer review and the development of review over time. Given the author’s conflict of interest and association with the Centre developing MetaROR, I think that this paper might be a better fit for an information page or introduction to the journal and rationale for the creation of MetaROR, rather than being billed as an independent article. Alternatively, more thorough information about advantages to pre-publication review or more downsides/challenges to post-publication review might make the article seem less affiliated. I appreciate seeing the history and current efforts to change peer review, though I am not comfortable broadly encouraging use of these new approaches based on this article alone.
Page 3: It’s hard to get a feel for the timeline given the dates that are described. We have peer review becoming standard after WWII (after 1945), definitively established by the second half of the century, an example of obligatory peer review starting in 1976, and in crisis by the end of the 20th century. I would consider adding examples that better support this timeline – did it become more common in specific journals before 1976? Was the crisis by the end of the 20th century something that happened over time or something that was already intrinsic to the institution? It doesn’t seem like enough time to get established and then enter crisis, but more details/examples could help make the timeline clear.
Consider discussing the benefits of the traditional model of peer review.
Table 1 – Most of these are self-explanatory to me as a reader, but not all. I don’t know what a registered report refers to, and it stands to reason that not all of these innovations are familiar to all readers. You do go through each of these sections, but that’s not clear when I initially look at the table. Consider having a more informative caption. Additionally, the left column is “Course of changes” here but “Directions” in text. I’d pick one and go with it for consistency.
3.2: Consider mentioning your conflict of interest here where MetaROR is mentioned.
With some of these methods, there’s the ability to also submit to a regular journal. Going to a regular journal presumably would instigate a whole new round of review, which may or may not contradict the previous round of post-publication review and would increase the length of time to publication by going through both types. If someone has a goal to publish in a journal, what benefit would they get by going through the post-publication review first, given this extra time?
There’s a section talking about institutional change (page 14). It mentions that openness requires three conditions – people taking responsibility for scientific communication, authors and reviewers, and infrastructure. I would consider adding some discussion of readers and evaluators. Readers have to be willing to accept these papers as reliable, trustworthy, and respectable to read and use the information in them. Evaluators such as tenure committees and potential employers would need to consider papers submitted through these approaches as evidence of scientific scholarship for the effort to be worthwhile for scientists.
Based on this overview, which seems somewhat skewed towards the merits of these methods (conflict of interest, limited perspective on downsides to new methods/upsides to old methods), I am not quite ready to accept this effort as equivalent of a regular journal and pre-publication peer review process. I look forward to learning more about the approach and seeing this review method in action and as it develops.
-
Kochetkov, D. (2024, March 21). Evolution of Peer Review in Scientific Communication. https://doi.org/10.31235/osf.io/b2ra3
-
Jul 26, 2024
-
Nov 20, 2024
-
Nov 20, 2024
-
Authors:
- Dmitry Kochetkov (Leiden University ) d.kochetkov@cwts.leidenuniv.nl
-
5
-
10.31235/osf.io/b2ra3
-
Evolution of Peer Review in Scientific Communication
-
-
osf.io osf.io
-
In this article the authors use a discrete choice experiment to study how health and medical researchers decide where to publish their research, showing the importance of impact factors in these decisions. The article has been reviewed by two reviewers. The reviewers consider the work to be robust, interesting, and clearly written. The reviewers have some suggestions for improvements. One suggestion is to emphasize more strongly that the study focuses on the health and medical sciences and to reflect on the extent to which the results may generalize to other fields. Another suggestion is to strengthen the embedding of the article in the literature. Reviewer 2 also suggests extending the discussion of the sample selection and addressing in more detail the question of why impact factors still persist.
Competing interest: Ludo Waltman is Editor-in-Chief of MetaROR, working with Adrian Barnett, a co-author of the article and a member of the editorial team of MetaROR.
-
In "Researchers Are Willing to Trade Their Results for Journal Prestige: Results from a Discrete Choice Experiment", the authors investigate researchers’ publication preferences using a discrete choice experiment in a cross-sectional survey of international health and medical researchers. The study investigates publishing decisions in relation to negotiation of trade-offs amongst various factors like journal impact factor, review helpfulness, formatting requirements, and usefulness for promotion in their decisions on where to publish. The research is timely; as the authors point out, reform of research assessment is currently a very active topic. The design and methods of the study are suitable and robust. The use of focus groups and interviews in developing the attributes for study shows care in the design. The survey instrument itself is generally very well-designed, with important tests of survey fatigue, understanding (dominant choice task) and respondent choice consistency (repeat choice task) included. Respondent performance was good or excellent across all these checks. Analysis methods (pMMNL and latent class analysis) are well-suited to the task. Pre-registration and sharing of data and code show commitment to transparency. Limitations are generally well-described.
In the below, I give suggestions for clarification/improvement. Except for some clarifications on limitations and one narrower point (reporting of qualitative data analysis methods), my suggestions are only that – the preprint could otherwise stand, as is, as a very robust and interesting piece of scientific work.
- Respondents come from a broad range of countries (63), with 47 of those countries represented by fewer than 10 respondents. Institutional cultures of evaluation can differ greatly across nations, and we can expect variability in exposure to the messages of DORA (seen, for example, in the level of permeation of DORA as measured by signatories in each country, https://sfdora.org/signers/). In addition, some contexts may mandate or incentivise publication in some venues (using measures including IF, but also requiring journals to be in certain databases like WoS or Scopus, or having preferred journal lists). I would suggest the authors include in the Sampling section a rationale for taking this international approach, including any potentially confounding factors it may introduce, and then add the latter also in the limitations.
- Reporting of qualitative results: In the introduction and methods, the role of the focus groups and interviews seems to have been just to inform the design of the experiment. But results from that qualitative work then appear as direct quotes within the discussion to contextualise or explain results. In this sense, the qualitative results are being used as new data. Given this, I feel that the methods section should include a description of the methods and tools used for qualitative data analysis (currently it does not). But in addition, to my understanding (and this may be a question of disciplinary norms – I’m not a health/medicine researcher), generally new data should not be introduced in the discussion section of a research paper. Rather, the discussion is meant to interpret, analyse, and provide context for the results that have already been presented. I personally hence feel that the paper would benefit from the qualitative results being reported separately within the results section.
- Impact factors – Discussion section: While there is interesting new information on the relative trade-offs amongst other factors, the most emphasised finding, that impact factors still play a prominent role in publication venue decisions, is hardly surprising. More could perhaps be done to compare how the levels of importance reported here differ from previous results from other disciplines or over time (I know a like-for-like comparison is difficult but other studies have investigated these themes, e.g., https://doi.org/10.1177/01655515209585). In addition, beyond the question of whether impact factors are important, a more interesting question in my view is why they still persist. What are they used for and why are they still such important “driver[s] of researchers’ behaviour”? This was not the authors’ question, and they do provide some contextualisation by quoting their participants, but still I think they could do more to contextualise what is known from the literature on that to draw out the implications here. The attribute label in the methods for IF is “ranking”, but ranking according to what and for what? Not just average per-article citations in a journal over a given time frame. Rather, impact factors are used as proxy indicators of less-tangible desirable qualities – certainly prestige (as the title of this article suggests), but also quality, trust (as reported by one quoted focus group member “I would never select a journal without an impact factor as I always publish in journals that I know and can trust that are not predatory”, p.6), journal visibility, importance to the field, or improved chances of downstream citations or uptake in news media/policy/industry etc. Picking apart the interactions of these various factors in researchers’ choices to make use of IFs (which is not in all cases bogus or unjustified) could add valuable context. I’d especially recommend engaging at least briefly with more work from Science and Technology Studies - especially Müller and de Rijcke’s excellent Thinking with Indicators study (doi: 10.1093/reseval/rvx023), but also those authors’ other work, as well as work from Ulrike Felt, Alex Rushforth (esp. https://doi.org/10.1007/s11024-015-9274-5), Björn Hammarfelt and others.
- Disciplinary coverage: (1) A lot of the STS work I talk about above emphasises epistemic diversity and the ways cultures of indicator use differ across disciplinary traditions. For this reason, I think it should be pointed out in the limitations that this is research in Health/Med only, with questions on generalisability to other fields. (2) Also, although the abstract and body of the article do make clear the disciplinary focus, the title does not. Hence, I believe the title should be slightly amended (e.g., “Health and Medical Researchers Are Willing to Trade …”)
-
This manuscript reports the results of an interesting discrete choice experiment designed to probe the values and interests that inform researchers’ decisions on where to publish their work.
Although I am not an expert in the design of discrete choice experiments, the methodology is well explained and the design of the study comes across as well considered, having been developed in a staged way to identify the most appropriate pairings of journal attributes to include.
The principal findings to my mind, well described in the abstract, include the observations that (1) researchers’ strongest preference was for journal impact factor and (2) that they were prepared to remove results from their papers if that would allow publication in a higher impact factor journal. The first of these is hardly surprising – and is consistent with a wide array of literature (and ongoing activism, e.g. through DORA, CoARA). The second is much more striking – and concerning for the research community (and its funders). This is the first time I have seen evidence for such a trade-off.
Overall, the manuscript is very clearly written. I have no major issues with the methods or results. However, I think some minor revisions would enhance the clarity and utility of the paper.
First, although it is made clear in Table 1 that the researchers included in the study are all from the medical and clinical sciences, this is not apparent from the title or the abstract. I think both should be modified to reflect the nature of the sample. In my experience researchers in these fields are among those who feel most intensely the pressure to publish in high IF journals. The authors may want also to reflect in a revised manuscript how well their findings may transfer to other disciplines.
Second, in several places I felt the discussion of the results could be enriched by reference to papers in the recent literature that are missing from the bibliography. These include (1) Müller and de Rijcke’s 2017 paper on Thinking with Indicators, which discusses how the pressure of metrics impacts the conduct of research (https://doi.org/10.1093/reseval/rvx023); (2) Björn Brembs’ analysis of the reliability of research published in prestige science journals (https://www.frontiersin.org/journals/human-neuroscience/articles/10.3389/fnhum.2018.00376/full); and (3) McKiernan et al.’s examination of the use of the Journal Impact Factor in academic review, promotion, and tenure evaluations (https://pubmed.ncbi.nlm.nih.gov/31364991/).
Third, although the text and figures are nicely laid out, I would recommend using a smaller or different font for the figure legends to more easily distinguish them from body text.
-
Bohorquez, N. G., Weerasuriya, S., Brain, D., Senanayake, S., Kularatna, S., & Barnett, A. (2024, July 31). Researchers are willing to trade their results for journal prestige: results from a discrete choice experiment. https://doi.org/10.31219/osf.io/uwt3b
-
I started reading this paper with great interest, which flagged over time. As someone with extensive experience both publishing peer-reviewed research articles and working with publication data (Web of Science, Scopus, PubMed, PubMed Central), I understand there are vagaries in the data because of how and when it was collected, and when certain policies and processes were implemented. For example, as an author starting in the late 1980s, we were instructed by the journal “guide to authors” to use only initials. My early papers all used only initials. This changed in the mid-late 1990s. Another example: when working with NIH publications data, one knows dates like 1946 (how far back MedLine data go), 1996 (when PubMed was launched), 2000 (when PubMed Central was launched) and 2008 (when the NIH Open Access policy was enacted). There are also intermediate dates for changes in curation policy that underlie a transition from initials to full name in the biomedical literature.
I realize that the study covers all research disciplines, but still I am surprised that the authors of this paper don’t start with an examination of the policies underlying publications data, and only get to this at the end of a fairly torturous study.
As a reader, this reviewer felt pulled all over the place in this article and increasingly frustrated that this is a paper that explores the Dimensions database vagaries only and not really the core overall challenges of bibliometric data, irrespective of data source. Dimensions ingests data from multiple sources — so any analysis of its contents needs to examine those sources first.
A few specific comments:
- The “history of science” portion of the paper focuses on English learned societies in the 17th century. There were many other learned societies across Europe, and also “papers” (books, treatises) from long before the 17th century in Middle-eastern and Asian countries (e.g., see the history of mathematics, engineering, governance and policy, etc.). These other histories were not acknowledged by the authors. Research didn’t just spring full-formed out of Zeus’ head.
- It is unclear throughout if the authors are referring to science or research, and which disciplines are or are not included. The first chart on disciplinary coverage is Fig 13 and goes back to 1940ish. Also, which languages are included in the analysis? For example, Figure 2 says “academic output” but from which academies? What countries? What languages? Disciplines? Also, in Figure 2, this reviewer would have liked to see discussion about the variability in the noisiness of the data over time.
- The inclusion of gender in the paper misses the mark for this reviewer. When dealing with initials, how can one identify gender? And when working in times/societies where women had to hide their identity to be published… how can a name-based analysis of gender be applied? If this paper remains a study of the “initial era”, this reviewer recommends removing the gender analysis.
- Reference needed for “It is just as important to see ourselves reflected in the outputs of the research careers…” (section B).
- Reference needed for “This period marked the emergence of ‘Big Science’” (Section B). How do we know this is Big Science? What is the relationship with the nature of science careers? Here it would be useful perhaps to mention that postdocs were virtually unheard of before Sputnik.
- Fig 3. This would be more effective as a % of total papers than as an absolute number.
- Gradual Evolution of the Scholarly Record. This reviewer would like to see the proportion of papers without authors. A lot of history of science research is available for this period, and a few references here would be welcome, as well as a by-country analysis (or acknowledgement that the data are largely from Europe and/or English-speaking countries).
- Accelerated Changes in Recent Times. Again, this reviewer would like to see reference to scholarship on the history of science. One of the things happening in the post-WW2 timeframe is the increase in government spending (in the US particularly) on R&D and academic research. So, is the academy changing or is it responding to “market forces”?
- Reflective richness of data. “Evolution of the research community” is not described in the text, nor are collaborative networks.
- In the following paragraph, one could argue that evaluation was a driver of change, not a response to it. This reviewer would like to see references here.
- II. Methodology. (i) 2nd sentence missing “to”: “… and full form to refer to an author name…”. (ii) 2nd para: the authors talk about epochs, but the data could be (are) discontinuous because of (a) curation policy, (b) curation technology, (c) data sources (e.g., Medline rolled out in the 1960s and back-populated to 1946). (iii) 4th para refers to Figs 3 and 4 showing a marked change between 1940 and 1950, but Fig 3 goes back only to 1960, and Fig 4 is so compressed it is hard to see anything in that time range. (iv) Para 7: “the active publishing community is a reasonable proxy for the global research population”. We need a reference here and more analysis. Is this Europe? English language? Which disciplines? All academia? Dimensions data? (v) Para 12: “In exploring the issue of gender…” see comments above. Gender is an important consideration but is out of scope, in this reviewer’s opinion, for this paper focused on use of initials vs. full name.
- Listing 1. Is there a resolvable URL/DOI for this query?
- Figs 9-11, 14, 15. This reviewer would like to see a more fulsome examination/discussion of data discontinuities, particularly around ~1985-2000.
Discussion
- The country-level discussion suggests the data (publications included) are only those that have been translated into English. Please clarify. Also, please add references in this section. There are a lot of bold statements, such as “A characteristic of these countries was the establishment of strong national academies.” Is this different from other places in the world? How? In the para before this statement, there is a phrase “picking out Slavonic stages” that is not clear to this reviewer.
- The authors seem to get ahead of themselves talking about “formal” and “informal” in relation to whether initials or full names are used. They then discuss “Power Distance” and end up arguing that it isn’t formal/informal but rather publisher policies and curation practices driving the initial era and its end.
- And then the authors come full circle on research articles being a technology, akin to a contract. Which is neat and useful. But all the intermediate data analysis is focused on the Dimensions database, and this reviewer would argue it should be part of the database documentation rather than a scholarly article.
- This reviewer would prefer this paper be focused much more tightly on how publishing technology can and has driven the sociology of science. Dig more into the E. Journal Analysis and F. Technological Analysis. Stick with what you have deep data for, and provide us readers with a practical and useful paper that maybe, just maybe, publishers will read and be incentivized to up their game with respect to adoption of “new” technologies like ORCID, DOIs for data, etc. Because these papers are not just expositions on a disciplinary discourse, they are also a window into how science (research) works and is done.
-
The presented preprint is a well-researched study on a relevant topic that could be of interest to a broad audience. The study's strengths include a well-structured and clearly presented methodology. The code and data used in the research are openly available on Figshare, in line with best practices for transparency. Furthermore, the findings are presented in a clear and organized manner, with visualizations that aid understanding.
At the same time, I would like to draw your attention to a few points that could potentially improve the work.
- I think it would be beneficial to expand the abstract to approximately 250 words.
- The introduction starts with a very broad context, but the connection between this context and the object of the research is not immediately clear. There are few references in this section, making it difficult to determine whether the authors are citing others or their own findings.
- The transition to the main topic of the study is not well-defined, and there is no description of the gap in the literature regarding the object of study. Additionally, "bibliometric archaeology" appears at the end of the introduction but is only mentioned again later in the discussion, which may cause confusion for the reader.
- It would be helpful to clearly state the purpose and objectives of the study both in the introduction and in the abstract.
- In addition, it is important to elaborate on the contribution of this study in the introduction section.
- The same applies to the background - a very broad context, but the connection with the object of the research is not entirely clear.
- Page 4 - as far as I understand, these are conclusions from a literature review, while point 3 (Reflective Richness of Data) does not follow from the previous analysis.
- The overall impression of the introduction and background is that it is an interesting text, but it is not well related to the objectives of the study. I would recommend shortening these sections by making the introduction and literature review more pragmatic and structured. At the same time, this text could be published as a standalone contribution.
- As I mentioned above, the methodology is one of the strengths of the study. However, in this section, it would be helpful to introduce and justify the structure of presenting the results.
- In the methodology section, the authors could also provide a footnote with a link to the code and dataset (currently, it is only given at the end).
- With regard to the discussion, I would like to encourage the authors to place their results more clearly in the academic context. Ideally, references from the introduction and/or literature review would reappear in this section to help clarify the research contribution.
- Although Discussion C is an interesting read, it seems more related to the introduction than the results. Again, the text itself is rather interesting, but it would benefit from a more thorough justification.
Remarks on the images:
- At least the data source for the images should be specified in the background, because it is not obvious to the reader before the methodology is described.
- The color distinction between China and Russia in Figure 8 is not very clear.
- The gray lines in Figures 9-11 make the figures difficult to read. Additionally, the meaning of these lines is not clearly indicated in the legends of Figures 10 and 11. These issues should be addressed.
All comments and suggestions are intended to improve the article. Overall, I have a very positive impression of the work.
Sincerely,
Dmitry Kochetkov
-
Overview
This manuscript provides an in-depth examination of the use of initials versus full names in academic publications over time, identifying what the authors term the "Initial Era" (1945-1980) as a period during which initials were predominantly used. The authors contextualize this within broader technological, cultural, and societal changes, leveraging a large dataset from the Dimensions database. This study contributes to the understanding of how bibliographic metadata reflects shifts in research culture.
Strengths
+ Novel concept and historical depth
The paper introduces a unique angle on the evolution of scholarly communication by focusing on the use of initials in author names. The concept of the "Initial Era" is original and well-defined, adding a historical dimension to the study of metadata that is often overlooked. The manuscript provides a compelling narrative that connects technological changes with shifts in academic culture.
+ Comprehensive dataset
The use of the Dimensions database, which includes over 144 million publications, lends significant weight to the findings. The authors effectively utilize this resource to provide both anecdotal and statistical analyses, giving the paper a broad scope. The differentiation between the anecdotal and statistical epochs helps clarify the limitations of the dataset and strengthens the authors' conclusions.
+ Cross-disciplinary relevance
The study's insights into the sociology of research, particularly the implications of name usage for gender and cultural representation, are highly relevant across multiple disciplines. The paper touches on issues of diversity, bias, and the visibility of researchers from different backgrounds, making it an important contribution to ongoing discussions about equity in academia.
+ Technological impact
The authors successfully connect the decline of the "Initial Era" to the rise of digital publishing technologies, such as Crossref, PubMed, and ORCID. This link between technological infrastructure and shifts in scholarly norms is a critical insight, showing how the adoption of new tools has real-world implications for academic practices.
Weaknesses
- Lack of clarity and readability
While the manuscript is rich in data and analysis, it can be dense and challenging to follow for readers not familiar with the technical details of bibliometric studies. The text occasionally delves into highly specific discussions that may be difficult for a broader audience to grasp, while other concepts are introduced only cursorily. Consider condensing the introduction section, removing unrelated historical accounts, and leading the audience to the key objectives of this research much earlier.
- Missing empirical case studies
The manuscript remains largely theoretical, relying heavily on data analysis without providing concrete case studies or empirical examples of how the "Initial Era" affected individual disciplines or researchers. A more detailed exploration of specific instances where the use of initials had significant consequences would make the findings more tangible. Incorporating case studies or anecdotes from the history of science that illustrate the real-world impacts of the trends identified in the data would enrich the paper. These examples could help ground the analysis in practical outcomes and demonstrate the relevance of the "Initial Era" to contemporary debates.
- Half-baked comparative analysis
Although the paper presents interesting data about different countries and disciplines, the comparative analysis between these groups could be further developed. For example, the reasons behind the differences in initial use between countries with different writing systems or academic cultures are not fully explored. A more in-depth comparative analysis that explains the cultural, linguistic, or institutional factors driving the observed differences in initial use would add nuance to the findings. This could involve a more detailed discussion of how non-Roman writing systems influence name formatting or how specific national academic policies shape author metadata.
- Limited discussion of alternative explanations
While the authors link the decline of the "Initial Era" to technological advancements, other potential explanations, such as changing editorial policies (“technological harmonisation”), shifts in academic prestige, or the influence of global collaboration, are not fully explored. The paper could benefit from a broader discussion of these factors. Expanding the discussion to include alternative explanations for the decline of initial use, and how these might interact with technological changes, would provide a more comprehensive view. Engaging with literature on academic publishing practices, editorial decisions, and global research trends could help contextualize the findings within a wider framework.
Conclusion
This manuscript offers a novel and insightful analysis of the evolution of name usage in academic publications, providing valuable contributions to the fields of bibliometrics, science studies, and research culture. With improvements in clarity, comparative analysis, and the incorporation of case studies, this paper has the potential to make a significant impact on our understanding of how metadata reflects broader societal and technological changes in academia. The authors are encouraged to refine their discussion and expand on the implications of their findings to make the manuscript more accessible and applicable to a wider audience.
-
Aug 14, 2024
-
Nov 20, 2024
-
Nov 20, 2024
-
Authors:
- Simon Porter (Digital Science) s.porter@digital-science.com
- Daniel Hook (Digital Science) d.hook@digital-science.com
-
2
-
10.48550/arXiv.2404.06500
-
The Rise and Fall of the Initial Era
-
This manuscript examines preprint review services and their role in the scholarly communications ecosystem. It seems quite thorough to me. In Table 1 they list many peer-review services that I was unaware of e.g. SciRate and Sinai Immunology Review Project.
To help elicit critical & confirmatory responses for this peer review report I am trialling Elsevier’s suggested “structured peer review” core questions, and treating this manuscript as a research article.
Introduction
-
Is the background and literature section up to date and appropriate for the topic?
Yes.
-
Are the primary (and secondary) objectives clearly stated at the end of the introduction?
No. Instead the authors have chosen to put the two research questions on page 6 in the methods section. I wonder if they ought to be moved into the introduction – the research questions are not methods in themselves. Might it be better to state the research questions first and then detail the methods used to address those questions afterwards? [As Elsevier’s structured template seems implicitly to prefer.]
Methods
-
Are the study methods (including theory/applicability/modelling) reported in sufficient detail to allow for their replicability or reproducibility?
I note with approval that the version number of the software they used (ATLAS.ti) was given.
I note with approval that the underlying data is publicly archived under CC BY at figshare.
The ATLAS.ti report data spreadsheet could do with some small improvement – the column headers are a little cryptic, e.g. “Nº ST” and “ST”, which I eventually deduced were Number of Schools of Thought and Schools of Thought (?)
Is there a rawer form of the data that could be deposited with which to evidence the work done? The ATLAS.ti report spreadsheet seemed like it was downstream output data from ATLAS.ti. What was the rawer input data entered into ATLAS.ti? Can this be archived somewhere in case researchers want to reanalyse it using other tools and methods?
I note with disapproval that Atlas.ti is proprietary software which may hinder the reproducibility of this work. Nonetheless I acknowledge that Atlas.ti usage is somewhat ‘accepted’ in social sciences despite this issue.
I think the qualitative text analysis is a little vague and/or under-described: “Using ATLAS.ti Windows (version 23.0.8.0), we carried out a qualitative analysis of text from the relevant sites, assigning codes covering what they do and why they have chosen to do it that way.” That’s not enough detail. Perhaps an example or two could be given? Was inter-rater reliability assessed when ‘assigning codes’? How do we know the ‘codes’ were assigned accurately?
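To make concrete the kind of check I have in mind (purely an illustrative sketch – the coders, codes, and segments below are invented, and this is not the authors’ data or procedure), agreement between two coders could be reported with a statistic such as Cohen’s kappa:

```python
from sklearn.metrics import cohen_kappa_score

# Invented example: codes assigned independently by two coders to the same
# ten text segments. The code labels are illustrative only.
coder_a = ["quality", "equity", "quality", "incentives", "efficiency",
           "quality", "equity", "incentives", "quality", "efficiency"]
coder_b = ["quality", "equity", "incentives", "incentives", "efficiency",
           "quality", "quality", "incentives", "quality", "efficiency"]

kappa = cohen_kappa_score(coder_a, coder_b)
print(f"Cohen's kappa: {kappa:.2f}")  # values close to 1 indicate strong agreement
```

Reporting even a simple agreement statistic like this, alongside one or two coding examples, would make it much easier to judge how reliably the codes were assigned.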
-
Are statistical analyses, controls, sampling mechanism, and statistical reporting (e.g., P-values, CIs, effect sizes) appropriate and well described?
This is a descriptive study (and that’s fine), so there aren’t really any statistics on show here other than simple ‘counts’ (of Schools of Thought) in this manuscript. There are probably some statistical processes going on within the proprietary qualitative analysis of text done in ATLAS.ti, but it is under-described and so hard for me to evaluate.
Results
-
Is the results presentation, including the number of tables and figures, appropriate to best present the study findings?
Yes. However, I think a canonical URL to each service should be given. A URL is very useful for disambiguation, to confirm e.g. that the authors mean this Hypothesis (www.hypothes.is) and NOT this Hypothesis (www.hyp.io). I know exactly which Hypothesis is the one the authors are referring to but we cannot assume all readers are experts 😊
Optional suggestion: I wonder if the authors couldn’t present the table data in a slightly more visual and/or compact way? It’s not very visually appealing in its current state. Purely as an optional suggestion, to make the table more compact one could recode the answers given in one or more of the columns 2, 3 and 4 in the table e.g. "all disciplines = ⬤ , biomedical and life sciences = ▲, social sciences = ‡ , engineering and technology = † ". I note this would give more space in the table to print the URLs for each service that both reviewers have requested.
———————————————————————————————
| Service name | Developed by | Scientific disciplines | Types of outputs |
| Episciences | Other | ⬤ | blah blah blah. |
| Faculty Opinions | Individual researcher | ▲ | blah blah blah. |
| Red Team Market | Individual researcher | ‡ | blah blah blah. |
———————————————————————————————
The "Types of outputs" column might even lend themselves to mini-colour-pictograms (?) which could be more concise and more visually appealing? A table just of text, might be scientifically 'correct' but it is incredibly dull for readers, in my opinion.
-
Are additional sub-analyses or statistical measures needed (e.g., reporting of CIs, effect sizes, sensitivity analyses)?
No / Not applicable.
Discussion
-
Is the interpretation of results and study conclusions supported by the data and the study design?
Yes.
-
Have the authors clearly emphasized the limitations of their study/theory/methods/argument?
No. Perhaps a discussion of the linguistic/comprehension bias of the authors might be appropriate for this manuscript. What if there are ‘local’ or regional Chinese, Japanese, Indonesian or Arabic language preprint review services out there? Would this authorship team really be able to find them?
Additional points:
-
Perhaps the points made in this manuscript about financial sustainability (p24) are a little too pessimistic. I get it, there is merit to this argument, but there is also some significant investment going on there if you know where to look. Perhaps it might be worth citing some recent investments e.g. Gates -> PREreview (2024) https://content.prereview.org/prereview-welcomes-funding/ and Arcadia’s $4 million USD to COAR for the Notify Project which supports a range of preprint review communities including Peer Community In, Episciences, PREreview and Harvard Library. (source: https://coar-repositories.org/news-updates/coar-welcomes-significant-funding-for-the-notify-project/ )
-
Although I note they are mentioned, I think more needs to be written about the similarity and overlap between ‘overlay journals’ and preprint review services. Are these arguably not just two different terms for kinda the same thing? If you have Peer Community In, which has its overlay component in the form of the Peer Community Journal, why not mention other overlay journals like Discrete Analysis and The Open Journal of Astrophysics? I think Peer Community In (and its PCJ) is the go-to example of the thinness of the line that separates (or doesn’t!) overlay journals and preprint review services. Some more exposition on this would be useful.
-
-
Thank you very much for the opportunity to review the preprint titled “Preprint review services: Disrupting the scholarly communication landscape?” (https://doi.org/10.31235/osf.io/8c6xm). The authors review services that facilitate peer review of preprints, primarily in the STEM (science, technology, engineering, and math) disciplines. They examine how these services operate and their role within the scholarly publishing ecosystem. Additionally, the authors discuss the potential benefits of these preprint peer review services, placing them in the context of tensions in the broader peer review reform movement. The discussions are organized according to four “schools of thought” in peer review reform, as outlined by Waltman et al. (2023), which provides a useful framework for analyzing the services. In terms of methodology, I believe the authors were thorough in their search for preprint review services, especially given that a systematic search might be impractical.
As I see it, the adoption of preprints and reforming peer review are key components of the move towards improving scholarly communication and open research. This article is a useful step along that journey, taking stock of current progress, with a discussion that illuminates possible paths forward. It is also well-structured and easy for me to follow. I believe it is a valuable contribution to the metaresearch literature.
On a high level, I believe the authors have made a reasonable case that preprint review services might make peer review more transparent and rewarding for all involved. Looking forward, I would like to see metaresearch which gathers further evidence that these benefits are truly being realised.
In this review, I will present some general points which merit further discussion or clarification to aid an uninitiated reader. Additionally, I raise one issue regarding how the authors framed the article and categorised preprint review services and the disciplines they serve. In my view, this problem does not fundamentally undermine the robust search, analyses, and discussion in this paper, but it risks putting off some researchers and constrains how broadly one should derive conclusions.
General comments
Some metaresearchers may be aware of preprints, but not all readers will be familiar with them. I suggest briefly defining what they are, how they work, and which types of research have benefited from preprints, similar to how “preprint review service” is clearly defined in the introduction.
Regarding Waltman et al.’s (2023) “Equity & Inclusion” school of thought, does it specifically aim for “balanced” representation by different groups as stated in this article? There is an important difference between “balanced” versus “equitable” representation, and I would like to see it addressed in this text.
Another analysis I would like to see is whether any of the 23 services reviewed present any evidence that their approach has improved research quality. For instance, the discussion on peer review efficiency and incentives states that there is currently “no hard evidence” that journals want to utilise reviews by Rapid Reviews: COVID-19, and that “not all journals are receptive” to partnerships. Are journals skeptical of whether preprint review services could improve research quality? Or might another dynamic be at work?
The authors cite Nguyen et al. (2015) and Okuzaki et al. (2019), stating that peer review is often “overloaded”. I would like to see a clearer explanation of what “overloaded” means in this context so that a reader does not have to read the two cited papers.
To the best of my understanding, one of the major sticking points in peer review reform is whether to anonymise reviewers and/or authors. Consequently, I appreciate the comprehensive discussion about this issue by the authors.
However, I am only partially convinced by the statement that double anonymity is “essentially incompatible” with preprint review. For example, there may be as-yet unexplored ways to publish anonymous preprints with (a) a notice that the preprint has been submitted to, or is undergoing, peer review; and (b) a commitment that the authors will be revealed once peer review has been performed (e.g. at least one review has been published). This would avoid the issue of publishing only after review is concluded, as is the case for Hypothesis and Peer Community In.
Additionally, the authors describe 13 services which aim to “balance transparency and protect reviewers’ interests”. This is a laudable goal, but I am concerned that framing this as a “balance” implies a binary choice, and that to have more of one, we must lose an equal amount of the other. Thinking only in terms of “balance” prevents creative, win-win solutions. Could a case be made for non-anonymity to be complemented by a reputation system for authors and reviewers? For example, major misconduct (e.g. retribution against a critical review) would be recorded in that system and dissuade bad actors. Something similar can already be seen in the reviewer evaluation system of CrowdPeer, which could plausibly be extended or modified to highlight misconduct.
I also note that misconduct and abusive behaviour already occur even in fully or partially anonymised peer review, and they are not limited to the review of preprints. While I am not aware of existing literature on this topic, academics’ fears seem reasonable. For example, there are at least anecdotal testimonies that a reviewer deliberately rejected a paper to slow the progress of a rival research group while taking the paper’s ideas and beating their competitors to a grant. Or, a junior researcher might refrain from giving a negative review out of fear that the senior researcher whose work they are reviewing might retaliate. These fears, real or not, seem to play a part in the debates about if and how peer review should (or should not) be anonymised. I would like to see an exploration of whether de-anonymisation will improve or worsen this behaviour, and in what contexts. And if such studies exist, it would be good to discuss them in this paper.
I found it interesting that almost all preprint review services claim to complement, rather than compete with, traditional journal-based peer review. The methodology described in this article cannot definitively explain what is going on, but I suspect there may be a connection between this aversion to competing with traditional journals, and (a) the skepticism of journals towards partnering with preprint review services and (b) the dearth of publisher-run options. I hypothesise that there is a power dynamic at play, where traditional publishers have a vested interest in maintaining the power they hold over scholarly communication, and that preprint review services stress their complementarity (instead of competitiveness) as a survival mechanism. This may be an avenue for further metaresearch.
To understand from which fields of research preprints are actually present on the services categorised under “all disciplines,” I used the Random Integer Set Generator by the Random.org true random number service (https://www.random.org/integer-sets/) to select five services for closer examination: Hypothesis, Peeriodicals, PubPeer, Qeios, and Researchers One. Of those, I observed that Hypothesis is an open source web annotation service that allows commenting on and discussion of any web page on the Internet, regardless of whether it is research or preprints. Hypothesis has a sub-project named TRiP (Transparent Review in Preprints), which is their preprint review service in collaboration with Cold Spring Harbor Laboratory. It is unclear to me why the authors listed Hypothesis as the service name in Table 1 (and elsewhere) instead of TRiP (or other similar sub-projects). In addition, Hypothesis seems to be framed as a generic web annotation service that is used by some as a preprint review tool. This seems fundamentally different from services explicitly set up for preprint review. This difference seems noteworthy to me.
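For transparency, the random selection described at the start of this paragraph could also be reproduced locally; the sketch below is only illustrative (I used Random.org for the actual draw), and the placeholder names stand in for the twelve services that Table 1 labels as “all disciplines”:

```python
import random

# Placeholder names standing in for the twelve "all disciplines" services
# listed in Table 1 of the manuscript.
all_disciplines_services = [f"service_{i}" for i in range(1, 13)]

rng = random.Random(2024)  # fixed seed so the draw can be repeated (my choice)
sample = rng.sample(all_disciplines_services, k=5)
print(sample)
```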
To aid readers, I also suggest including hyperlinks to the 23 services reviewed in this paper. My comments on disciplinary representation in these services are elaborated further below.
One minor point of curiosity is that several services use an “automated tool” to select reviewers. It would be helpful to describe in this paper exactly what those tools are and how they work, or report situations where services do not explain it.
Lastly, what did the authors mean by “software heritage” in section 6? Are they referring to the organisation named Software Heritage (https://www.softwareheritage.org/) or something else? It is not clear to me how preprint reviews would be deposited in this context.
Respecting disciplinary and epistemic diversity
In the abstract and elsewhere in the article, the authors acknowledge that preprints are gaining momentum “in some fields” as a way to share “scientific” findings. After reading this article, I agree that preprint review services may disrupt publishing for research communities where preprints are in the process of being adopted or already normalised. However, I am less convinced that such disruption is occurring, or could occur, for scholarly publishing more generally.
I am particularly concerned about the casual conflation of “research” and “scientific research” in this article. Right from the start, it mentions how preprints allow sharing “new scientific findings” in the abstract, stating they “make scientific work available rapidly.” It also notes that preprints enable “scientific work to be accessed in a timely way not only by scientists, but also…” This framing implies that all “scholarly communication,” as mentioned in the title, is synonymous with “scientific communication.” Such language excludes researchers who do not typically identify their work as “scientific” research. Another example of this conflation appears in the caption for Figure 1, which outlines potential benefits of preprint review services. Here, “users” are defined as “scientists, policymakers, journalists, and citizens in general.” But what about researchers and scholars who do not see themselves as “scientists”?
Similarly, the authors describe the 23 preprint review services using six categories, one of which is “scientific discipline”. One of those disciplines is called “humanities” in the text, and Table 1 lists it as a discipline for Science Open Reviewed. Do the authors consider “humanities” to be a “scientific” discipline? If so, I think that needs to be justified with very strong evidence.
Additionally, Waltman et al.’s four schools of thought for peer review reform work well with the 23 services analysed. However, at least three out of the four are explicitly described as improving “scientific” research.
Related to the above is how the five “scientific disciplines” are described as the “usual organisation” of the scholarly communication landscape. On what basis should they be considered “usual”? In this formulation, research in literature, history, music, philosophy, and many other subjects would all be lumped together into the “humanities”, which sit at the same hierarchical level as “biomedical and life sciences”, arguably a much more specific discipline. My point is not to argue for a specific organisation of research disciplines, but to highlight a key epistemic assumption underlying the whole paper that comes across as very STEM-centric (science, technology, engineering, and math).
How might this part of the methodology affect the categories presented in Table 1? “Biomedical and life sciences” appear to be overrepresented compared to other “disciplines”. I’d like to see a discussion that examines this pattern, and considers why preprint review services (or maybe even preprints more generally) appear to cover mostly the biomedical or physical sciences.
In addition, there are 12 services described as serving “all disciplines”. I believe this paper can be improved by at least a qualitative assessment of the diversity of disciplines actually represented on those services. Because it is reported that many of these services stress improving the “reproducibility” of research, I suspect most of them serve disciplines which rely on experimental science.
I randomly selected five services for closer examination, as mentioned above. Of those, only Qeios has demonstrated an attempt to at least split “arts and humanities” into subfields. The others either don’t have such categories at all, or have a clear focus on a few disciplines (e.g. life sciences for Hypothesis/TRiP). In all cases I studied, there is a heavy focus on STEM subjects, especially biology or medical research. However, they are all categorised by the authors as serving “all disciplines”.
If preprint review services originate from, or mostly serve, a narrow range of STEM disciplines (especially experiment-based ones), it would be worth examining why that is the case, and whether preprints and reviews of them could (or could not) serve other disciplines and epistemologies.
It is postulated that preprint review services might “disrupt the scholarly communication landscape in a more radical way”. Considering the problematic language I observed, what about fields of research where peer-reviewed journal publications are not the primary form of communication? Would preprint review services disrupt their scholarly communications?
To be clear, my concern is not just the conflation of language in a linguistic sense but rather inequitable epistemic power. I worry that this conflation would (a) exclude, minoritise, and alienate researchers of diverse disciplines from engaging with metaresearch; and (b) blind us from a clear pattern in these 23 services, that is their strong focus on the life sciences and medical research and a discussion of why that might be the case. Critically, what message are we sending to, for example, a researcher of 18th century French poetry with the language and framing of this paper? I believe the way “disciplines” are currently presented here poses a real risk of devaluing and minoritising certain subject areas and ways of knowing. In its current form, I believe that while this paper is a very valuable contribution, one should not derive from it any conclusions which apply to scholarly publishing as a whole.
The authors have demonstrated inclusive language elsewhere. For example, they have consciously avoided “peer” when discussing preprint review services, clearly contrasting them to “journal-based peer review”. Therefore, I respectfully suggest that similar sensitivity be adopted to avoid treating “scientific research” and “research” as the same thing. A discussion, or reference to existing works, on the disciplinary skew of preprints (and reviews of them) would also add to the intellectual rigour of this already excellent piece.
Overall, I believe this paper is a valuable reflection on the state of preprints and services which review them. Addressing the points I raised, especially the use of more inclusive language with regards to disciplinary diversity, would further elevate its usefulness in the metaresearch discourse. Thank you again for the chance to review.
Signed:
Dr Pen-Yuan Hsing (ORCID ID: 0000-0002-5394-879X)
University of Bristol, United Kingdom
Data availability
I have checked the associated dataset, but still suggest including hyperlinks to the 23 services analysed in the main text of this paper.
Competing interests
No competing interests are declared by me as reviewer.
-
Henriques, S. O., Rzayeva, N., Pinfield, S., & Waltman, L. (2023, October 13). Preprint review services: Disrupting the scholarly communication landscape? https://doi.org/10.31235/osf.io/8c6xm
-
Aug 11, 2024
-
Nov 20, 2024
-
Nov 20, 2024
-
Authors:
- Susana Henriques (Research on Research Institute (RoRI) Centre for Science and Technology Studies (CWTS), Leiden University, Leiden, the Netherlands Scientific Research Department, Azerbaijan University of Architecture and Construction, Baku, Azerbaijan) s.oliveira@cwts.leidenuniv.nl
- Narmin Rzayeva (Research on Research Institute (RoRI) Information School, University of Sheffield, Sheffield, UK) n.rzayeva@cwts.leidenuniv.nl
- Stephen Pinfield (Research on Research Institute (RoRI) Centre for Science and Technology Studies (CWTS), Leiden University, Leiden, the Netherlands) s.pinfield@sheffield.ac.uk
- Ludo Waltman waltmanlr@cwts.leidenuniv.nl
-
7
-
10.31235/osf.io/8c6xm
-
Preprint review services: Disrupting the scholarly communication landscape?
-
- Nov 2024
-
arxiv.org
-
Editorial Assessment
This article presents a large-scale data-driven analysis of the use of initials versus full first names in the author lists of scientific publications, focusing on changes over time in the use of initials. The article has been reviewed by three reviewers. The originality of the research and the large-scale data analysis are considered strengths of the article. A weakness is the clarity, readability, and focus of certain parts of the article, in particular the introduction and background sections. In addition, the reviewers point out that the discussion section can be improved and deepened. The reviewers also suggest opportunities for strengthening or extending the article. This includes adding case studies, extending the comparative analysis, and providing more in-depth analyses of changes over time in policies, technologies, and data sources. Finally, while reviewer 2 is critical about the gender analysis, reviewer 3 considers this analysis to be a strength of the article.
-
Editorial Assessment
The authors present a descriptive analysis of preprint review services. The analysis focuses on the services’ relative characteristics and differences in preprint review management. The authors conclude that such services have the potential to improve the traditional peer review process. Two metaresearchers reviewed the article. They note that the background section and literature review are current and appropriate, the methods used to search for preprint servers are generally sound and sufficiently detailed to allow for reproduction, and the discussion related to anonymizing articles and reviews during the review process is useful. The reviewers also offered suggestions for improvement. They point to terminology that could be clarified. They suggest adding URLs for each of the 23 services included in the study. Other suggestions include explaining why overlay journals were excluded, clarifying the limitation related to including only English-language platforms, archiving rawer input data to improve reproducibility, adding details related to the qualitative text analysis, discussing any existing empirical evidence about misconduct as it relates to different models of peer review, and improving field inclusiveness by avoiding conflation of “research” and “scientific research.”
The reviewers and I agree that the article is a valuable contribution to the metaresearch literature related to peer review processes.
Handling Editor: Kathryn Zeiler
Competing interest: I am co-Editor-in-Chief of MetaROR, working with Ludo Waltman, who is a co-author of the article and also co-Editor-in-Chief of MetaROR.
-