Strengthening Integrity and Fairness in Peer Review Through New Required Trainings


Effective for the May 2024 council round (peer review meetings in early 2024), all reviewers will be required to complete trainings related to review integrity and bias awareness prior to serving on NIH peer review groups (NOT-OD-23-156). These trainings build on our long-standing commitment to maintaining integrity and fairness throughout the review process. The NIH Center for Scientific Review (CSR) developed the two interactive online training modules with significant input from dedicated CSR advisory council working groups.

The “Bias Awareness and Mitigation” module (launched in 2021) is designed to raise reviewer awareness of potential sources of bias in review of grant applications and help reviewers take action to mitigate bias. Rather than addressing implicit bias generally, the training is uniquely targeted towards mitigating biases we have observed in peer review. The training is one of a number of CSR initiatives to ensure that the review process is fair and unbiased.

The “Review Integrity” module (launched in 2022) is designed to increase reviewer knowledge and awareness of review integrity throughout the NIH peer review process and provide reviewers with tools to prevent and report integrity breaches. Ultimately, the ability of peer review to identify the highest impact science depends on the integrity of the process. In addition, maintaining a peer review process that is free from inappropriate influences is important for maintaining public trust in science. The critical role that reviewers play in protecting the integrity of the process is reiterated in NIH Guide Notice NOT-OD-22-044. See also these case studies in peer review integrity.

CSR has successfully implemented both training modules for multiple review cycles. Over 20,000 reviewers have taken the bias awareness training and over 14,000 have completed the review integrity training. Survey results show that over 90 percent of reviewers reported that the training modules were effective and that they felt better prepared to take action afterwards. In particular, they noted that their:

  • Ability to identify bias in peer review was substantially increased, and they were more comfortable intervening against bias (N>3,000 survey respondents)
  • Knowledge of tools to prevent and report review integrity breaches was substantially improved, and they were more comfortable contacting NIH with concerns (N>6,000 survey respondents)

Here is what reviewers should know going forward.

Effective for the May 2024 council round (peer review meetings in early 2024), reviewers who have not already completed the trainings will receive an email invitation with the link to complete the trainings. Reviewers will not be able to access their assigned applications/proposals until the trainings are completed.

Each 30-minute training was designed to be effective while minimizing demands on reviewer time; the trainings utilize real and relevant examples, providing practical strategies. Reviewers will be required to retake the trainings every three years and content will be updated on the same schedule. Our system records the date of training completion for each reviewer; reviewers who have already completed these trainings will not need to complete them again when the requirement goes into effect in the May 2024 council round.

We appreciate your time and effort in completing these trainings. Together with your support and participation, we can maintain the integrity and fairness of the peer review process.  

Reviewers may contact the designated NIH scientific review officer for their study section with any questions.

41 Comments

  1. My sense is that CSR already suffers from a lack of established investigators serving on its panels. Adding this requirement, I suspect, will further disincline these folks to serve.

    1. As a female PI with over 30 years of experience, when you get reviewer comments like (and I’m quoting verbatim) “I’m not sure she’s qualified to be the PI nor this proposal” when the focus of the application is in your primary area of expertise, supported by dozens of pubs, I support this. I’m not particularly optimistic it will change anything, but it’s better than nothing.

    2. I agree. My life has become consumed with “training” courses, most of which are a complete waste of time. It is impossible to teach a one-size-fits-all course to students without either boring or losing most of the students. It is similarly impossible to do so in these one-size-fits-all trainings.

  2. Training on policies is one thing. Enforcement of those policies is another. Reviewers will continue to favor their friends (who are technically not in conflict with them), and CSR staff will continue to ignore how these personal relationships introduce bias into the reviews.

  3. Are these trainings publicly available? The material might be useful in other contexts (e.g., journal reviewing).

  4. Imagine this was presented as a clinical trial intervention to reduce bias and increase integrity. We would be required to have a study design (other than a one-arm, open-label design) with appropriate controls. There would be inclusion/exclusion criteria, strategies for dropouts, and sampling criteria. Then, of course, there would be measurement of outcomes at appropriate endpoints, and an analytic plan. It would have to be listed on ClinicalTrials.gov, and a data sharing plan delineated. Risks, benefits, etc. would have to be evaluated. How will NIH ever know if this (and so many similar policies) has been effective? It would never get past triage as a proposal, even with reviewers who have had the new training.

      1. A change in the number of female and BIPOC investigators could be due to a reduction in bias, or an increase in bias. A better metric would be to divide the number of such people awarded funding by the number applying.

        This still isn’t a good metric, because past affirmative action has led to a bias in the proficiency of advantaged groups. For instance, Amazon developed an AI system to choose resumes of people it judged more likely to eventually score well on internal company metrics of performance. They couldn’t stop the AI from finding ways to favor men over women, even after they removed all identifying information. The problem was that they’d already been trying for years to achieve a 50/50 balance of men and women in engineering positions, despite the small fraction of women in the applicant pool; so their historical data about their own employees of course showed that the men they hired performed, on average, better than the women they hired. So you can’t take data from an organization that’s been incorporating a deliberate bias into its choices and treat it as an unbiased sample.

        It seems to me that the obvious way to reduce such bias is to remove all identifying information about the investigator, including name, gender, ethnicity, and UNIVERSITY ATTENDED. The reason for knowing the name is to see what research they’ve done in the past, but IMHO, giving awards mainly to people who’ve gotten awards, and hence publications, in the past is the most pernicious bias in the system. I’m skeptical that there’s even a positive correlation between past publications and quality of future work; the regular progression is that investigators have less and less involvement in each project the more awards they win.

  5. This is very relevant! I commend NIH and CSR’s willingness to do this. While on study sections, I have felt that many reviewers generally assume that good science only gets done at big-name schools (on the east and west coasts).

    Personally, I’ve seen mediocre science from top institutions getting good scores and good science from lesser-known schools not getting scored. I’ve also seen reviewers openly make a mockery of applicants’ institutions. In one instance I remember vividly, a senior reviewer from an Ivy League school on the study section said something like “this PI is from Miami University of Oxford ..” and went on very clearly to mock the fact that that university is neither the University of Miami nor the University of Oxford. The SRO said nothing in response.

    1. I agree with “AA” and others’ comments. Online training modules are ineffective unless the reviewer is willing to speak up against bias during the review process. And that is never a guarantee. Why should they risk their career? NIH needs to do better.

  6. I agree with Matthew Smith. In addition, SROs will continue to ignore any conflict warnings. There needs to be a policy that if a reviewer has consistently given bad scores to a PI, and a conflict exists in other ways (such as possible rivalry), SROs will be held responsible for continuing to assign the same reviewer.

    1. I absolutely agree with this comment. There are so many reviewers who appear to be on the study section (IRG) only to “kill” grants from PIs who are not their friends or their back rubbers. Unless the SROs are made more responsible for weeding out biased reviewers and selecting more responsible reviewers who approach their work with more astuteness, this peer-review system is not going to improve.

  7. This sounds like a training I dutifully plodded through towards the end of my final (2nd) “tour of duty” as a regular member, and it evokes PTSD both about the content and about the ways in which CSR (and probably Building 1) progressively degrade and undermine the whole process, in part through some degree of willed denial. As another comment already notes, this sort of thing is not going to enhance recruitment and retention of really good reviewers and the best scientists.
    [The whole thing is another example of what is captured in a silly gag in the kids’ movie “Johnny English”; here I will paraphrase to say that the tattoo reads “Congress is coming; look busy”. In other words, the notion that study sections are really selecting the best science, or at this point and with the paylines as they are even could do so, completely lacks rigor or reproducibility. But it probably is good for maintaining Congressional support.]

    1. It appears that there is a directive from the upper management at CSR to limit the use of more seasoned reviewers. These trainings will have very limited impact on who will serve on study sections in the future.

    1. A researcher’s record of contributing to science is critical in evaluating a grant. The proposal itself should be worth 2/3 of the score at best.

      1. I concur with this sentiment. I have observed instances where individuals who excel at establishing professional connections yet consistently refrain from publishing their work receive funding and recognition. Simultaneously, those who actively publish their findings and conduct valuable scientific research often go unrecognized. To rectify this situation, I believe that greater importance should be placed on a researcher’s tangible contributions to the field of science, such as their citation count/NIH-USD, h-index, and publication record. I support assigning two-thirds of the weight to the proposal, evaluating its scientific merit, feasibility, and potential impact, and assigning the remaining one-third to the investigator’s track record and demonstrated ability to conduct successful research. This balanced approach ensures that both the quality of the proposed research and the researcher’s previous achievements are considered when making funding and promotion decisions.

      2. I don’t think so. The problem is that this is a self-perpetuating bias–the more awards you win, the likelier you are to win future awards. If you look at the distribution of how many publications different scientists have published, and model the probability of any scientist publishing another paper as a linear sum of some multiple of (A) their inherent ability, and (B) the number of publications they’ve got, and do a regression, the regression comes up with A = 0. That is, the current distribution of publications per scientist is best explained as a random self-accretionary process, having nothing to do with the ability of the scientists.

    2. I completely agree with what you said. Reviewers must get anonymized grant applications. Only then can you expect a fair review.

  8. CSR is to be commended for attempting to reduce bias and increase the integrity of the review process. Regarding the comment about anonymizing grants, a study was conducted to see if anonymization actually worked. As it turned out, a knowledgeable reviewer would recognize the work because the reviewer would be familiar with the field. The only way that anonymization would work is if reviewers without appropriate expertise were used – not a desirable course. Indeed, closer supervision of SROs to ensure that appropriate expertise is assigned to reviews is needed. Training in how to review would be beneficial as well. Too many novice reviewers seem to think their job is to find fault and focus on methodological details, as opposed to seeing the ‘big picture’ and how a proposal can add to a field. Some time ago, guidelines for review included statements that senior investigators with track records need not provide experimental details. This guideline appears not to be observed anymore. Also, reviewers need to be instructed that a grant need not be perfect to receive a score of ‘1’. There used to be a rubric for scores and criteria that was distributed for every study section session. I have not seen that rubric distributed in many years. Please bring it back.

  9. Scientific competence and demonstrated capacity for innovation, not just fairness and lack of bias, are direly needed.

    Reviewers must pass a certification of contemporary knowledge of a field to serve as ‘experts’ on CSR panels and demonstrate competence, just as MDs periodically have to retake their boards to continue to function as licensed medical practitioners.

    It is not so much about experience; it’s about the ability to constantly learn new material, which takes serious intellectual effort. Intellectual laziness must not be rewarded as it currently is; it leads to all the bias one sees. Innovation by its very nature doesn’t have a past.

    Similarly, it’s not so much how many papers one has published, but what the impact of that work was. The greater the publication count of a reviewer or even an applicant, the less likely they are to be innovative or to have the capacity to judge innovation.

    The Ingenious Program Director: Dr. Laurie Tompkins (retired)
    From NIGMS
    Who Truly Nurtured Innovation

    To quote her (via an article about her):
    “The challenge, says Tompkins, will be to train reviewers to think very differently. ‘Human nature goes for the safe thing that will give you results for sure.’ The proposals will be assessed for their likelihood of success, but reviewers are only supposed to discard the proposals that have ‘zero chance of success,’ says Tompkins. This is a major departure from the usual way grants are reviewed, which many scientists have criticized.”

    Bring back the “Dr. Tompkins” of NIH and let them guide the next generation of NIH-funded science and select the reviewers accordingly. It takes two to tango and promote creativity and innovation!

    1. This is one of the best comments I have read (Rene Anand). Dr. Tompkins said it all: scientific competence, innovation, keeping up with the field, and evidence of substantial contributions to progress in a field. Some of these attributes come only with age.

  10. Hey. I have not been asked to complete the Integrity and Fairness in Peer Review required trainings. Can you please check my status in the system and update it if necessary? Thanks a lot. John

  11. This is definitely a step in the right direction. But it will not have an effect on obtaining mature, highly qualified reviewers for the panel. It has been my experience that young reviewers are insecure sitting around a panel table and often speak to impress the other reviewers, rather than directly to the proposal under review. How does one ensure a supply of good reviewers? It should be a rule that if you have received more than one NIH grant, you are obligated to sit on a review panel if asked to do so. It is a thankless but important job, and, like the draft, service must sometimes be required.

  12. Anytime I read comments on Summary Statements, I wonder whether some of the reviewers actually read the grant. Whereas some of the comments are fair, others are grossly biased. Some reviewers are not even conversant with the relevant literature. A recent comment on my grant stated the following:
    Strength: X University is ranked 326 among all funding institutions of USA (total 2849)
    Weaknesses: X University is not a research-intensive institution.
    These comments clearly demonstrate bias. Having served on a study section for 10 years (reviewing R21, R03, and R01 grants), I found such comments unacceptable.

  13. I find myself identifying with so many points made in the comments of others (summarized below). As a less-known PI at a small institution, but with reasonable success at the NSF and a reasonably productive career that has brought a lot of new science to the field, I must wonder how the NIH will really ensure fairness. On my last R01, I found the “kill the proposal at any cost” attitude reflected in the reviews. I am waiting to see what the comments are on the R35 (an NIGMS experiment, where EIs and NIs were lumped together!). Although the R35 appeared to be a move to be “more inclusive”, I think it is only likely to reward bigger names and bigger institutions, “friends of reviewers (not in conflict)”, continue to perpetuate the “old boy” network, etc., as usual. When I started my career many years ago, a senior colleague advised that in order to have any chance of success at the NIH, I should invite as many potential reviewers from the IRG as possible to give talks – really! Now contrast that with successful applications at the NSF, where reviewers are not disclosed. Clearly, NIH needs to do some serious reevaluation of a broken peer review system that is skewed in favor of some, whether via single- or double-blind reviews (although word can still get around in these cases) or some other mechanism. As a colleague at a smaller institution mentioned, and I paraphrase, reviewers may feel that PIs at smaller institutions have smaller brains. I seriously doubt that any additional training is going to really alter anything; it is going to be wasted time for reviewers whose minds may already be made up!
    Previous comments that resonate:
    (a) Reviewers will continue to favor their friends (who are technically not in conflict with them), and CSR staff will continue to ignore how these personal relationships introduce bias into the reviews. (b) While on study sections, I have felt many reviewers generally assume that good science only gets done at big name schools (on the east and west coast). (c) Too many novice reviewers seem to think their job is to find fault and focus on methodological details, as opposed to seeing the ‘big picture’ and how a proposal can add to a field. Also, reviewers need to be instructed that a grant need not be perfect to receive a score of ‘1’. (d) If you keep the same system instead of anonymizing the grants, this is pointless and the system will still be unfair. (e) There are so many reviewers who appear to be in the study section (IRG) only to “kill” grants who are not their friends or their back rubbers. Unless the SROs are made more responsible to weed out biased and select more responsible reviewers who take their work more astuteness this peer-review system is not going to improve. (f) Personally, I’ve seen mediocre science from top institutions getting good scores and good science from lesser known schools not getting scored. (g) How will NIH ever know if this (and so many similar policies) has been effective? It would never get past triage as a proposal, even with reviewers who have had the new training. (h) My life has become consumed with “training” courses, most of which are a complete waste of time.

    1. “I think it is only likely to reward bigger names and bigger institutions, “friends of reviewers (not in conflict)”, continue to perpetuate the “old boy” network, etc.”

      The mechanism does not necessarily involve anti-anything bias. Reviewers are seldom competent enough in the subject of a given proposal. Being more or less unable to judge an application on its merits, reviewers fall back on the “reputation” factor, assuming that PIs from bigger institutions know what they are doing.

  14. From all I have read on this blog site, and what others have privately told me, the NIH peer review system is BROKEN, BROKEN, BROKEN!!! But, NIH officials will never admit this or make drastically needed reforms to it, because it reflects poorly on their leadership. They are content waiting it out and collecting FERS.

  15. This requirement is the final straw in my willingness to serve on a study section. I will decline all requests in future and will cite this new mandate. The reactions of a few professional friends suggest I won’t be alone in my decision. The study section system requires us all to volunteer large amounts of our time (the fees are minimal, and not why we serve). And now we must be lectured at to be qualified to take part? Yes, there are examples of a few bad apples acting inappropriately when on study sections, but the answer is not to treat the vast majority of honest, unbiased scientists as if we were problems that need to be fixed. Frankly, it’s all as insulting as it is surely pointless – these “training” courses are invariably an empty ritual to be clicked through as soon as possible – and we suffer enough of them at the institutional level. Enough already… The NIH needs to treat honest, hard-working professionals with respect, and accept that we already understand what integrity means and why bias is bad.

  16. Here are two other considerations that impact study section reviews (and journal reviews, too): namely, H. sapiens, and peer review itself, which on the whole promotes mediocrity via regression to the mean (good for filtering out the truly weak or flawed proposals, rather less effective at appreciating very innovative ones).
    But as many have commented elsewhere, what is the alternative to peer review? Nevertheless, in my experience, the majority of (massively administratively overburdened) study section members try their imperfect best. And SROs do not have an easy job.
    It remains the case, however, that some members are not well educated in basic scientific or logical principles, which can lead to some interesting comments. So are additional training modules needed (or something else instead)?

  17. I’m hoping this training includes a requirement for reviewers to read any updated application instructions prior to reviewing applications. That way, a reviewer isn’t using ancient instructions to unfairly denigrate an application that is actually following current requirements.

  18. I believe NIH’s objectives are laudable. Although grant reviewers are hard to find, I believe grant reviewers must not have substantial conflicts of interest, whether financial, professional, or intellectual, that would impair the scientific integrity of the review process. These reviewers must be willing to regularly complete conflict of interest disclosures. This challenge also rests with the SRO. SROs must conduct due diligence in identifying such conflicts if the grant reviewers are unwilling to present such information. For example, there is a well-documented historical tension between physicians and healthcare insurers. If a grant reviewer receives substantial funding from either a physician group or a healthcare insurer group, he/she may have a financial, professional, or intellectual conflict when reviewing projects proposed to test health-systems effects of current regulations on patient outcomes. This is not a theoretical argument. Hopefully, NIH’s grant review improvements will be passed on to AHRQ.

  19. The peer review process is, in my opinion, generally not merit-based, and this may be responsible for the majority of irregularities the PIs are complaining about. My recent experience from a review panel is solid proof of what many PIs (including myself) already suspected. Some specifics, without compromising the confidentiality of the particular application (I will be happy to share the details with NIH staff, of course): A proposal involving a generative peptide/protein design approach was assigned to a primary reviewer who specialized in clinical pharmacology – far removed from the subject of the proposal. In fact, the reviewer (a native speaker) repeatedly stumbled on the word “macrocyclic” while reading his summary! The lack of relevant expertise, or of the integrity to recuse himself, resulted in a “spillover” – looking for any and every pretext to find something against the proposal, including declaring the supporting data (good enough for a “Science” paper) inadequate. The only thing I was allowed to do about it was to submit a separate opinion to be included in the Summary Statement, but the panel was already prejudiced. This example is a particularly grotesque one, but consistent with what I have been seeing in critiques of my own applications.

    The very first things to do to improve the integrity of peer review are:
    1. SROs must take some time to find competent reviewers. This is not really hard – find a few related publications, email the authors, ask if they feel competent to serve as reviewers. Approx. 10 minutes per application.
    2. Allow panel members to challenge non-meritorious critiques and request re-review, without waiting for PI appeals (usually dismissed anyway as “differences of scientific opinion”, “amend and re-submit”, blah, blah).

    First, make the review process merit-based. That’s all.

    1. I think you are grossly underestimating the time it would take to find qualified reviewers. Many experts won’t take time away from their own work to serve as reviewers.

  20. Mike Lauer seems to be saying the training helps CSR reviewers identify and mitigate their own biases and integrity breaches, with the assumption that CSR reviewers did their flawed and biased reviews unintentionally and need some training so they know how to do a fair scientific review based on scientific merit. Really?!!! My pile of flawed and biased CSR summary statements shows CSR reviewers actually did it intentionally, even violating CSR’s current review guidelines. It is seriously doubtful that this additional CSR training would really mitigate such biases and integrity breaches or amount to a self-cleaning. Why does the applicant have no say in it? We all know the honor system is broken.

    1. Thank you for voicing your concerns. We encourage you to reach out to your SRO if you have concerns about your application’s review.

      1. SROs DON’T CARE. Don’t you get it? SROs don’t want to rock the boat. They don’t want to tell their chosen reviewers the uncomfortable truth that a review was inaccurate or biased. I have had reviews with factual inaccuracies and talked to SROs about them, but every time they only say “try again” and “respond to the reviews”. How can you respond to an obviously biased review that rests on factual inaccuracies to kill your application?

      2. I’ve had conversations with SROs leading nowhere! Sorry. They just want to shut the conversation down. The inadequacies of reviewers who do not understand a field are quite glaring in the non-instructive negative comments.

  21. Following up on my prior comments, written before looking at the MIRA reviews. A lot of what has been stated above is clearly borne out: (a) reviewers without knowledge of the field who believe that what works for A (known) should also work for Z (an entirely different field and unknown), making supposedly “knowledgeable” comments; (b) the lack of expertise leads to not appreciating difficulties in specific areas and what any advance will mean, and thus to an inability to address the impact of the work in such a context; (c) comments that appear to want to “kill” the proposal (how many applications from their networks, not in conflict, were reviewed there is an interesting question); (d) not understanding that in the 6 pages of a MIRA application, PIs are supposed to indicate the various areas of research for funding (one reviewer seemed to completely miss this point).

    In summary, no amount of anti-bias training is going to alter these outcomes. Unless the NIH is seriously interested in revisiting its proposal review system, the implicit and explicit biases will exist, although reviewers will grind through the training sessions. Networks, back rubbing, thoughts about the inabilities of small programs will all continue to factor into the review system.

    With this, I have to seriously wonder whether I ever want to review for the NIH (although I have done so on several IRGs), because the altruism of some panelists may not be reflected in the final fates of applications!

  22. I have no idea whether this training will be useful or a waste of time, but it is dearly needed. Some of the reviews I have received indicate that reviewers only approve of research like their own done by institutions like their own. In other cases they seem to be protecting the field from likely competitors. These are very human but ultimately petty motivations. And of course, there is always the obsession with methodological details that can’t possibly be provided given space limitations. I wish reviewers would ask themselves whether they can fairly review a proposal and if the answer is no, decline. It is the right thing to do.
