There has been much talk about the peer review process in our blog comments recently. We are extremely proud of our peer review system and its acceptance as one of the premier systems of peer review around the world. Our peer review system is based on a partnership between NIH staff and our reviewers. Each year we review about 80,000 applications with the help of tens of thousands of outside experts. I greatly appreciate those who have served and encourage everyone to participate in the process when you can. If you are interested in hearing more about how we recruit reviewers, listen to our latest podcast.
Both reviewers and NIH staff follow the policies and procedures that are in place to ensure that every application receives a fair, equitable, timely and unbiased review. We have reviewer conflict of interest policies that prevent an individual from reviewing, for example, a friend’s grant, a collaborator’s, or one from another investigator at their own institution. And while we assign at least three reviewers per application, the entire study section can view the application, and everyone participates in the discussion and scores the application (excluding those who are in conflict). We also have an appeals process in place that allows you to contest the review outcome if you feel that the process used to review your application was flawed.
Our peer review process is rigorous and time-tested, but no process involving humans will ever be perfect. As you know, we went through many changes in the process over the past couple of years, based on an evaluation done by the Advisory Committee to the Director, and we continue to tweak the system. We are committed to continuous review of our peer review system because we know the system should evolve as the science evolves, and there are always ways to make it better. The last thing we want to do is to wait another 20 years to examine peer review again, so we are collecting data and information that will give us adequate metrics and baselines to inform us on how peer review is working and how it could possibly be re-engineered for the better.
However good and fair the peer review process is, the reality is that we receive far more high-quality applications than we could ever support. It is vital that we at NIH continue our efforts to look for the best ways to approach funding that will sustain scientific progress and provide the most benefit to the health of the nation.
Comment removed at the request of the submitter 7/21/11 — Rock Talk Blog Team
I would be fascinated to know why micromike feels he or she has a representative sample and on what basis the assessment of “bias” and “failure” has been made.
Comment removed at the request of the submitter 7/21/11 — Rock Talk Blog Team
Comment removed at the request of the submitter 7/21/11 — Rock Talk Blog Team
It is very difficult to become an NIH-funded researcher if you spend all your time writing lengthy destructive comments in every single blog.
I do not understand your problem with NIH yet, although I made an effort to finish that fable that you wrote in this journal of negative results. What I understood very clearly is the xenophobic nature of that piece, and it is embarrassing that anybody could have published that.
End of the conversation. I have to work on a grant.
Thanks for the discussion but let’s keep the comments on topic please.
You missed my third blog at Scientopia, btw.
My point remains. Lots of people assert with great confidence that the system is broken, biased, etc., on the basis of their own personal failure to get funded, frequently without ever having served a term on a study section or ever so much as having ad hoc’d. While there are many problems, even the ones you identify, that crop up here and there, this is not necessarily evidence of a systematic problem across all of the CSR and in-house sections.
I would suggest that if you really think there are structural problems, you try to identify them. In my case, for example, I think one thing that induces a systematic bias is the failure to include newly minted investigators who are not yet funded. So instead of whinging about the “old boys club” as an unspecified complaint, I try to identify this specific issue and to show where it flies in the face of all the other diversity mandates of study section participation.
One thing you could do on this particular blog is ask for specific data analyses; who knows, maybe they will do them for you and blog about it.
I’ve mentioned this before in a previous discussion: are study sections being evaluated individually, rather than peer review being evaluated across the board? For example, it would not surprise me if some study sections showed more gender bias than others. Or perhaps a bias toward more established investigators rather than junior people. Or maybe some grant mechanisms are more successful (such as my favorite, the R21) in some study sections vs. others. These data must be available to you, so any consistent patterns that arise can be addressed on a smaller scale. I would appreciate more transparency with respect to the behavior of specific study sections.
Comment removed at the request of the submitter 7/21/11 — Rock Talk Blog Team
We continue to look at many aspects of peer review and will update the community as publishable data become available.
I have served on several study sections. The system isn’t “broken” in my opinion. But when there isn’t enough money to go around, it isn’t pleasant. I do wish the SRO had more ability to jump in and help redirect the conversations at times. The new scaled-down grant format is a plus in my mind, but not when people are wedded to scoring them the way they scored 25-page grants. It takes time for everyone to get with the new program.
This type of information doesn’t help when you aren’t being funded. It is painful to watch your career diminish, know that the amount of good work you can do is now limited by funding availability, and spend so much time struggling for funding that you can’t actually do any science.
But if you really want to know how things are done, and you really want to change the flavor in study sections, serve when called. Volunteer if you aren’t called. Get in the trenches and show people how it should be done.
These are excellent points.
I also think that there should be a mechanism to encourage people, particularly the most experienced researchers, to serve on study sections.
Forcing grantees to serve for a minimum period would not work because nobody wants an unwilling reviewer, but there could be other options. For instance, reviewers with a certain amount of service time could become eligible to have their awards funded for 1-2 more years. Something that would make service attractive for everybody, beyond the usual arguments about learning, networking, etc.
We do have policy that recognizes substantial review service and allows these reviewers to submit their own research grant applications on a continuous basis. In other words, you can submit your application when it is convenient for you, and it will be reviewed no later than 120 days after receipt.
That policy is good in theory, but as a frequent reviewer, I can attest that it is not uniformly followed. I have been told several times that my service affords me 2 weeks of extra time to submit my own application, NOT “when it is convenient,” and if the review meetings don’t work out just right, it is sometimes well over 120 days from receipt before the application is reviewed. There seem to be lots of new SROs these days, and some of them seem never to have heard of this policy…
Overall, I agree that NIH peer review works. As noted above, the insufficient budget to support the maintenance (not to mention growth) of science in the US is the major issue. Of course this is outside the scope of what NIH can do.
Having said that, among colleagues there seems to be a consensus (again, this is anecdotal as opposed to metric-driven) that some institutes, in particular NCI, have some issues.
For example, multiple examples exist where a PI holds 3-5 R01 grants, but his/her productivity (for senior-author papers, not collaborations) matches that of researchers with a single grant. There are issues with the way productivity seems to be measured, especially at NCI, where expectations for highly funded labs are not high enough, leaving less funding for others.
To extend your point, multi-grant PIs often use the same publications to support more than one of the grants. In other words, the same Nature paper is used to show productivity on more than one grant.
This seems to be a flaw in the system: how can a one-grant PI’s publication list ever hope to compare to the publications from another PI with 3-5 grants?
PIs should either be able to assign a published paper to only a single grant, or should designate a “percent effort” for the paper (e.g., 50% of this paper was due to this grant).
I’d like to see the number of grants received by any one PI be limited, but until NIH adopts an across-the-board policy, it’s not going to happen to any great extent. The argument against it is that it’s a good use of funds to support successful researchers. However, it puts money not only in fewer hands, but in a narrower field of study. Program administrators are reluctant to skip well-scoring applications and then receive verbal abuse from the investigators.
These are tough times and study sections are doing the best that they can. It is not a problem with the reviewers; it is a problem with the NIH budget, as noted above.
All study sections can probably select the grants in the top 20%. However, when the funding drops to 10% or less, it is NOT possible for anybody to distinguish the grants in the 10th percentile from those in the 12th percentile. So other things creep in: filling the pipeline vs. keeping established investigators in business, gender, geographic distribution, totally novel risky proposals vs. carefully developed safe ones. In essence, subjective criteria have to be used to throw half the people out of the sinking ship. Translation: it becomes a dysfunctional system.
I have been in this business a long time and would love to get back to doing my science instead of writing grants. But in the absence of an increase in the NIH budget what can we do?
Perhaps we should consider putting the top 20% of the grants in a hat, blindfolding the head of the section, and asking him/her to pull half the names out of the hat. Then all the blame would go to chance and not to suspected bias of hardworking reviewers trying to do their jobs. Seems reasonable to me!
I like the hat idea. It is not much different from how it works now. I have been on dozens of study sections and there is a random element to how things shake out in the end. Really, there is no difference between a grant that scores in the 10th percentile and one that scores in the 15th, or even the 20th.
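For what it’s worth, the “hat” procedure is simple enough to state precisely. Here is a minimal sketch in Python (the application IDs are invented, and the 20% shortlist and fund-half parameters are just the numbers proposed above; nothing here reflects an actual NIH process):

import random

# Toy sketch of the "hat" idea: let peer review pick the top 20%,
# then fund half of that shortlist purely at random.
applications = [f"App-{i:03d}" for i in range(1, 101)]  # 100 hypothetical applications
ranked = applications  # assume the study section already ranked these best-to-worst
shortlist = ranked[:len(ranked) // 5]         # top 20%, judged fundable on merit
random.seed(0)                                # "blindfold" the section chair
funded = random.sample(shortlist, k=len(shortlist) // 2)
print(sorted(funded))                         # blame chance, not the reviewers

The appeal, as the comment says, is that everything below the merit cutoff is still screened out by review; chance only arbitrates among applications the section already judged indistinguishable.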
Unfortunately, universities have absolved themselves from supporting research, leaving it entirely to the federal agencies and other sources. They over-expanded during the doubling of the NIH budget, resulting in too much competition for the flat and lower budget.
The peer review system has changed greatly since I served a regular term on study section. One of the changes, the shorter grant format, seems to work better than the 25-page limit, in my opinion. However, I cannot understand the reason for prohibiting nearly all supplementary information after a grant is submitted. Limiting supplementary information to one additional page… OK. Prohibiting any new information from reaching the reviewers…bad. I just submitted an application in which I proposed to use existing ES cells to make a conditional knockout mouse. Two weeks after submission, I found that the mouse had already been made and was readily available. If the study section finds that my grant is weakened because the ES cells might not “go germline,” I will have no recourse (this is an A1 application). Prohibiting the study section from receiving any new scientific information except new publications places an inordinate burden on applicants for no obvious purpose (to me). When I served, we worked a lot harder, met for three days, and considered all relevant information provided by the applicant. “Protecting” the study section members from information that helps them evaluate an application and helps the applicant make their case goes well beyond reasonable, IMHO. Please explain the rationale for this rule…
Since money is tight and the NIH holds a secret about the lousy success of A1 applications, perhaps it would be better to eliminate the A1. I thought the policy would have included that an A1 would keep at least two of the same reviewers; that not being the case, there is not a snowball’s chance that the A1 applications are doing well.
While no system is perfect, we are all writing more grants these days, trying everything to survive in a very difficult period that is likely to go on for some time. The NIH should publish the lousy data on funding of A1 applications and eliminate them, as they are wasting institutional resources and the precious time of investigators. Perhaps a new policy would then be a bit more lenient about the percentage of a brand-new application that must be different.
One of the ideas that should be closely considered is to give up the peer review procedure altogether and switch to a procedure based on professional reviewing. The idea is to hire interested and trained PhDs as full-time reviewers in different research fields. Here are some of the advantages of such a procedure:
1. Avoids conflicts of interest, since applicants and reviewers would no longer compete directly for the same pool of federal money.
2. Opens a class of new jobs for qualified PhDs and MDs.
3. Saves a great deal of working scientists’ time, allowing them to focus on their NIH-funded projects.
4. Saves a considerable amount of the NIH money spent on travel and accommodation for the members of study sections.
5. Allows full anonymity to be implemented.
I submitted a Neurotechnology SBIR a few years ago and was frankly a little surprised and disappointed by some reviews, which seemed a little cursory and disjointed. I understand time and budget issues may have been a factor, but when dealing with novel therapeutics and industries, you would think there would be more accountability at the level of the reviewer. In regard to the technology, time has proven me right; but now I am flattered with several imitators.
The NIH’s mission is first and foremost to promote health (not some vague notion of promoting the ‘best in science’) and a hefty amount of tax-payer dollars is allotted accordingly. My opinion as a medical doctor, inventor and entrepreneur is that an open peer review process would be the fairest system, all things considered, to protect the fiduciary aspects of this unique relationship between innovators, the funders and the public.
I’ll sign my name to that.
From my experience on NSF panels, NIH comes across as highly entrenched. Study panels weight the traditional approach over innovation, which in the long run promotes a lot of duplication and incremental research. I’d like to see more NSF reviewers at NIH and vice versa.
Jack
As a former NIH SBIR study section member who submitted an innovative though disruptive technology last August, I was surprised at the comments of an NIH review committee last December. This simple and inexpensive medical device has the potential to eliminate the need for incontinence surgery in many cases, yet it was internally routed to an NIH group consisting of a 6:1 majority of surgeons whose income would decrease if this device came onto the market. Naturally, the surgeon reviewers brought up issues proved patently false in the SBIR Phase I application, gave it high numerical scores, and made sure this was not funded. What redress do we have to ensure that reviewers do not vote with their own incomes in mind?
re: “What redress do we have to ensure that reviewers do not vote with their own incomes in mind?”
see http://grants.nih.gov/grants/peer_review_process.htm#Appeals
I strongly agree: people who are funded should have a commitment/requirement to review. It should be part of the duty involved in getting federal funds. It is the least we can do, and it brings more senior people to the table. I once had a grant reviewed by a study section on which a recent grad (who had received a doctoral degree literally months before) with no history of NIH funding (I checked!) was sitting! I couldn’t believe it! How did this person get there?! And it wasn’t a well-seasoned, research-experienced master’s-degree type; it was a very young newbie with hardly even a publication record to speak of. I only hoped they didn’t get my R01 application to review. Yeesh.
I do acknowledge the downside, though, of having an unmotivated, overtaxed senior person review. I once had a review where one of the three reviewers never actually submitted their scores. My summary sheets showed some brief comments but no scores in any category from that person. The SRO apologized, saying they couldn’t get the scores from the reviewer after several email attempts. Then, about 6 weeks after I got the summary sheets, the SRO FINALLY emailed me the scores that the reviewer had FINALLY sent to him. The whole incident made me wonder if that reviewer really put in an effort, which was frustrating when I think of all the time I put into that application. I asked for a re-review but was declined. I don’t know what the solution to that is, but perhaps if reviewers’ funding depended on complying with NIH review policies, it would make a difference.
This statement made in “Some Thoughts on Peer Review” is misleading: “…And while we assign at least three reviewers per application, the entire study section can view the application, and everyone participates in the discussion and scores the application (excluding those who are in conflict).” The three reviewers assigned to the application decide whether it will be discussed, and in a recent round the committee decided to triage 75% of all applications (50% is customary, but this seems to have crept up recently to a larger proportion). Thus 50-75% of applications are not viewed or discussed by more than the three assigned reviewers. How can this be fair, especially when reviewers are biased or lack scientific knowledge adequate to judge a proposal?
Also the notion of an appeal is new to me – this possibility has never been mentioned to me when discussing unfair/biased review comments with NIH officials in the past. I would welcome statistics regarding how often this process results in a reversal and funding.
What happened to the idea of a professional review panel, made up of seasoned scientists who do this work as their full-time job?
The new system of peer review is substantially less deliberative, less scholarly, and more perfunctory by design. The new goal seems to be a quicker pace of review rather than finding the most deserving ideas. I have confidence, like others here, that panels still do the best that they can, and by and large peer review panels succeed in their task despite having to work around the rules and the instructions. However, like a lot of things intended as “efficiencies,” this new process of shorter applications is long on superficial numbers; look at the way staff massages the stats and glosses over the multi-grant PI stats to conclude that there are no problems worth addressing. Does anyone really think that a given member can review twice as many of these short proposals and really have time to think about them? I do not.
One key design downside, of course, is that single reviewers can veto applications with impunity by throwing down a single harsh score against two other excellent scores, effectively putting an application at the bottom of the pile. This is all done silently before the meeting but determines the fate of the application. This effective veto is exercised without defending the “fatal flaws” to the reviewer’s panel colleagues; it is done at home before the study section even convenes, and it rewards the reviewer by shortening their work day at the convened study section. Little patience is built into this model of full-tilt review (the “let’s get through this stack of grants” approach).
The new design, plus such ad hoc “innovations” as triaging 75%, has effectively eliminated a quality control that was prevalent in my tenure as a regular member of a study section: the eyeball-to-eyeball defense of your point of view. This aspect made me a better and more prepared reviewer. It was educational to the committee as a whole and built confidence in the review process. Face-to-face consideration takes time, of course, and we are all too busy writing grants to expect a committee to actually deliberate before deciding. The review-in-rank-order “innovation” based on initial scores has the effect of heaping the majority of applications into “not discussed” without a millisecond of committee attention, and clearly has the effect of massive triage, despite the literal elimination of triage from the process.
The safety measure of personal interaction is the “peer” in peer review, and this is steadily being eroded by CSR innovations like asynchronous web reviews (another efficiency that looks great on paper but less so when you pay attention to the mistakes, which never come out in the sort of post-review surveys that CSR loves to trot out). I would also count the recruitment of junior members, mentioned by an earlier commenter, as a dilution of peerness, as such members are often so disoriented that they bring little to the review besides a plus in the statistics for CSR.
Contrast this with journal peer review, particularly at the highest levels. At the best journals, relevant peers are the order of the day, and dissenting opinions must be defended and often require additional review. The process of publishing science is intrinsically iterative, and the process most often improves the science. Contrast that journal review process with the new CSR approach that seeks to detect the new application that has the telltale content of a previously unsuccessful application.
This detection process uses unknown methods and does not distinguish between the 18th-percentile application that just missed the payline, declared “Outstanding” by NIH peer review panelists, and the truly dreadful. Is this really how to find and fund the best science? I wish that I could say there is evidence that CSR is listening and acting accordingly.
As a current study section member, I share some of these concerns. We definitely are reviewing too many applications, although our section is very conscientious, and I for one (and I suspect most others on the section) impinge on our other job responsibilities to give our primary review applications fair and thorough consideration. This overwork isn’t tenable long-term. I guess the obvious solution is having more sections and more reviewers.
Regarding a single bad score dropping applications into the undiscussed pool, I agree it happens, perhaps a lot. At the same time, when we reach the cutoff percentile for discussion, we are always invited by the chair to bring up any lower-ranked applications, and those with a large 2-1 gap in scoring are sometimes highlighted specifically to ask whether discussion is needed to resolve the large difference in critique. Admittedly, fatigue and wishes to end on time play some role in these decisions. I suppose you could argue this comes back to the need for more sections and reviewers, ensuring that more applications can be thoughtfully reviewed and thoroughly discussed.
In the end, I agree with others that the greatest influence on application outcomes here is the overall NIH research budget. Being ranked in the top 10 percent of applications may bring you bragging rights but little else when the payline is at the 6th percentile. The funding drought is causing many excellent research ideas to die on the vine.
The truth is: we are in a financial crisis and personal bias is creeping in more than ever. I was shocked when a reviewer completely missed my enrollment table. You work for months, set up a fantastic team and a fantastic idea, your whole career is on the line, and the reviewer does not even bother to read your grant. He trashes your application over a correct statement that you made (it can be proven mathematically, and many senior colleagues agree) and a correct enrollment table, and also because he thinks that his disease, which is not fatal, is more important to humanity than yours. How can such incorrect judgment be allowed? How can he still be on the study section? His friend, who is working on the same topic as he is, had his grant funded by this reviewer’s study section at the same time. It has been almost a year and I still have to find the courage to apply again, knowing that nothing matters. I can’t decide what I should do: leave science or fight against the wind. With such tight funding levels, it only takes one bad reviewer to completely sink decades of studies and preparation with no reasonable justification. My solution is that the SRO should step in and stop the madness when it occurs. My other solution is to actually find reviewers who are less biased (e.g., from industry, retired, from other countries, etc.) and more prepared. There is no accountability, there is no quality control. This is the problem.
As a scientist continually funded by NIH for over 20 years, who has served as a chartered member of an NIH study section (SS) as well as in many ad hoc roles and as an SS chair, it is my opinion that an insufficient number of senior scientists are serving as regular SS reviewers. I believe part of the reason is that we are not asked. As evidence of this point, I share one (of several) experiences with internal NIH staff: I have been called and asked to recommend ‘bright’ ASSISTANT professors to serve on NIH SS, including those with no NIH (or other) funding. It makes no sense to put individuals with no grant funding history onto these panels, especially without a sufficient number of senior investigators in the field from whom they could learn. After I was awarded my first R01 and was asked to ad hoc on a committee (one that I eventually joined), the vast majority of reviewers were the key people in my field, and it was both a pleasure and a wonderful learning experience: to learn and appreciate the give and take of ideas and opinions, as well as how to make your points clear, and to have the will to ‘fight’ for a good application instead of being a passive bystander. (That said, if you do not have enough experience to understand the topic, the field, and the history, then you have no business being the reviewer.) Also, having served ad hoc recently on some NIH SS, I have seen and heard some of these junior, inexperienced individuals, so I am aware that this push for ASST profs on SS is real.
Of course we all know that when there is no $$ in the Federal purse many good applications go unfunded – this makes it all the more important that Peer Review is the best it can be – which I believe at the moment it is not.
Anon from July 22nd again (I’d post under my real name, but tenure looms):
The two best ideas I have heard for helping during these lean times are:
1. Limit the amount of support one lab head can have. Not draconian, but perhaps two R01s and inclusion on a P01, tops. It’s not clear how much money this would free up for distribution, however. Everyone knows of one person with 3 or more R01s, but the NIH says this isn’t that common.
2. The best idea in my mind, and the one probably impossible to implement, is to limit the amount of PI salary that can be drawn from an R01. Many institutions already pay 9-month salaries for their employees. However, if you work at a medical school you need one R01 to pay your salary and another to run the lab. Medical schools need to start picking up the majority of the salaries of the people they hire and stop the soft-money circus. If they can’t pay these people, they shouldn’t hire them. I know it’s a pipe dream, but imagine how much less stress there would be in general if people weren’t trying to cover their salary and were just asking for money to support the workers in their labs and the resources. Imagine all those R01s needed for PI salary freed up to go to other labs. I know there are problems with this idea, but I’d still love to see it considered.
Any thoughts on this?
I know how expensive medical tuition is – I just don’t see why they couldn’t pay their own faculty salaries and still make a profit, especially since so many med school faculty wind up working in affiliated clinics and hospitals while they’re faculty anyway.
On the contrary, the commitment required on a project should be increased. I have seen too many applications with a one-month commitment from each of the investigators, probably because they had three or four other grants and didn’t have any more time to commit. But they continue to submit more applications.
While the review system is not broken, there is a significant element of chance in who gets funded that is not acknowledged. As others have said, one can often identify the “best” applications down to about the top 20-25%. When the funding levels are at 8-10%, there is a large amount of arbitrariness in who gets to be in that top 8%. A lot depends on the three particular individuals you get: their backgrounds and interests decide everything, if the grant is good. A negative comment by any one of the three and you are basically out. Sometimes a reviewer simply does not like a certain kind of research because they are coming at it from a different background and training. So the review can be unfair without the reviewer being bad or nasty. It is just your bad luck that you got a person with that type of background as one of the three.
There are other factors, such as who is enthusiastic about speaking up and who is enthusiastically criticizing. Often, there are two excellent reviews, and just one negative comment in one review, and the scores are in line with the worst point, as if the other two excellent reviews did not exist. There are also reverse cases, where a written review is scathing but the scores are good.
The only way to minimize this ‘luck of the draw’ aspect is to have more reviewers (i.e., more datapoints or samples) that are close to the particular field. If there are 5 or 6 reviews, one can be more tolerant about one negative comment from one person. Individual biases due to unfamiliarity or a difference in background (we all have them) would be minimized.
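To put a rough number on this “luck of the draw” argument, here is a toy simulation, entirely invented for illustration: the merit value, noise level, outlier rate, and cutoff are assumptions, not actual NIH scoring data. It scores many identical applications with panels of 3 or 6 reviewers, where any one reviewer occasionally throws a harsh outlier score, and counts how often an application falls past a discussion cutoff through noise alone:

import random

# Toy model: every application has the same "true" merit of 3.0 on the
# NIH 1-9 scale (1 = best). Each reviewer scores it with some noise,
# and occasionally one reviewer throws a harsh outlier (+3 points).
def panel_mean(true_merit=3.0, n_reviewers=3, outlier_rate=0.15):
    scores = []
    for _ in range(n_reviewers):
        s = random.gauss(true_merit, 0.5)
        if random.random() < outlier_rate:  # the one bad draw
            s += 3.0
        scores.append(min(max(s, 1.0), 9.0))
    return sum(scores) / len(scores)

def sunk_fraction(n_reviewers, cutoff=4.0, trials=100_000):
    # Fraction of identical applications whose mean score crosses the
    # cutoff purely because of reviewer noise.
    return sum(panel_mean(n_reviewers=n_reviewers) > cutoff
               for _ in range(trials)) / trials

random.seed(1)
for n in (3, 6):
    print(f"{n} reviewers: {sunk_fraction(n):.1%} of identical applications sunk")

With three reviewers, a single outlier moves the mean a full point and routinely pushes the application past the cutoff; with six, the same outlier is diluted by half, so far fewer identical applications are sunk. The exact numbers are meaningless, but the direction of the effect is the commenter’s point.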
A downside, of course, is that it would be a lot more work. You would need more reviewers, and they would likely need to review more grants. This could be done if the reviewers were paid well. Some of the cost savings from reducing physical travel (online conferencing) could be put toward paying reviewers and increasing their number.
A real solution would be to fund the top 20-25% instead of the top 8%. Ways of funding more excellent applications should be explored as a topmost priority. Many ideas have been suggested, including: (1) look at expensive vs. inexpensive research of equal value, since smaller grants are fine for many types of research; (2) reduce the priority of a PI after the 3rd or 4th R01; (3) stop 80-90% salary support from grants in med schools; (4) reduce indirect costs; and so on.
From what I have seen, the more reviewers on an application, the broader the range of scores. Then the averages of all the applications overlap more, and it’s even harder to distinguish outstanding from excellent. Also, the longer an application is discussed, the worse the score: more reviewers assigned to an application results in longer discussion and a poorer score.
The real issue is not the limited number of resubmissions, but (1) that applications are no longer reviewed by the same individuals in the study group upon resubmission, and (2) that the shorter application length, and thus less detailed methods section, encourages lengthy “what if” conversations during review, because panel members can imagine all sorts of shortcomings. That tends to ding smaller (i.e., less well-known) research outfits while benefiting larger research groups.
With respect to Issue One: before, applicants could address reviewer comments with gusto, and in my experience most applicants were more than willing to thoughtfully consider reviewer comments and amend their application as necessary. Then, upon resubmission, the same reviewers, or at least two out of three, would generally re-review the application and lead the peer review. Most of the time, the reviewers would remember the previous discussions and concerns and would be able to weigh the progress of the application. At times, though not often, applications were scored more harshly on resubmission. This occurred for three reasons: the application was unresponsive, the revisions did not solve the original problem, or, more rarely, the revisions made clear additional weaknesses in the application (occurring most often among heavily amended applications that were either unscored or scored very low on the first round). Now, however, applications fall into the hands of new reviewers who cannot fully appreciate the revisions and the prior review. Try as hard as they may, they treat the application as a first submission, and regardless of the quality of the revisions, new issues attenuate enthusiasm after lengthy group discussion.
With respect to Issue Two: scientists, by nature, are doubtful; hence the convention of stating the hypothesis in the negative form. Despite the goals of the NIH to discourage pedantic review of applications in favor of ‘big picture’ thinking, the shorter page length has increased this tendency, especially for applications submitted by less experienced or less well-known researchers, in terms of whether or not the methods are appropriate for the stated aims of the proposed research. When in doubt, reviewers tend to conclude that the lack of detail is not a reflection of the shorter page limit but rather of a lack of the requisite skill and experience to achieve what is proposed. However, when the applicant is well known, the discussion around the table takes on an acquiescing tone and concedes that surely the applicants are aware of the potential methodological problem and will conduct themselves accordingly. In short, the Matthew Principle.
Moreover, the suggestion that scores reflect the room’s review and not the reviewers’ is disingenuous. Anyone who has sat on a panel knows that the range of scores is restricted because (1) panel members are instructed that any score outside a small range around the three assigned reviewers’ scores is not permissible without first stating one’s case for the out-of-range score to the room, and (2) only the assigned reviewers have read the application, so the discussion around the table reflects chiefly the comments raised by those reviewers. Thus, when a new reviewer is assigned who opens up new discussion, the room is swayed, and attention is drawn away from whether or not the revisions were in fact sufficient to address the original concerns. (For the record, I support the limit on the range of scores, but for this to work, all scorers must read the application.)
Lastly, the NIH representatives in the room must be bolder. When statements such as “Researcher Smith is at XXXXXX University, and surely they know…,” or “I know Researcher Smith and s/he has always…,” or “the applicant has done similar research in the past and I’m sure s/he knows that…,” or “So-and-so is a trained neurosurgeon at XXXX University; they’re surely bright enough to…” begin to enter the conversation, panel members should be reminded to review the application based on its merits, and not on the applicant’s employer’s reputation.
I’d still like a clear explanation as to why applications are not anonymized, at least in the beginning.
Even during the appeals process, it is hard to catch a conflict of interest that revolves around competition rather than collaboration, mostly because the reviewers may not know the specific field well enough to know who the competing labs would be. From what I’ve read, the reviewers barely have enough time to read the grants, much less research who else is in the field. I had a postdoc mentor who even I considered paranoid – he regularly complained that publications and grants were triaged or delayed because of “this guy” doing the same work (or wanting to get credit for doing it). But I’ve also had a grant reviewed (NRSA) where a particular reviewer’s lab member was cited when the reviewer suggested the hypothesis was wrong (this was the old system though).
It seems that if the initial review were related purely to the scientific criteria – innovation, a strong hypothesis, etc. – and the author and institution information were held back temporarily, the actual scientific merit of the grant could be judged without bias. If its scientific merit is strong enough, the grant should be funded, even if on a trial basis. For this, the reviewers are unblinded and can then decide, based on the rest of the criteria, whether the grant should be funded at a level that would, for example, allow a year and supplies to do one set of experiments demonstrating the ability to do the work proposed in the grant. Then let the grant update be used to determine whether the project should be funded fully, or whether progress wasn’t good enough and the grant has to be back-burnered or cancelled. It might make the accounting trickier for the NIH, and there would be a risk that some funding might be lost to poor experimenters, but that second risk is always there any time new investigators are funded. It would, however, eliminate the bias created when grantees are in direct competition with their reviewers or reviewers’ colleagues. Given the insular and increasingly specialized nature of research, I worry that this is becoming more of a problem than it used to be.
It might also allow people with new ideas or perspectives the chance to change fields, something I have personal experience with. I have been triaged a couple of times on a grant in which I propose experiments related to autism research. The comments on the science/innovation criteria are generally good and the scores strong (2s and 3s). But when they look at my publication record, they dock me points because I have no pubs in autism itself. I do have pubs on arousal behavior, and nothing I proposed was an experiment I hadn’t done in that field, so it is not out of my league technically. In other words, it’s all neuroscience, but because I hadn’t studied under an autism “elder” I was getting frozen out; meanwhile, my idea has gained traction through work done in other labs.
I didn’t write that just to post my sob story – I know funding’s insanely tight and I’m not griping just because I didn’t get any. And I’m using my very limited funds to try and get some autism-related data I’ve collected out there. I wrote that last paragraph to suggest another form of bias that might go unnoticed, and that might be helped if things were anonymized at first. Under an anonymous system the idea in the grant would have been the only thing they saw, and apparently the idea was good enough to be funded in other labs.
The problem with “anonymous reviewing” is that it can’t be truly anonymous. The more senior the author, the more obvious who it is from their contributions and from the plan, reagents available, etc. DOD is trying this for some grants, insisting on “X was observed (ref.)” rather than “we observed (ref.)”, but there are many giveaways, as I have seen when reviewing “anonymous” manuscripts. I am afraid we are stuck with known PIs and anonymous reviewers, but I for one like the idea of professional reviewers, further above. I’d happily stake my productivity against larger groups in front of dispassionate observer scientists, rather than equally vulnerable PIs!
Here’s another idea for an objective metric to include in overall evaluation: papers x impact factor* / grant $.
That would show REAL efficiency and get around the other well-known issue of bundling mentioned above (every grant from the PI is listed on the best paper, so each grant takes the credit).
Good luck to us all (except, of course, that won’t work either, with < 10-percentile paylines cutting our throats)!
*there are better IF-type measures than the classic IF, which is biased toward (e.g.) reviews being cited more often than papers.
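For illustration only, here is a minimal sketch of how such a metric might be computed. Everything in it (the PIs, papers, impact factors, and dollar amounts) is invented, and the per-$100k scaling is an arbitrary choice, not any NIH formula:

# Minimal sketch of the proposed metric: papers x impact factor / grant $.
# All names and numbers below are invented placeholders.

def productivity_score(papers, total_grant_dollars):
    # papers: list of (title, impact_factor) pairs attributed to the PI.
    # Each paper is counted exactly once across all of the PI's grants,
    # so "bundling" the same Nature paper under several grants cannot
    # inflate the score.
    weighted_output = sum(impact for _, impact in papers)
    # Express the result as weighted papers per $100k of funding.
    return weighted_output / (total_grant_dollars / 100_000)

# A hypothetical single-grant PI vs. a hypothetical multi-grant PI:
single = productivity_score([("Paper A", 8.0), ("Paper B", 4.5)],
                            total_grant_dollars=250_000)
multi = productivity_score([("Paper C", 30.0), ("Paper D", 5.0)],
                           total_grant_dollars=1_500_000)
print(f"single-grant PI: {single:.2f} weighted papers per $100k")
print(f"multi-grant PI:  {multi:.2f} weighted papers per $100k")

On these made-up numbers the single-grant PI comes out at 5.0 and the multi-grant PI at 2.3, which is exactly the kind of comparison the metric is meant to surface; swapping in a better IF-type measure, as the footnote suggests, only changes the weights.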
It is a fair degree of whitewash to say that everything is fine with NIH peer review.
What are the problems:
1) Not enough funds to go around (yes, we all know that).
2) Age bias in favor of older “seasoned” PIs.
3) Conservatism, “risk aversion” when it comes to new ideas.
4) NIH maintains the political pecking order in funding established by reviewers and PIs. The “get in line” attitude.
5) Institutional/Regional bias: “Here at Ivy U. we invented Ivy.” “They couldn’t possibly do science at Wheat U.”
6) The number of grants reviewed makes reviewers highly motivated to “kill” grants. (The assumption is that NIH wants this.)
Ways to fix it:
1) Change peer review completely: involve the public stakeholders. Choose reviewers randomly (really), random with respect to age group and expertise. Protect the young PIs from the old PIs, and the old from the old.
2) Do all reviews independently, i.e., in your own office, and submit scores electronically; no face-to-face SS where strong personalities can “hog-tie” the committee. Why make the whole SS put scores within a small window unless you are trying to hide the truth about how widely scores can vary?
3) Eliminate the 5-R01 feast. Small grants, and more of them; I mean many more of them.
4) If you really want to eliminate bias, names and institutions should be removed from grants and reviewers should be chosen randomly.
I have no quarrel with the overall general review process. In fact, I think it is very good. However, I recently encountered a rather disturbing policy procedure that, to a certain degree, completely undermines the whole SBIR process.
We completed an NHLBI Phase II grant and then submitted a Phase III (or IIB) proposal to complete our FDA submission and take our device further toward market.
The NHLBI mandates that we communicate with the FDA and then propose the tasks that the FDA states will be needed for obtaining FDA clearance.
We did this extensively and formatted our proposal completely with that mandate in mind. It was nicely reviewed but was not funded, pending our response to several technical questions. We rewrote the proposal, addressed the review comments, and resubmitted. The proposal was then, for some reason, sent to a different panel than the one we had requested. This panel completely rejected our proposal. From the review comments it was obvious that the panel thought we should do things completely opposite to what the FDA suggested; namely, the FDA said we needed to do additional testing on humans using a functional prototype of our device, while the review panel said we needed to do animal testing and/or take the data from the literature. This was only one of the many critical comments that led us to believe the panel was not aware of the FDA mandate. Upon inquiry to our NIH monitor, we were told that the panel had not been instructed as to the mandates and that our proposal was essentially reviewed in the same stack as all of the Phase I and Phase II proposals, even though the basic format and direction may be entirely different.
We then filed an appeal in which we detailed all of our concerns. It did go to council, and we were told that even though the panel acknowledged all of our points, the NHLBI still concurred that the proposal should not be funded, and that even though our points were correct, they would not change our review score. They also stated that there were enough things wrong with our “Significance” and “Innovation” sections to warrant a rejection. This is hard to fathom, since the first review panel accepted our version of these sections, and these two sections had been reviewed a total of six times during our Phase I, Phase II, and initial Phase III proposals.
If we are told that we can propose a Phase III to complete our FDA submissions, and then this is not even considered in the review process, something sure seems to be flawed somewhere.
Please let me know if you desire additional information. I welcome a good technical review, but not this kind of review. We were even urged to place our devices in the hands of physicians and ask them to use them on patients, even though the devices had not been cleared by the FDA. This and other comments raise the question: does the review panel even know enough about regulatory affairs to review a Phase III?