Looking at Reproducibility


Many of you may have seen the recent Nature Commentary by NIH Director Francis Collins and NIH Deputy Director Larry Tabak that talks about how NIH is addressing concerns about reproducibility in science. If you haven’t, I’d encourage you to take a look.

The topic of reproducibility is not new, and there are a number of NIH institutes and offices that have completed or are embarking on projects that contribute to the goal of improving rigor in the design and methods used in research. In addition, NIH has a number of programs and policies that more generally support the goals of reproducibility, for example, PubMed Commons, data sharing and public access policies, and more.

Improving the reproducibility of biomedical science is critical to NIH’s mission. Feedback from the community on these pilot projects will inform us about the approaches to adopt and implement agency-wide. So stay involved and stay tuned…there will be more to follow.


  1. It is far better to focus on the extendability or predictability of experiments than on reproducibility. Focusing on reproducibility invariably leads to conflicts over priority, especially when publications, grants, or even new concepts are held up in review and scientific debate. The aim of reproducibility is to copy, which leads to the mindset of a plagiarist. Demanding immediate reproducibility also puts quality science and truly original scientists ‘on hold’. Let time and reputation be the judge and jury for the rewards and punishments in science.

    1. I am sorry, but the comment of “Citizen-scientist” is completely off the mark. Results and data that are reproducible are the ONLY thing that counts for science, by definition. Work that cannot be reproduced after repeated and honest attempts is probably wrong. Work that is not reproduced will be forgotten and is probably not important. It is unfortunately all too common that “important” work published in “luxury journals” and performed in a rush “to beat the competition” is very often wrong and misleading. We need to constantly recheck results to test whether they are really correct. Labs that have reproducible results are the ONLY ones that are really high quality.

      1. I agree for the most part that reproducibility is important, but it doesn’t necessarily yield truths. Artefacts are often reproducible by one lab or many labs until the system/theory is fully understood. Results from a mouse model may be highly reproducible but totally inconsistent with the human disease counterpart (e.g., sepsis-inflammation). Extendability and predictability of results are thus far more important for obtaining an understanding of nature and human disease.

    2. Good for you! How does one learn what “reproducibility” is until a consensus has been established? One can never force “reproducibility”. It sounds way too much like politics rather than real science.

  2. The solution to this dire crisis of reproducibility is staring us all in the face: mandatory pre-registration of all research studies. I was tremendously disappointed that Drs. Collins and Tabak did not announce such an initiative, since evidently they have a very good understanding of the sources of the problem. Focusing on education feels like a fig leaf – the NIH doing “something” to improve reproducibility but not really doing anything. Scientists are very well aware of the damage done by cherry-picking and selectively reporting “publishable” findings; training is not the main issue.

  3. There is nothing more important than reproducibility. It needs to be built into every study. Multiple replications and systematic replications all need to be part and parcel of completing a single study before it’s published. As part of the old guard, with more than 35 years’ experience as a published and funded scientist, I have seen new scientists run single experiments and think they have found something. They don’t serve as their own best naysayer. Whenever one runs an experiment, one is obligated to prove oneself correct. This approach needs to be reinstated as part of scientific protocol. Often what I see driving science is the desire for publication and funding, not the seeking of truth. When replication is part of the process in every laboratory, many findings will not see the light of day, as they will not pass the critical test of being reproducible. The time is now. We cannot wait any longer to fulfill the obligation we have to be skeptics of our own work.

  4. On the topic of reproducibility, I would strongly suggest that NIH start by cleaning (its own) house before (properly and laudably) demanding standards from the rest of the community. In my field, and over the past 5-10 years, the worst offenders in that regard have come from high-visibility (and lavishly supported) intramural researchers. Once some Nature/Science/Cell papers get retracted due to lack of reproducibility (and that, in turn, due to poor science), then the rest of us might trust that NIH is serious about this.

  5. I’d put my data up for scrutiny any day.

    However, deciding how to address the issue of reproducibility is feeding into the mindset of legislators and administrators rather than scientists. Science tends to be self-correcting on longer timescales – otherwise, you’d have no idea that there was a problem in reproducing results. The real problem, then, is that science is not self-correcting on the timescale of grant funding and legislative budget cycles, and the fear, I presume, is that grant funds will be wasted on studies that cannot be immediately reproduced.

    It seems that NIH needs to stop the practice of cutting grants to the bone to fund more proposals – so that funded studies can be sufficiently powered.

    Or, some sort of administrative supplement or small grant program should be available to allow a PI to collaboratively establish the reproducibility of major findings (perhaps a requirement of such a supplement or grant would be that the work be carried out in a lab other than the primary PI’s).

    Frankly this overall issue is going to be a problem until leadership finds a way to make the case that science should be funded at a sustainable level. We simply can’t have such an expansive portfolio with the success rates we currently enjoy. Society is electing to reduce the power and influence of American science, and we just need to either form a protest movement (like a million scientist march) or face up to reality that the only thing we can afford to do is fold up the tents. What I would NOT do is take precious grant funding and direct it at studies of studies, which would be greatly ironic but ultimately the sort of thing plaguing academic medical centers these days, with hours lost to regulatory matters and process.

  6. I like the comment on extendability. I was interviewed for a DPhil place by Geoffrey Harris – the giant of endocrinology who showed that, contrary to the view held by Zuckerman and others, the pituitary is controlled by specific factors from the hypothalamus, and of whom Rosalyn Yalow said he would have received the Nobel with her had he not died.

    Harris asked me at the interview whether specific experiments should be independently conducted for confirmation. I thought I could second guess him and said “Of course.”

    He replied, “No. You should design each new study so that it encompasses the conclusions of the earlier study but extends it, so that the previous observation is confirmed at the same time.”

    I presume this is what Citizen Scientist meant by “Extendability.” Extendability is a step forward while pure reproducibility is just marking time.

  7. Reproducibility is a serious problem that goes beyond academic laboratories. I have been working in the pharmaceutical industry for 24 years. Most of the science published in journals is not reproducible using robust industry methods. Top and standard journals publish ‘discoveries’ that cannot be reproduced. If your ‘discovery’ cannot be reproduced on the production and manufacturing floor, you do not have a product. If you look at the statistics, small and large companies spend billions of dollars licensing technologies that do not work. Many published papers are the result of single rushed experiments with lots of speculation and unproven hypotheses. I suggest that scientists verify their findings thoroughly and then ask their colleagues to verify the findings before rushing to the technology transfer office to “disclose the invention” and its ‘publication’. Journals should not accept manuscripts unless the experiments have been verified by a second independent group, to stop all the low-quality submissions, plagiarism, copycats, and so on. This action would save millions of dollars in writing and prosecuting useless patents, would stop the publication of questionable ‘discoveries’, and would save small companies millions of dollars in licensing technologies that do not work.

    1. It’s been my observation (more than once) that very often “irreproducibility” during tech transfer has as much to do with problems in the recipient as it does in the supplier. I’ve seen unwillingness to troubleshoot scale-up of compound synthesis, unwillingness to adopt protocol differences from the supplier lab (including route and vehicle), and insistence on using techniques that the recipient is more comfortable with. The technology is then judged “unreliable” and returned. When a smaller, more invested company then licenses the same tech, it is able to reproduce initial findings easily and troubleshoot subsequent steps. That undocumented (and often unknown) details are critical to new technologies seems to be underappreciated in industry, especially in major firms where NIH (not invented here) is still an issue (though not as big as it used to be).

  8. Reeks of idealism in these comments. For those of you who aren’t hunkered in the frontline trenches of this travesty we call academic science, let me sum it up for you:

    1. You get hired as Asst Prof. Your chair demands you get R01 funding or you lose your job, without regard to how well you teach, mentor etc…

    2. If you don’t obtain a pedigree in graduate school/post-doc, you are on the outside looking in. The only way to get inside the circle controlling a study section is to publish lots of papers in a short period of time (or fly to a lot of meetings and invite them for seminars, then stroke). You do your best to be accurate, but you can’t afford the time/money to be super careful.

    3. You publish, hopefully get your R01, survive the tenure process. Then, and maybe then, you can be careful. Most don’t bother. Why? It worked up ’til now, so keep going. Get more grants. More power. More influence. Become a permanent study section member. And the beat goes on…

    What you are asking is for scientists to fall on their swords and admit when something they did was wrong, sloppy, etc… I’ve met professors who had the courage to do that. They sit in their offices all day, across the hall from their empty labs (if they haven’t been taken away) wondering when their chair is going to kick them out, and what they will do next. Meanwhile, the “scientists” who aggressively game the system reap the harvest.

    You want to change the system? Good luck. No matter how much bureaucracy you create to enforce “reproducibility”, nothing will change. As long as billions of dollars are pumped into the system, the wolves will come calling, and they will protect each other. Three labs will join forces and “reproduce” each other’s data… easy loophole to beat. Take comfort that within the sea of fools are a few bright, sparkling, shining stars who actually discover important, new things and move the imperfect human race forward. Most things in this universe are horribly inefficient… why should our system be any different?

    Limit the number of R01s a lab can have to 2. Do that, and everything else will fall into place. Don’t listen to the power mongers telling you this is a bad idea… as we learned in our ethics classes, this is what we call a “conflict of interest”. If their research is that damn important (and reproducible), the private sector will fund it. Plus, with all that extra time not spent writing grants, they could actually mentor a few people.

  9. Scientists are the only people who should be responsible for testing the reproducibility of data. There are specific ways to raise the stakes on reproducibility:

    1) NIH should include the presence of non-reproducible findings in the scoring of grants.
    2) Journal editors must be more critical about claims of generality, specificity and causality.
    3) The authors/investigators must be required to address the following issues in their papers:

    a) Do not claim a result is general if you have used only one cell type.
    b) If you claim reagent specificity say how that was measured. Obviously no one checked it on all possible components of a cell.
    c) State explicitly the assumptions that were used in the analysis. For example, if an intervention caused a change in cell shape, that is an explicit statement that there were changes in forces in the cell. If you have ignored that in your analysis, say, for example, “We have ignored any coupling between [the observed variable] and the mechanics of the cell.”
    d) Do not confuse correlation with causality unless you are also willing to claim that the roosters crowing caused the sun to come up.
    e) Quantitative data analysis, such as modeling, makes the assumptions clear.
    f) If you ever use an amphiphilic reagent, be aware that all such reagents will alter membrane structure, and if you haven’t checked for the influence of those effects, say so.

    And so on… Don’t publish a paper that doesn’t make testable predictions.

  10. I agree with ‘Someone has to say it’ – the rewards system is the root cause of this problem, and there is no way to fix it without fixing the reward system.

    The system as it is tries to support research ‘productivity’ by rewarding number and impact of publications.

    There is an enormous weight of evidence that rewards targeted directly at performance reduce inherent or ‘intrinsic’ motivation to do the task, make people concentrate on the rewarded outcome [1], and lead to worse performance when the task needs creative thought [2].

    In our case, the system of rewards tends to cause scientists to:

    1) be cynical (‘gaming the system’);
    2) concentrate on publications at the expense of accuracy and scientific progress; and
    3) choose easier questions for which they can get more reward more quickly.

    It is difficult to imagine that short training courses and encouragement toward reproducibility will be effective against these forces.

    It seems to me that there is a strong parallel between our current situation and that of the US auto industry as it declined in competition with Toyota and other Japanese carmakers [3]. The US industry had adopted a culture of reward and control that led to relative stagnation and a collapse in quality and productivity compared to Toyota, which did not have such a culture.

    We don’t have a Toyota to compete against, but it would be tragic if, like the US auto industry, we failed to see that we are now reaping the harvest sown when we adopted a culture of contingent rewards a generation or so ago.

    [1] Deci, Edward L., Richard Koestner, and Richard M. Ryan. “A meta-analytic review of experiments examining the effects of extrinsic rewards on intrinsic motivation.” Psychological bulletin 125.6 (1999): 627.
    [2] Kohn, Alfie. “Punished by rewards: The trouble with gold stars, incentive plans, A’s, praise, and other bribes”. Houghton Mifflin Harcourt, 1999.
    [3] Helper, Susan, and Rebecca Henderson. “Management Practices, Relational Contracts and the Decline of General Motors.” Harvard Business School Working Paper, No. 14-062, January 2014.
