Addressing Rigor in Scientific Studies


Guest blog by Devon C. Crawford, Ph.D., Program Director, Office of Research Quality, NIH’s National Institute of Neurological Disorders and Stroke. Originally published on the NIH Science, Health, and Public Trust blog.

Photo: Devon C. Crawford, Ph.D., Program Director, Office of Research Quality

Science communication is rapidly evolving. The growing use of preprints and the sheer number of published studies make it increasingly difficult to determine which findings are worthy of attention. Not all scientific studies are created equal. Communicators need to discern which are reputable in order to know what to convey to their target audience. Inaccurate or untrustworthy information can have dire consequences, so it is important to understand how to assess whether studies have robust findings and how to communicate this to audiences.

Science communicators need to describe the major conclusions from a study, along with its implications for future research and public health practice, without overstating the results. Science is a continual process of updating knowledge that is conditional on how the results were obtained; it is not a series of discovered “facts.” All scientific conclusions are subject to interpretation, and all have some degree of uncertainty. Responsible science communicators will report important details of a study: the number of subjects, species involved, techniques used, major outcomes, and caveats. But even this level of reporting does not provide enough information to know how much to trust the results.

As NIH has been emphasizing for more than a decade, the rigor and transparency of a study are key for gauging the robustness of its results. This includes the design, implementation, analysis, and interpretation of experiments. If a study’s validity isn’t known, the rest is moot.

How does one know if a study is rigorous? And how can this be communicated to broad audiences? A single person can’t keep up with all of the limitations of every scientific approach, and even savvy readers of original research articles need to beware of mistaking “spin” for reasonable conclusions within a given study. Fortunately, there are some generally agreed-upon principles of rigorous research that apply across fields and methods.

Transparent publications follow established guidelines to ensure that important research practices are reported. These include the CONSORT statement for clinical trials, ARRIVE guidelines for animal studies, and PRISMA statement for systematic reviews. It is difficult to assess the rigor and robustness of studies that do not fully follow these guidelines. Yet many papers do not report these important practices.

Given the limited adherence to the lengthy lists contained in these guidelines, NIH’s National Institute of Neurological Disorders and Stroke (NINDS) held a workshop in 2012 to identify ways to enhance rigor-related transparency. The result was a list of four important research practices that should be reported for every relevant research study, from basic research to clinical trials:

  • Sample size estimation: how the sample size, such as the number of participants, was chosen before the study began, ideally via statistical extrapolation from prior studies (illustrated in the sketch after this list).
  • Blinding/Masking: how experimenters ensured that none of those involved in the study knew which samples or groups received the intervention being tested.
  • Randomization: how experimenters selected groups for treatment or data analysis so that all had the same chance of undergoing an intervention.
  • Data handling: how experimenters planned in advance to handle missing data and outliers, which data to exclude from analysis, and when to stop data collection.

Without these practices, studies are likely to be at high risk for unconscious biases that can lead to incorrect conclusions. Depending on the goals of the study, not all these items may be relevant. However, it is still important to report whether they were relevant and used for a particular study to make it clear which measures were taken to reduce bias. Other important experimental design and planning elements that should be considered include transparent reporting of outcome measures, using appropriate control groups, defining clear measures of uncertainty, and addressing study limitations such as possible confounding factors. If the study is exploratory, if the above practices are not reported or performed, or if other relevant reporting guidelines have not been followed, the results should be interpreted as preliminary or tentative, and any conclusions should be communicated accordingly.

Importantly, each study must also be put within the appropriate context of the wider scientific landscape. Are the results consistent with previous studies? Were those previous studies rigorously performed and transparently reported? Is there a chance that these results were only published because of the exciting result and not because of the rigor of the methods? Could there be similar or more rigorous studies out there that were not published or publicized because they did not get this exciting outcome? In other words, how surprising was this result? The more surprising, the more robust the evidence needs to be to support it.

Science communicators can’t be expected to assess all of these questions themselves. They should ask the study investigators and other experts in the field for help verifying important design elements of a particular study. If information is missing, ask why it hasn’t been reported. Without this information, the finding cannot be properly interpreted.

Communicating science accurately and responsibly is a balance between engaging the audience and providing enough important details. NIH’s NINDS explicitly emphasizes that rigor and transparency “are essential to enable the scientific community, as well as the community at large, to assess the value of scientific findings.” They shouldn’t be left out of communications about research. Addressing a study’s rigor, transparency, and robustness not only provides important information about reliability, but it also signals to the audience that context, and by extension the ongoing process of updating knowledge, is central to science.

2 Comments

  1. From my observations as a journal reviewer and general reader of the literature, one of the most frequent and most impactful issues in scientific rigor is misinterpretation of histology data. Often, the reader is only provided a small area from one representative sample to look at and is forced to assume that other parts of that slide, and other slides, show similar findings. In 2023, nearly all scientists have access to a way to scan slides. Although digitized slides are large files, so are genomic and transcriptomic data. Everyone is expected to include their full -omic data upon publication, and I think extending this policy to mandatory uploading of digitized histology slides would greatly benefit rigor and would also allow additional insights to be mined from those slides.

  2. Thank you for this important reminder to include the rigor criteria that are so often omitted from manuscripts. I hope that bringing attention to this topic will spur authors to pay more attention to these key details.

    One thing that I don’t think came through is that when blinding and randomization are not used by researchers, the effect size reported in a study seems to double. This was true in the original behavior study (Rosenthal and Fode, 1963) and seems to be true for each disease area studied (see most of the Macleod Lab papers). This consistency suggests that the effect is generalizable, and to that end, we should all be quite concerned because it probably happened to us!
