NIH grants reflect research investments that we hope will lead to advancement of fundamental knowledge and/or application of that knowledge to efforts to improve health and well-being. In February, we published a blog on the publication impact of NIH-funded research. We were gratified to hear your many thoughtful comments and questions. Some of you suggested that we should focus not only on output (e.g., highly cited papers) but also on cost, or as one of you put it, “citations per dollar.” Indeed, my colleagues and I have previously taken a preliminary look at this question in the world of cardiovascular research. Today I’d like to share our exploration of citations per dollar using a sample of R01 grants across NIH’s research portfolio. What we found has an interesting policy implication for maximizing NIH’s return on investment in research.
To think about impact per dollar across the NIH research portfolio, let’s look at a sample of 37,909 NIH R01 grants that were first funded between 2000 and 2010. When thinking about citations per dollar, one important consideration is whether we are largely looking at human or non-human studies.
Table 1 shows some of the characteristics of these grants according to whether or not they included human subjects. Continuous variables are shown in the table in the format a b c, where a is the lower quartile, b is the median, and c is the upper quartile. The total award amount includes direct and indirect costs across all awarded years and is shown in 2015 constant dollars (with inflation adjustment by the BRDPI). “Prior NIH funding” refers to total NIH funding the PI received prior to this specific award.
As might be expected, grants supporting research on human subjects were more expensive and more likely to involve multiple PIs. Human studies were less likely to be renewed at least once.
Next, let’s look at publishing and citation outcomes for the same group of grants, broken out by whether or not the study involves humans. As in my prior blog, I use a “normalized citation impact,” a citation impact measure that accounts for varying citation behavior across different scientific disciplines, but now divide it by total dollars spent. We’ll use box and violin plots to show the distribution of normalized citation impact per million dollars according to whether or not the grant included human subjects.
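For readers who want to try this on their own data, here is a minimal sketch of the per-dollar measure and of the lower quartile, median, upper quartile summary used in Table 1. The function names and the six grants are hypothetical, made up purely for illustration. They are not values from our sample, and this is not the code used for the analysis.

```python
import numpy as np

def impact_per_million(citation_impact, total_award_dollars):
    """Normalized citation impact divided by total award in millions of dollars."""
    impact = np.asarray(citation_impact, dtype=float)
    millions = np.asarray(total_award_dollars, dtype=float) / 1e6
    return impact / millions

def quartiles(values):
    """Return (lower quartile, median, upper quartile)."""
    q1, med, q3 = np.percentile(values, [25, 50, 75])
    return float(q1), float(med), float(q3)

# Hypothetical grants (illustrative only):
# (includes human subjects?, normalized citation impact, total award in dollars)
grants = [
    (True, 1.2, 3_000_000),
    (True, 4.0, 5_000_000),
    (True, 0.5, 2_500_000),
    (False, 2.0, 1_600_000),
    (False, 6.0, 2_400_000),
    (False, 0.9, 1_800_000),
]

for human in (True, False):
    impact = [g[1] for g in grants if g[0] == human]
    dollars = [g[2] for g in grants if g[0] == human]
    q1, med, q3 = quartiles(impact_per_million(impact, dollars))
    label = "human subjects" if human else "no human subjects"
    print(f"{label}: {q1:.2f} {med:.2f} {q3:.2f}")
```

Because the measure is so skewed, quartiles describe a typical grant better than a mean would.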
The shaded area shows the distribution of NIH-supported papers ranging from the most highly cited (100th percentile) to least cited. Note that the Y-axis is displayed on a logarithmic scale. This is an important point: scientific productivity follows a highly skewed, “heavy-tailed” log-normal distribution, not a simple normal distribution like human height. The log-normal distribution of grant productivity is evident, though with “tails” of grants that yielded minimal productivity. The log-normal distribution also reflects that there is a small (but not very small) number of grants with extraordinarily high productivity (e.g., those that produced the equivalent of 10 or more highly cited papers). We also see that by this measure, grants that focus on human studies have, in aggregate, less normalized citation impact per dollar than other grants.
Another approach to describing the association of citation impact with budget is to produce a “production plot,” in which we examine how changes in inputs (in this case, dollars) are associated with changes in output (in this case, citation impact). Figure 2 below shows such a production plot in which both axes (total award on the X-axis and citation impact on the Y-axis) are logarithmically scaled. This kind of plot allows us to ask the question, “does a 10% increase in input (here, total grant award funding) predict a 10% increase in output (citations, normalized as described earlier)?” If there is a 1:1 relationship between the input and the output, so that a 10% increase in funding yields a 10% increase in citations, we’d expect a plot with a slope of exactly 1.
The trendlines/curves are based on loess smoothers, with shaded areas representing 95% confidence intervals. We see that the association between the logarithm of grant citation impact and the logarithm of grant total costs is nearly linear. We also see that over 95% of the projects have total costs between $1 million and $10 million over the lifetime of the grant, and in this range the association is linear with a slope of < 1 (the dotted line, by contrast, has a slope of exactly 1). A slope below 1 means diminishing returns: a 10% increase in funding predicts less than a 10% increase in citation impact. Not only is this pattern consistent with prior literature, it is illustrative of an important point: research productivity follows (to some extent) a “power law,” meaning that productivity is proportional to funding raised to some power.
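To make the slope interpretation concrete, here is a minimal sketch that estimates the slope of a production plot with a straight-line least-squares fit on log-log axes. Note the assumptions: the figure uses loess smoothers, not a straight-line fit; the data below are simulated; and the exponent of 0.6, the noise level, and the function name are hypothetical choices for illustration, not estimates from our sample.

```python
import numpy as np

def loglog_slope(total_cost, citation_impact):
    """Least-squares slope and intercept of log10(impact) on log10(cost)."""
    x = np.log10(np.asarray(total_cost, dtype=float))
    y = np.log10(np.asarray(citation_impact, dtype=float))
    slope, intercept = np.polyfit(x, y, 1)
    return float(slope), float(intercept)

# Simulated grants (illustrative only) with total costs between $1M and $10M.
rng = np.random.default_rng(0)
cost = 10 ** rng.uniform(6, 7, size=20_000)

# Assumed power law: impact is proportional to cost ** 0.6, with log-normal noise.
assumed_exponent = 0.6
noise = rng.normal(0.0, 0.3, size=cost.size)
impact = 10 ** (assumed_exponent * np.log10(cost) - 3.0 + noise)

slope, intercept = loglog_slope(cost, impact)

# On log-log axes, a 10% increase in cost multiplies predicted impact by 1.10 ** slope.
predicted_gain = 1.10 ** slope - 1.0
print(f"estimated slope: {slope:.2f}")
print(f"a 10% funding increase predicts a {100 * predicted_gain:.1f}% impact increase")
```

With a slope of exactly 1, the predicted gain would be 10%. Any slope below 1 gives a smaller gain, which is the diminishing-returns pattern described above.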
There are important policy implications of the power law as it applies to research. In cases in which power laws apply, extreme observations are not as infrequent as one might think. In other words, extreme events may be uncommon, but they are not rare. Extreme events in biomedical science certainly happen: from the discoveries of evolution and the genetic code, to the development of vaccines that have saved millions (if not billions) of lives, to the findings of the transformative Women’s Health Initiative Trial, to the more recent developments in targeted treatments and immunotherapy for cancer. Because extreme events happen more often than we might think, the best way to maximize the chance of such extreme transformative discoveries is, as some thought leaders have argued, to do all we can to fund as many scientists as possible. We cannot predict where or when the next great discovery will happen, but we can predict that if we fund more scientists or more projects, we increase our ability to maximize the number of great discoveries as a function of the dollars we invest.
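A small calculation shows what “uncommon, but not rare” can look like. This sketch compares the chance of an outcome four standard deviations above the mean under a normal distribution and under a log-normal distribution (the shape described for grant productivity above). The log-normal parameters are assumed for illustration and are not fitted to our data. The helper names are hypothetical.

```python
import math

def normal_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def lognormal_tail(x, mu, sigma):
    """P(X > x) when ln(X) is normal with mean mu and standard deviation sigma."""
    return normal_tail((math.log(x) - mu) / sigma)

# Assumed log-normal parameters (illustrative only).
mu, sigma = 0.0, 1.0

# Mean and standard deviation of that log-normal distribution.
mean = math.exp(mu + sigma ** 2 / 2)
sd = math.sqrt((math.exp(sigma ** 2) - 1) * math.exp(2 * mu + sigma ** 2))

# "Extreme" here means at least 4 standard deviations above the mean.
threshold = mean + 4 * sd
p_lognormal = lognormal_tail(threshold, mu, sigma)
p_normal = normal_tail(4.0)

print(f"normal:     {p_normal:.2e}")
print(f"log-normal: {p_lognormal:.2e}")
print(f"ratio:      {p_lognormal / p_normal:.0f}")
```

Under these assumptions, the extreme outcome happens roughly 1 time in 100 for the log-normal case, versus roughly 3 times in 100,000 for the normal case. The expected number of extreme outcomes is the number of funded projects times that probability, so it grows in direct proportion to how many projects are funded.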