Mapping Publications to Grants


Publications are among the most important products of NIH research grants, and authors are required to cite their NIH support in their publications. But as many of us have experienced, the format of grant numbers varies greatly. Some authors use institutional grant tracking numbers, others abbreviate the NIH number, and many other permutations arise. It almost becomes a “Where’s Waldo?” of the grant number world! All this makes it very difficult to directly link NIH grants to their publications, but the combination of public access policies and an NIH software development effort has recently eased this problem for those publications available via the NIH National Library of Medicine.

In 2001, NIH developers created the first version of a database known as SPIRES (Scientific Publication Information Retrieval and Evaluation System). In a nutshell, SPIRES maps publications to NIH grants. In practice, creating the means to do this was not simple at all. SPIRES uses automated text manipulation methods to extract and reformat grant numbers cited in publications. The reformatted numbers are then compared to NIH grant data, and the “goodness” of the match is rated by the SPIRES system. 
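
As a rough illustration of the extract, reformat, compare, and rate steps described above, here is a minimal Python sketch. It is not the SPIRES implementation, whose actual methods are not detailed in this post. The regular expression, the function names (`extract_grants`, `rate_match`), the rating labels, and the sample grant numbers are all assumptions made up for illustration. It relies only on the general shape of an NIH grant number: an activity code such as R01, a two-letter institute code such as CA, and a serial number of up to six digits.

```python
import re

# Assumed pattern: optional activity code (e.g. R01), a two-letter institute
# code (e.g. CA), and a 4-6 digit serial number, with optional spaces or
# hyphens between the parts. Real citations vary far more than this.
GRANT_RE = re.compile(
    r"(?<![A-Za-z])"                                # not in the middle of a word
    r"(?:(?P<activity>[A-Z][A-Z0-9]\d)[\s-]*)?"     # optional activity code
    r"(?P<ic>[A-Z]{2})[\s-]*"                       # institute code
    r"(?P<serial>\d{4,6})(?!\d)"                    # serial number
)


def extract_grants(text):
    """Return (activity, core) pairs, with core reformatted as IC + 6-digit serial."""
    found = []
    for m in GRANT_RE.finditer(text):
        core = m.group("ic") + m.group("serial").zfill(6)
        found.append((m.group("activity"), core))
    return found


def rate_match(candidate, known_grants):
    """Rate a candidate against known grants (a dict of core -> activity code).

    The labels are hypothetical stand-ins for a "goodness of match" score.
    """
    activity, core = candidate
    if core not in known_grants:
        return "none"
    if activity is None:
        return "partial"   # institute code and serial match, activity not cited
    if activity == known_grants[core]:
        return "exact"
    return "weak"          # serial matches but the cited activity code differs


# Hypothetical acknowledgement text and grant data.
ack = "Supported by NIH grants R01 CA123456 and GM-54321, and by 5R01HL098765-02."
known = {"CA123456": "R01", "GM054321": "R01", "HL098765": "P01"}

for cand in extract_grants(ack):
    print(cand, rate_match(cand, known))
```

Even this toy version shows why the problem is hard: a permissive pattern picks up abbreviated numbers such as `GM-54321`, but it would also match unrelated uppercase-plus-digit strings, so the comparison against known grant data has to do much of the filtering.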

A decade later, SPIRES is now a mature database that maps 30 years of publications from PubMed to NIH grants. The results are displayed in a number of internal NIH systems and to the public on the RePORT website. Because we often demonstrate outcomes of NIH support through publications arising from NIH grants, SPIRES has proven to be a critical component for accurately measuring impact. More details about the SPIRES system and an example of what can be learned from publication data are available in the paper “Metrics associated with NIH funding: a high-level view” in the Journal of the American Medical Informatics Association.


  1. PIs are motivated, quite strongly, to overreport, i.e., multiple-award labs credit every active grant on each paper whether justified or not. This is so that “productivity” looks strong on competing continuation proposals.

    I assume it is in the NIH interest as well since it boosts your stats.

    The trouble is that this is another way junior and/or smaller labs are disadvantaged on review. And, correspondingly, it is another driver of PIs seeking to hold more grants. Perhaps this database could be used to make the progress report on competing continuations an automatic report. That would make the multiple-grant attribution on papers much clearer to reviewers.

  2. In Fauci’s retrospective on 30 years of HIV yesterday, he quoted an approximate 225k publications generated from $45B of NIH HIV/AIDS research investment. This equates to about 1 publication per $200k. Can’t wait to investigate this further and compare to productivity in other domains.

    1. @Mike Smith: Assuming the average 5-year modular award is $250K per year plus, say, 50% overhead, that would be $375K per year for a total of $1,875K, which would mean an average of 9+ publications per award. In the life sciences this is not as unproductive as your comment seems to imply.

    2. There is another problem with accounting for publication costs that someone pointed out to me. There appear to be two economies operating in biomedical research: that of the rich and that of the poor. For example, there are salaries of researchers in the 300-500K range (mainly physicians and administrators, MD/MBA), and there are salaries in the 40-100K range (mainly bench scientists and teachers/professors, BS, PhD). These salaries make up a good portion of the cost of research and publications and can produce a large range of publication/research costs. Someone may argue that there are NIH limits on salaries (199K?), but there are ways in accounting to get around specific categorical limits through overhead costs, etc. One can also understand the growing pressure that the higher salaries create to keep salaries and numbers of participants low for those doing the science on a day-to-day basis, and the decline in interest in science careers among the better-qualified students. One probably needs to better justify the costs of the high-salary personnel on research projects; otherwise, pressures on publications, etc., will be dropped on the day-to-day researchers, who already get less of a share of the pie than they deserve.

  3. Agree with “drug-monkey’s” comment above. The number of grants acknowledged per publication is misleading. Given that there should be no overlap among grants, a publication with 5 grants acknowledged should be counted as 0.2 publication for each particular grant’s progress report, while a publication with only 1 grant acknowledged is counted as 1.0 publication for that grant. As it is, so many grants are acknowledged in publications that the number of publications produced from a grant is overinflated, which then gives an erroneous productivity measure. Oftentimes one sees PPGs and R01 grants acknowledged in the same publication. Overlap accountability should cover publications and not be confined to the “other resources page”. If more than one grant is rightfully acknowledged, a brief explanation should be supplied in the publication, as well as for grant reviewers, with the appropriate % that a particular grant contributed to said multi-grant publication. In this way, the playing field is leveled for smaller labs and young investigators.

    1. I do not know how common it is for PIs to cite many grants for unscrupulous reasons, but I would like to point out that the practice of citing many grants is quite valid and necessary in interdisciplinary work. This is true not just in the obvious case where the biology PI cites one grant and the computer science/statistics PI cites another but also in cases where one grant supported algorithm development and another of the PI’s grants supported software development and maintenance, a common situation in my particular field. Or perhaps in a biology laboratory, one grant supported the development of a method and the other supported its application to a particular problem.

      Forcing credit to be divided evenly among all grants cited is probably not the right answer, just as it is not appropriate to evenly divide credit for multi-author papers. To institute systems like these would likely discourage people from including their interdisciplinary collaborators as authors/grant-acknowledgers, to the detriment of future work.

      Your proposal that PIs be forced to assign proportions of credit is not unreasonable, though in practice (for both authorship and grant-acknowledgement purposes) being forced to quantify different groups’ contributions to work really creates a headache. Selecting authors and their order is a tough enough job without having to assign credit as a percentage. Assigning a % contribution from each grant would pose similar problems where the contribution from each laboratory to the final paper must be quantified. This overall problem of attributing credit is certainly a tough one; it seems there are no satisfying answers!

  4. Linking grants to publications is very important and especially helpful for reviewers of new NIH proposals. Sally, could you point us to resources to search for what papers were credited to a particular grant? I am aware of NIH RePORTER, but was told by CSR that this database only shows publications linked to currently funded grants, and that they were unaware of an efficient way to search for papers that resulted from grants that have since completed. It seems fundamentally important for an NIH proposal reviewer to have access to a PI’s track record on prior awards.

    1. RePORTER provides access to PubMed records for all publications since 1985 that we have been able to successfully link to an NIH grant (based on the acknowledgements section of each paper). This includes both currently funded grants and grants that have been completed. By default, RePORTER searches only active grants, but this can be changed by changing the fiscal year query field. Note that RePORTER often won’t provide a complete publication record for a PI, as NIH support may not always be acknowledged, or an incomplete or incorrect grant number may be cited. However, under recent NIH public access policies that require deposition of all NIH-supported research publications in PubMed Central, we expect the quality of RePORTER publication data to improve over time.
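
Two calculations from the comment thread above can be written out explicitly: the cost-per-publication estimate ($45B over 225k publications, then a 5-year modular award at $250K per year plus 50% overhead), and the proposal to count a paper acknowledging N grants as 1/N of a publication for each grant. The Python sketch below is only an illustration under those stated figures. The function names and the sample paper-to-grant data are hypothetical, and equal splitting is just one of the weighting schemes debated in the thread, not an NIH method.

```python
from collections import defaultdict


def cost_per_publication(total_investment, publications):
    """Average investment per publication, in dollars."""
    return total_investment / publications


def fractional_credit(papers):
    """Count publications per grant two ways.

    papers maps a paper id to the list of grants it acknowledges.
    Returns (whole, fractional): whole counts give every acknowledged grant
    1 per paper; fractional counts split each paper equally among its grants.
    """
    whole = defaultdict(int)
    fractional = defaultdict(float)
    for grants in papers.values():
        unique = set(grants)  # ignore a grant cited twice on one paper
        for grant in unique:
            whole[grant] += 1
            fractional[grant] += 1 / len(unique)
    return dict(whole), dict(fractional)


# Figures quoted in the comments: $45B and about 225k publications.
cost = cost_per_publication(45e9, 225_000)

# Assumed award from the reply: $250K/year direct, 50% overhead, 5 years.
award_total = 250_000 * 1.5 * 5
pubs_per_award = award_total / cost

print(f"cost per publication: ${cost:,.0f}")
print(f"award total: ${award_total:,.0f}")
print(f"implied publications per award: {pubs_per_award}")

# Hypothetical data: one paper acknowledging five grants, one acknowledging one.
papers = {
    "paper-1": ["A", "B", "C", "D", "E"],
    "paper-2": ["A"],
}
whole, fractional = fractional_credit(papers)
for grant in sorted(whole):
    print(grant, whole[grant], round(fractional[grant], 2))
```

Under whole counting, grant A is credited with two publications; under equal splitting it gets about 1.2, and each of the other four grants gets 0.2 rather than 1. As the replies point out, equal splitting can undervalue grants that made distinct contributions to interdisciplinary work, so any real scheme would need weights supplied by the authors.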
