Seeking Input on the Need to Enhance Access to NIH Grants Data


NIH has long been committed to transparency into who and what we fund. We have previously discussed the value of freely available web tools that allow you to gain insight into NIH funding decisions. Award data available via RePORT and RePORTER, for instance, include non-sensitive information such as awardee institution, principal investigator, funding levels, and research abstracts, as well as associated publications, patents, and other project outcomes. Better yet, if you want to see all of these data at once, ExPORTER allows you to download over 25 years’ worth of such non-sensitive NIH grant award data.

Researchers have used this grant information in creative and thought-provoking ways to explore NIH funding decisions. For example, Fang, Bowen, and Casadevall, as well as Li and Agha, analyzed post-award research productivity according to pre-award peer review scores. Li, Azoulay, and Sampat linked publications resulting from NIH awards to patents. Boris et al. used RePORTER data to verify self-reported awards in the dermatology field. Cleary et al. used RePORTER data to show that all recent new drug approvals were in some meaningful way linked to NIH funding. And as I wrote in this 2017 post, Katz and Matter looked at some NIH data and described what they saw as inequality and stasis in the biomedical enterprise.

The data available through RePORT are quite powerful in their own right. However, compelling arguments exist for why researchers outside NIH should have access to even more information associated with the grants process. In addition to the non-sensitive data, NIH maintains sensitive information collected via the grants process in its internal research administration systems. Such data include information on peer review outcomes, progress reports, and demographics of individuals listed in NIH grant applications.

As part of a Request for Information (RFI) described here and other future engagements, NIH is considering which categories of these sensitive data may be shared with researchers in compliance with applicable laws, while safeguarding sensitive, personally identifiable, and confidential information (NOT-OD-19-085). NIH is beginning to explore the costs and benefits of giving approved research organizations controlled access to such structured, de-identifiable NIH administrative and scientific information in a formal way, through a secure data enclave.

Over the years, institutions, professional societies, advocates, and researchers interested in the science of science and innovation policy have requested access to more of these sensitive NIH data. Under certain circumstances, NIH allows researchers to enter into special data use agreements or other contractual arrangements to access these data for specific research purposes. As an example, NIH issued a contract that allowed Ginther et al. to look at demographic information of researchers identified on applications, which they then compared with receipt of a major NIH award. It is important to remember, though, that even when we permit such access, we remain dedicated to safeguarding your sensitive, personally identifiable, and confidential information (please listen to the related NIH All About Grants podcast on this topic for more: MP3 / Transcript, 7 minutes).

Last December, the Advisory Committee to the NIH Director’s Next Generation Working Group recommended increased access to NIH administrative data. These administrative data have the capacity of “empowering career decision making through the availability of NIH data” and “increasing accessibility of internal data for researchers studying the biomedical workforce” (see Theme 5.1 of the working group recommendations to NIH here).

Some federal agencies host unique environments allowing researchers access to agency information. The Centers for Medicare and Medicaid Services allows users to obtain research, data, and statistics on topics like actuarial studies, compliance monitoring, and claims. Moreover, the Census Bureau makes certain administrative data available to reduce respondent burden and enhance analyses on changes in the U.S. population, demographics, economy, and social conditions. Though such avenues exist, concerns remain around data security, personal privacy, the costs of managing the platforms, how physical or virtual environments are controlled, and the overall need to know.

As noted earlier, we recently issued an RFI that seeks community input on considerations for securing these data, where they may be accessed, requirements for a research plan, and procedures for exporting information outside the enclave. Moreover, we hope to better understand what biomedical and/or behavioral research questions may be answered, whether the enclave should be physical or virtual, how many seats an organization may be interested in using, what policies are needed to secure the information, and what steps should be taken in case of data breaches.

All RFI responses must be submitted electronically here by Wednesday, May 30, 2019.

This RFI is our attempt to gauge initial interest in a data enclave and identify issues that we will need to think through. This is your opportunity to tell us whether this is something you would use, whether you have concerns to share, and whether you have suggestions to improve the idea. If the response is positive, we fully expect to continue to engage the community to help us refine the idea.


  1. Sunshine is the best disinfectant, and there is a lot to clean up when it comes to the Office of Extramural Research and how grants are awarded. Open up your books. There is very little information that is “sensitive”. Don’t use that as an excuse to work in secrecy.

    If you provide more information to the public, you will find that the public is your ally in scrutiny. And when a whistleblower in the public provides you legitimate information, have the backbone to act on it, no matter who the target is. For example, there is tremendous duplication in grants awarded to the same investigator. When that is pointed out to you, take action. It is your responsibility to do so.

  2. NIH has awarded grants to the same investigator that have tremendous overlap. Are NIH officials interested in learning about this, and doing the right thing? I have pointed it out to them, and provided evidence of duplication, but NIH has turned a blind eye. If you are true stewards of public funds, at least investigate these allegations.

  3. Reviewers don’t fund duplicate research and love to find even a hint of duplication as a reason to not fund a grant. If you had ever submitted a grant or ever been on a study section this would be abundantly clear. What looks like duplication from the outside is not information. Everyone that writes successful grants knows that nothing duplicate or even suggestive of duplication gets funded. If you write a grant in even vaguely the same general area, you have to spend time throughout the grant proving to reviewers that this research is not in any way duplicating anything funded or proposed. Money is way too tight and there are way too many good grants out there for that to fly.

    1. Not true. I have submitted and reviewed many grants and I know there are many holes in the current system.

      Reviewers can’t scrutinize prior funded applications to see if there is overlap/duplication because they don’t even have access to them. The only way they can compare an applicant’s prior funded applications is by looking at publicly available abstracts on RePORTER, but there is substantial delay in making these abstracts public, so a reviewer may have no clue about what other applications an applicant may have in review or funded.

      Do Program Officers cross-check each funded application from the same applicant? They should but I doubt they do. Do they scrutinize each application from the same applicant to see if there is overlap, or do they simply take the applicant’s word in the Just in Time that there is “no overlap”? I bet they don’t carefully verify that there is truly no overlap.

      1. There are checks and validations in place throughout our receipt, peer review, and pre-award processes to check for duplicate submissions and awards in our system. These checks include data mining tools in our systems designed to identify and prevent duplicative funding. In cases where scientific overlap is identified, we take actions to address it prior to award, which may include the renegotiation or removal of overlapping aims. When scientific overlap is identified post-award, NIH may take other actions, such as terminating the grant award with the significant or complete overlap and recovering all costs. There may be times when abstracts in RePORTER appear to be very similar when, in fact, there are substantive scientific differences.

  4. There is another dimension to privacy. I expect that, no matter who NIH permits to access the sensitive data, there will be people willing to hack its access. And this will likely be much easier on a PI’s system (university, foundation, commercial, or otherwise) than on NIH’s.

    By permitting this access, you risk both personal privacy and economic damage. It isn’t all about, say, personal health information. How do you propose to protect such data, and assure that the PI and her organization are realistically CAPABLE of protecting it, against deliberate hacking?
