Our modernized RePORTER site features a new application programming interface (API) that makes it easier to find, access, and reuse the grants data you need. This is a welcome step up from downloading bulk records through ExPORTER, especially when you only need a subset of records. And, since we will be retiring the weekly ExPORTER project file updates on October 1, 2021, we are encouraging you to familiarize yourself with what the RePORTER API and modernized RePORTER site have to offer.
Why retire the weekly ExPORTER project updates? Simply put, the weekly project files are out-of-date with RePORTER right after they are produced. If there is a new principal investigator, or new funding amounts, or any other changes, they will be reflected in RePORTER and the API as soon as we do our weekly data refresh. These changes are not, however, found in the static ExPORTER files.
Come October, users will have the option to:
- Download annually updated ExPORTER project files (we are only retiring the weekly updates)
- Use the search and export features of modernized RePORTER to easily and conveniently download a limited subset of updated projects
- Consider RePORTER’s API for a more automated, programmatic way to retrieve the same data elements as the weekly ExPORTER project files
The API (Figure 1) allows users to review comprehensive and current grants data in a timelier and less resource-intensive way than the weekly ExPORTER files. The sample queries on the API documentation site will make it easier to access only the data needed, rather than juggling dozens of large ExPORTER files. The files generated from the API can be easily analyzed with standard software, providing more flexibility for different needs.
We enjoy seeing how folks are already using the RePORTER API, such as on the NIH’s COVID-19 website to track funded projects (Figure 2).
API users will be able to pull precisely the desired information made available through RePORTER on funding, recipient organizations, abstracts, public health significance, related publications and other research outputs, and even more key data elements related to the awarded projects. Also, this API complements a suite of other available NIH APIs that help systems to retrieve information on clinical trials and what other science agencies are funding.
We recognize that many researchers, institutions, professional societies, and commercial entities have come to rely on ExPORTER files and have used them for a long time to track grant funding. We also know that in some cases the API may be less intuitive than simply downloading large files with a single click. To address these concerns, resources are available to ease the transition to the API, including documentation, instructions, and sample queries. You can also check out this available R code (developed by a colleague with the NIH Library) to help you get started with the RePORTER API.
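For readers who work in Python rather than R, the kind of request the sample queries describe can be sketched with the standard library alone. This is a minimal sketch, not official NIH code. The endpoint URL, the `criteria`, `include_fields`, `offset`, and `limit` keys, and the `fiscal_years` criterion are assumptions to verify against the current API documentation. The helper names `build_search_request` and `search_projects` are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint; confirm against the current RePORTER API documentation.
API_URL = "https://api.reporter.nih.gov/v2/projects/search"


def build_search_request(criteria, include_fields=None, offset=0, limit=50):
    """Build (but do not send) a JSON POST request for a project search.

    The payload keys used here are assumptions about the API's request format.
    """
    payload = {"criteria": criteria, "offset": offset, "limit": limit}
    if include_fields:
        # Ask only for the listed fields to keep the response small.
        payload["include_fields"] = list(include_fields)
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def search_projects(criteria, **kwargs):
    """Send the request over the network and return the decoded JSON response."""
    request = build_search_request(criteria, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

A call such as `search_projects({"fiscal_years": [2021]}, limit=5)` would then return the decoded response as a Python dictionary, assuming the endpoint and criterion name above are correct. Keeping request construction separate from sending makes the payload easy to inspect before any network traffic occurs.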
Your feedback will be helpful as we develop additional API functionality and training materials. We look forward to hearing from you about how the new API enables new analyses or workflows.
Hi, I am wondering if you have sample code for data extraction in Python.
We encourage you to reach out to [email protected], who can assist with your question (note: they do not provide specific programming support). There is also a sample query document showing how to run API queries using the curl command; the same requests can be made from Python or other languages: https://dev.exporter.nih.gov/files/Sample%20RePORTER%20API%20Request%20Matching%20ExPORTER%20Project%20Files.docx
The NIH RePORT team does have sample Python code to help with this issue and is considering adding the code to the API documentation. Until then, we encourage you to email [email protected] for the code.
I was also having an issue using cURL to access the data through Python. I emailed that address earlier, but I was hoping I could get pointed in the right direction here as well.
Hi, do you plan to expose references to patents via the API too?
Unfortunately, patent ID data are not retrievable through the API, but you can export those data from modernized RePORTER (RePORT ⟩ RePORTER, nih.gov). Some projects link to PubMed and to patents, which are great sources to dig into. Through the API, however, you can get the PubMed ID but not the patent ID.
Hello,
Is there any way to get data for a period older than the weekly update? For example, I want to access projects that were published from Oct 2021 to now. Even if I filter for fiscal year 2021 in the API, the offset is limited to 10000 records and I cannot access all 78914 records. Is there any way to overcome this issue?
Thanks!
I’m facing the same issue. Also, it would be nice to have a fiscalYear range, e.g., 2018+. And yes, this offset limit is also an issue here.
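One possible workaround for the offset limit described in the two comments above is to split a large query into smaller date windows, so that each window returns fewer records than the cap, and then page through each window separately. The following is a minimal sketch under stated assumptions, not a documented NIH method. The 10,000 cap comes from the comment above. The page size of 500, the `award_notice_date` criterion with `from_date`/`to_date` keys, and the `meta.total`/`results` response layout are assumptions to check against the API documentation. `fetch_page` is a hypothetical function you supply that sends one request and returns the decoded response.

```python
from datetime import date, timedelta

MAX_OFFSET = 10_000  # cap reported in the comment above; treated as an assumption
PAGE_SIZE = 500      # assumed maximum page size


def month_windows(start, end):
    """Yield (first_day, last_day) pairs covering start..end, one per month."""
    current = start
    while current <= end:
        # Jump past the end of the month, then snap back to day 1.
        next_month = (current.replace(day=1) + timedelta(days=32)).replace(day=1)
        yield current, min(end, next_month - timedelta(days=1))
        current = next_month


def fetch_all(fetch_page, criteria, page_size=PAGE_SIZE, max_offset=MAX_OFFSET):
    """Page through one query; refuse queries too large to read under the cap."""
    results, offset = [], 0
    while True:
        page = fetch_page(criteria, offset, page_size)
        total = page["meta"]["total"]
        if total > max_offset:
            # Conservative: any query larger than the cap must be split further.
            raise ValueError(f"{total} records exceed the cap; narrow the query")
        results.extend(page["results"])
        offset += page_size
        if offset >= total or not page["results"]:
            return results


def fetch_by_windows(fetch_page, base_criteria, start, end,
                     date_field="award_notice_date", **kwargs):
    """Run one paged query per monthly window and concatenate the results."""
    combined = []
    for first, last in month_windows(start, end):
        window = {"from_date": first.isoformat(), "to_date": last.isoformat()}
        criteria = {**base_criteria, date_field: window}
        combined.extend(fetch_all(fetch_page, criteria, **kwargs))
    return combined
```

If a single month still exceeds the cap, `fetch_all` raises an error rather than silently returning a partial result; narrower windows (for example, weeks) or an additional criterion would then be needed. Because the windows do not overlap, no de-duplication step is included.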
What is the difference between what Blue Ridge compiles and what is on this site (data-wise)?
I’ve been working with your dataset for 15 yrs now at the Univ of Pittsburgh and appreciated the work put into this.
Thanks
Dan