Unmasking Potential: Introducing New Beta Version Tool for Biomedical Dataset Discovery


Guest post by Dianne Babski, Director of the NLM User Services and Collections Division (USCD), and Peter Seibert, USCD Librarian. Originally posted on the NLM Musings from the Mezzanine blog.

In a world of rapidly changing digital expectations, new formats to access and store information, and a dynamic biomedical landscape, users want to connect to data across an abundant, widely available, and growing ecosystem of biomedical research with one click. That is the future we are working to create by leveling up our dataset discovery technology to better understand user expectations and enhance the user experience.

To bring you closer to that one-click reality, the National Library of Medicine (NLM) is excited to announce the launch of a beta version of a new online tool, the Dataset Catalog.

Search, Find, and Connect Biomedical Datasets

The Dataset Catalog is a catalog of biomedical datasets from publicly available repositories. The tool is designed to help improve the discoverability and reuse of research data by making it easier for users to find and connect biomedical datasets. This functionality aligns with NIH’s efforts to make available to the public the results of research it supports and conducts. Bringing disparate metadata into a standardized format empowers researchers to share and discover data in a broader environment and create relationships that might otherwise not be apparent.

Adhering to FAIR (Findable, Accessible, Interoperable, and Reusable) principles, the Dataset Catalog is an online, “all-in-one” tool that allows users to navigate among biomedical datasets by linking descriptive data. The system is modeled on the ease of use of PubMed, and like PubMed, it provides links out to datasets. So you could think of the Dataset Catalog as the “PubMed of datasets”!

How It Works

The Dataset Catalog is powered by an innovative NLM data model called the DATaset Metadata Model, or DATMM. By describing the data in datasets and repositories, DATMM allows that data to be interpreted and connected by computers across the biomedical ecosystem. DATMM, together with the Dataset Catalog, enhances access to and discovery of biomedical datasets through federated web search, thereby accelerating scientific research. This supports NIH’s responsible data management and sharing policies and practices by enabling more efficient validation of research results, providing access to high-value datasets, and promoting data reuse for future research studies. In this beta phase, functionality is limited, and users can search datasets from four repositories.
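For readers who want a concrete picture of what bringing disparate metadata into a standardized format can mean, here is a minimal Python sketch. It is an illustration only, not the DATMM schema or the Dataset Catalog's implementation: the `DatasetRecord` fields, the two repository layouts, the mapping functions, and the example.org URLs are all hypothetical, and the search is a simple in-memory keyword match rather than a federated web search.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical common record; NOT the actual DATMM schema."""
    title: str
    description: str
    repository: str
    url: str


def from_repo_a(raw: dict) -> DatasetRecord:
    # Hypothetical repository A exposes flat "name", "summary", and "link" keys.
    return DatasetRecord(
        title=raw["name"],
        description=raw.get("summary", ""),
        repository="Repo A",
        url=raw["link"],
    )


def from_repo_b(raw: dict) -> DatasetRecord:
    # Hypothetical repository B nests descriptive fields under "metadata".
    meta = raw["metadata"]
    return DatasetRecord(
        title=meta["title"],
        description=meta.get("abstract", ""),
        repository="Repo B",
        url=raw["landing_page"],
    )


def search(records: list[DatasetRecord], query: str) -> list[DatasetRecord]:
    """Case-insensitive keyword match over title and description."""
    q = query.lower()
    return [
        r for r in records
        if q in r.title.lower() or q in r.description.lower()
    ]


# Made-up sample records in each repository's own layout.
catalog = [
    from_repo_a({
        "name": "Asthma cohort survey",
        "summary": "Pediatric asthma outcomes",
        "link": "https://example.org/a/1",
    }),
    from_repo_b({
        "metadata": {
            "title": "Lung imaging set",
            "abstract": "CT scans from an asthma study",
        },
        "landing_page": "https://example.org/b/7",
    }),
    from_repo_b({
        "metadata": {"title": "Diabetes registry"},
        "landing_page": "https://example.org/b/9",
    }),
]

hits = search(catalog, "asthma")
for hit in hits:
    print(hit.repository, hit.title, hit.url)
```

Once both sources are mapped into one record shape, a single query can match datasets from either repository, and each hit keeps a link out to its original location.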

Your Feedback Matters!

We encourage you to check out the beta version of the Dataset Catalog and DATMM and to share feedback by clicking the vertical blue “Give Feedback” button on the right-hand side of the Dataset Catalog web pages. NLM will evaluate feedback obtained during this six-month beta phase to inform future product development and expansion, such as adding more repositories and functionality.

Be sure to let us know what would be most helpful for you to find the data you need to make new discoveries!

Learn More

If you are interested in learning more, NLM will host virtual office hours on Thursday, April 11, at 2:00 p.m. Eastern Time. Our team will demonstrate the tool’s functionality and features, and the session is another opportunity to share your feedback. Learn more and register at https://www.nnlm.gov/training/class/nlm-office-hours-dataset-catalog.


  1. There is no “vertical blue ‘Give Feedback’ button” on any of the Dataset Catalog web pages. Where is it?

    1. There was a slight glitch with this feature. It is now working and we would be very interested in any feedback you may have.
      Peter Seibert

  2. Here’s something to pass along: How do we get people like heavily-worked grad students, who generate a great deal of data funded by NIH grants, to use this as-yet-apparently-undefined metadata format? Where is it laid out in a way that is comprehensible to people that are not used to reading white papers, etc.? This looks like some very late and very post hoc catch-up. We needed a developed system in place before NIH saddled us with mandates.
