Document Type
Poster
Publication Title
Medical Library Association 2025 Conference
Abstract
Many academic libraries maintain dataset catalogs to make the research data produced by their institutions more findable, accessible, interoperable, and reusable (FAIR). These catalogs use a variety of approaches to gather metadata for dataset records and may be hosted on one of several platforms. Our approach is to develop Python code that uses repository APIs to locate relevant datasets, then harvests and cleans the associated metadata. The cleaned metadata is manually curated and enhanced before being added to the dataset catalog. We use the hosted institutional repository platform Digital Commons to display the dataset records. Combining API harvesting and automated data cleaning with the batch upload feature of Digital Commons allows many dataset records to be ingested efficiently, so that staff time can go to manual curation and enhancement of the dataset metadata to make it more FAIR. This methodology is well suited to institutions that cannot dedicate a large team or significant technical resources to launching a dataset catalog. The resulting dataset catalog currently contains over 100 records from multiple repositories. This project was supported through the Data Services Continuing Professional Education (DSCPE) program.
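The cleaning step described in the abstract might look roughly like the following minimal sketch. It is not the authors' code: the field names (`title`, `doi`, `authors`, `repository`), the function names, and the CSV layout are all hypothetical, and the records are assumed to have already been harvested from a repository API into Python dictionaries. The actual Digital Commons batch upload template and its column names are not given in the abstract.

```python
import csv
import io

# Hypothetical column names; the real batch upload template is not specified here.
FIELDS = ["title", "doi", "authors", "repository"]


def clean_record(raw):
    """Normalize one harvested record (a dict with assumed keys)."""
    # Collapse runs of whitespace in the title.
    title = " ".join(str(raw.get("title", "")).split())

    # Lowercase the DOI and strip common resolver prefixes.
    doi = str(raw.get("doi", "")).strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
            break

    # Drop empty author entries and tidy spacing in the rest.
    authors = [" ".join(a.split()) for a in raw.get("authors", []) if a and a.strip()]

    return {
        "title": title,
        "doi": doi,
        "authors": "; ".join(authors),
        "repository": str(raw.get("repository", "")).strip(),
    }


def clean_records(raw_records):
    """Clean every record, skipping untitled records and duplicates."""
    seen = set()
    cleaned = []
    for raw in raw_records:
        rec = clean_record(raw)
        # Deduplicate on DOI when present, otherwise on the lowercased title.
        key = rec["doi"] or rec["title"].lower()
        if not rec["title"] or key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


def to_batch_csv(records):
    """Serialize cleaned records to CSV text for later manual curation."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Made-up placeholder records, for illustration only.
sample = [
    {"title": "  Example   dataset A ", "doi": "https://doi.org/10.1234/EXAMPLE.1",
     "authors": ["Doe,  Jane", " "], "repository": "Repo X"},
    {"title": "Example dataset A", "doi": "doi:10.1234/example.1",
     "authors": ["Doe, Jane"], "repository": "Repo X"},
]
cleaned = clean_records(sample)
csv_text = to_batch_csv(cleaned)
```

In this sketch the two placeholder records collapse into one because they share a DOI once the prefixes are stripped. The resulting spreadsheet-style file would then be curated by hand before any upload, in keeping with the workflow the abstract describes.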
Publication Date
5-1-2025
College or School
UAB Libraries
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Supplemental Associated Link
https://site.pheedloop.com/event/mla25/sessions/SESJL6HY6AZ84FXIH
Recommended Citation
Warner, Claire; Reese, Amy; and Hertz, Marla, "Development of a Lightweight Dataset Catalog with Digital Commons" (2025). Libraries Professional Work. 32.
https://digitalcommons.library.uab.edu/libraries-pw/32
QR code version
2025 10 09 MLA Liberty Chapter Data Catalog Poster Online Version.pdf (421 kB)
Liberty Chapter version
Comments
Liberty Chapter version presented on October 9, 2025