Document Type
Presentation
Abstract
Many academic libraries are compiling dataset catalogs using a variety of approaches to gather metadata for dataset records that may be hosted on one of several platforms. Our approach is to develop Python code and leverage APIs to locate relevant datasets, then harvest and clean the associated metadata. The cleaned metadata is then manually curated and enhanced before being added to the Dataset Catalog. We employ the hosted institutional repository Digital Commons to display the dataset records. Digital Commons is a widely used platform that requires no coding background and little technical overhead. Combining API harvesting and automated data cleaning with the batch upload feature of Digital Commons yields efficient ingestion of many dataset records, allowing us to prioritize manual curation and enhancement of the dataset metadata. This methodology is ideal for institutions unable to dedicate large teams of specialized personnel or significant amounts of technical resources to launching a dataset catalog. We hope to eventually make the code we have developed available for reuse at other institutions. The University of Alabama at Birmingham’s Research Data Catalog currently contains over 100 dataset records from multiple repositories. This project was supported through the Data Services Continuing Professional Education (DSCPE) program.
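The cleaning step described in the abstract might look something like the minimal sketch below. It is an illustration only and not the authors' code, which has not been released. The column names, the helper functions, and the sample records are all hypothetical, and the API harvesting step is left out; the sketch assumes the harvested records are already available as Python dictionaries and that the batch upload accepts a spreadsheet-like file.

```python
import csv
import io
import re

# Hypothetical column names; a real batch upload sheet defines its own.
COLUMNS = ["title", "authors", "doi", "repository", "publication_date"]


def clean_text(value):
    """Collapse runs of whitespace and trim both ends (None becomes '')."""
    return re.sub(r"\s+", " ", value or "").strip()


def normalize_doi(value):
    """Lowercase a DOI and strip a leading resolver URL or 'doi:' prefix."""
    doi = clean_text(value).lower()
    return re.sub(r"^(https?://(dx\.)?doi\.org/|doi:\s*)", "", doi)


def clean_records(raw_records):
    """Clean each harvested record and drop later records that repeat a DOI."""
    seen = set()
    cleaned = []
    for raw in raw_records:
        record = {col: clean_text(raw.get(col, "")) for col in COLUMNS}
        record["doi"] = normalize_doi(raw.get("doi", ""))
        # Records without a DOI are always kept for manual review.
        if record["doi"] and record["doi"] in seen:
            continue
        seen.add(record["doi"])
        cleaned.append(record)
    return cleaned


def to_csv(records):
    """Serialize cleaned records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Made-up sample input standing in for harvested metadata.
sample = [
    {"title": "  Example   dataset A ", "authors": "Doe, J.",
     "doi": "https://doi.org/10.1234/ABC", "repository": "Repo One",
     "publication_date": "2024-01-15"},
    {"title": "Example dataset A", "authors": "Doe, J.",
     "doi": "doi:10.1234/abc", "repository": "Repo Two",
     "publication_date": "2024-01-15"},
    {"title": "Example dataset B", "authors": "Roe, R.",
     "doi": None, "repository": "Repo One",
     "publication_date": "2023-06-01"},
]

cleaned = clean_records(sample)
print(to_csv(cleaned))
```

In this sketch the second sample record is dropped because its DOI matches the first after normalization, while the record with no DOI is kept so that a curator can review it by hand.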
Presented at the Data Discovery Collaboration (DDC) meeting on April 4, 2025
Publication Date
4-4-2025
College or School
UAB Libraries
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Recommended Citation
Hertz, Marla; Reese, Amy; and Warner, Claire, "Data worth saving: Building a lightweight dataset catalog in Digital Commons" (2025). Libraries Professional Work. 16.
https://digitalcommons.library.uab.edu/libraries-pw/16
Video (MP4)
DDC-UAB-DataCatalogTalk-audio.m4a (31764 kB)
Audio only version (M4A)
DDC-UAB-DataCatalogTalk-transcript.txt (45 kB)
Session transcript
20250404_DDC_UAB data catalog.pdf (17605 kB)
Presentation (PDF)