All ETDs from UAB

Advisory Committee Chair

Da Yan

Advisory Committee Members

Tianyang Wang

Document Type


Date of Award


Degree Name by School

Doctor of Philosophy (PhD) College of Arts and Sciences


In the analysis of scientific and clinical texts, the extraction of named entities and their relevant information, such as modifiers, plays a pivotal role. Recent advancements in natural language processing (NLP), particularly through the application of transfer learning from pre-trained Transformer models, have greatly enhanced the performance of entity extraction tasks. However, challenges persist with nested entities. This thesis investigates the impact of transfer learning on extracting nested entities and their modifiers, using Opioid Use Disorder (OUD) as a prototype. By adopting a multi-task training strategy, this work enhances the model's capacity to discern and categorize overlapping entities, a task that traditional transfer learning models often struggle with due to their single-focus training on flat entities. Moreover, entity modifiers, which can alter the semantics of entities extracted from clinical texts, are critical for interpreting clinical narratives accurately. Traditional models for identifying these modifiers often rely on regular expressions or feature weights, trained in isolation for each modifier. In contrast, this thesis proposes a novel, unified multi-task Transformer architecture that simultaneously learns and predicts various modifiers. The effectiveness of this approach is validated on the ShARe and OUD data sets, demonstrating state-of-the-art results and highlighting the potential of transfer learning between data sets with partially similar modifiers in clinical texts. This work extends into document-level entity relation extraction, improving the ability to comprehensively understand and analyze the relationships between entities within scientific literature. Furthermore, the thesis addresses the essential task of entity normalization, that is, linking textual mentions to ontology concepts. Despite the challenges posed by the diverse expression of concepts and the complexity of ontology graphs, this work introduces a model that utilizes graph neural networks (GNNs) to encode entity mentions and ontology concepts in a common hyperbolic space, aiming to enhance entity normalization performance in scientific and clinical texts.

Available for download on Tuesday, May 05, 2026