All ETDs from UAB

Advisory Committee Chair

Jeff M Szychowski

Advisory Committee Members

Gary R Cutter

Byron Jaeger

Richard Kennedy

Erik Roberson

Document Type


Date of Award


Degree Name by School

Doctor of Philosophy (PhD) School of Public Health


Advances in statistical learning models for prediction have led to broader application across a variety of disciplines, granting generalizations and adaptations that were previously intractable even with advanced computational techniques. Among these is the allowance of correlated data with inherent paneled structure, such as longitudinal or clustered data; these adjustments have already begun to be applied to a variety of supervised and unsupervised machine learning methods that had previously focused on cross-sectional data. These modifications have seen rudimentary testing in a number of applied disciplines where correlated data are commonly observed, including medical and clinical research. One field in particular that has garnered interest is Alzheimer’s disease and related dementias. As this disorder is characterized by a prolonged and progressive disease course with an extensive variety of potential biomarkers, its feature-dense datasets with repeated patient measures are well suited for applications of machine learning prediction with longitudinal modifications. While some novel adaptations of longitudinal machine learning methods have already been tested in the realm of Alzheimer’s disease, there has not yet been a comprehensive evaluation comparing these techniques against each other or against widely accepted standards, such as traditional inferential techniques like mixed-effects regression. Nor has there been rigorous investigation into how subject-specific effects can impact the error and bias of these predictions, or into the distinctions that may arise when developing entire temporal profiles as compared to forecasting future observations while leveraging previously observed data.
This dissertation addresses these deficiencies in the literature by directly comparing a variety of machine learning techniques with longitudinal adaptations against each other and against reference standards using a large, multi-study Alzheimer’s disease meta-database, as well as by assessing the role of subject-specific effects using synthetic data. This study is especially comprehensive, considering both continuous and categorical outcomes as well as differences between generating whole profiles de novo and forecasting future observations based on prior sequences. With its emphasis on longitudinal data, this study considers not only predictive capacity for unobserved data using population-level characteristics, but also prediction of future observations using a variety of subject-specific effects.

Included in

Public Health Commons