Advisory Committee Chair
Nengjun Yi
Document Type
Dissertation
Date of Award
2024
Degree Name by School
Doctor of Philosophy (PhD) School of Public Health
Abstract
This dissertation focuses on developing predictive models using data from two areas: clinical data and microbiome data. In practice, compositional data are mostly represented as relative proportions. With the advancement of next-generation sequencing (NGS) technology, researchers can now collect large volumes of metagenomic sequencing data, which are valuable for investigating associations between the microbiome and host diseases. Current methods for such data either generalize poorly or apply only to a limited range of settings. To address this, in the first part we propose Bayesian compositional generalized linear models (BCGLM) for analyzing microbiome data. The model incorporates a structured regularized horseshoe prior and imposes a soft sum-to-zero restriction on the coefficients through the prior distribution. The proposed method outperforms existing methods, providing more accurate coefficient estimates and lower mean squared prediction error. We also extend the proposed method to ordinal data, where the response takes ordered values, and incorporate a network among taxa and samples to account for the phylogenetic relatedness among species and samples. In the third part, we introduce Bayesian compositional generalized linear mixed models (BCGLMM) for analyzing microbiome data, which combine high-dimensional sparse modeling with generalized linear mixed models to address the challenge of a mixture of larger and smaller effects. With a sparsity-inducing prior (the structured regularized horseshoe prior), the method effectively selects phylogenetically related moderate effects. The random effect term captures sample-related minor effects by incorporating sample similarities in its variance-covariance matrix. This allows identification of both substantial taxa effects and the collective impact of numerous minor taxa effects.
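The role of the sum-to-zero restriction mentioned in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names, the toy data, and the prior scale `sd` are assumptions made here for illustration, and the regularized horseshoe prior is not implemented. The sketch shows that when the coefficients of a log-linear predictor sum to zero, the predictor computed from raw counts equals the one computed from relative proportions, so sequencing depth drops out.

```python
import numpy as np


def linear_predictor(abundance, beta, intercept=0.0):
    """Log-linear predictor: intercept + sum_j beta_j * log(abundance_ij)."""
    return intercept + np.log(np.asarray(abundance, dtype=float)) @ beta


def soft_sum_to_zero_logprior(beta, sd=0.001):
    """Unnormalized log-density of a N(0, sd^2) prior on sum(beta).

    Hypothetical helper: a small sd pulls sum(beta) toward zero
    without enforcing the constraint exactly.
    """
    return -0.5 * (np.sum(beta) / sd) ** 2


rng = np.random.default_rng(0)
# Toy counts for 5 samples and 4 taxa, strictly positive to avoid log(0).
counts = rng.integers(1, 100, size=(5, 4))
proportions = counts / counts.sum(axis=1, keepdims=True)

beta = np.array([1.0, -0.5, -0.5, 0.0])  # coefficients sum to zero
eta_counts = linear_predictor(counts, beta)
eta_props = linear_predictor(proportions, beta)

# A coefficient vector that violates the constraint (sums to 1.0).
beta_bad = beta + 0.25
eta_bad_counts = linear_predictor(counts, beta_bad)
eta_bad_props = linear_predictor(proportions, beta_bad)

print(np.allclose(eta_counts, eta_props))          # invariance holds
print(np.allclose(eta_bad_counts, eta_bad_props))  # invariance is lost
```

The two predictors differ by `sum(beta) * log(total count per sample)`, which vanishes only when the coefficients sum to zero. Placing a tight normal prior on `sum(beta)` is one common way to encode such a restriction softly within a Bayesian model.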
Recommended Citation
Zhang, Li, "Predictive Modeling Using Clinical Data And Compositional Microbiome Data" (2024). All ETDs from UAB. 3862.
https://digitalcommons.library.uab.edu/etd-collection/3862