Advisory Committee Chair
Nengjun Yi
Advisory Committee Members
Byron C Jaeger
Dorothy Leann Long
Akm F Rahman
Michael E Seifert
Document Type
Thesis
Date of Award
2022
Degree Name by School
Doctor of Philosophy (PhD) School of Public Health
Abstract
Several proposals extend classical additive models (AMs) to accommodate high-dimensional data (p >> n) using group sparse regularization. However, sparse regularization may induce excess shrinkage when estimating smooth functions, damaging predictive performance. Moreover, most of these AMs take an "all-in-all-out" approach to functional selection, making it difficult to determine whether nonlinear effects are necessary. While some Bayesian models can address these shortcomings, model fitting creates a new challenge: scalability. In this dissertation, we propose Bayesian hierarchical additive models to address these shortcomings. We incorporate the smoothing penalty via reparameterization to achieve proper shrinkage in curve interpolation. A novel two-part spike-and-slab LASSO prior for smooth functions is developed to address the sparsity of signals while providing extra flexibility to select the linear or nonlinear components of smooth functions. The proposed prior applies to both the generalized additive model framework and the Cox proportional hazards framework. Computationally, we develop scalable and deterministic algorithms, including the EM-Coordinate Descent algorithm, to alleviate the computational burden of fitting Bayesian models. Via Monte Carlo studies and real-world data applications, we demonstrate improved predictive and computational performance of the proposed Bayesian hierarchical additive models against state-of-the-art models. To improve the accessibility of the proposed models, we offer freely available software, BHAM, that implements Bayesian hierarchical additive models in the R programming environment. Specifically, the package includes functions to fit generalized additive models and additive Cox proportional hazards models with the two-part spike-and-slab LASSO prior. 
In addition, it supplies other utility functions to construct additive function formulas in high-dimensional settings, select optimal models, summarize bi-level variable selection results, and visualize nonlinear effects. Interested readers can access the software BHAM via the public GitHub repository at https://github.com/boyiguo1/BHAM. In conclusion, the dissertation contributes to the current literature on flexible modeling of complex signals for high-dimensional data analysis. The proposed models can be widely applied in molecular and clinical data analysis to inform biomedical research.
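For orientation, the generic spike-and-slab LASSO prior on a single coefficient, as it commonly appears in the literature, can be sketched as below. This is not the dissertation's two-part prior for smooth functions, which the abstract does not spell out. The symbols $\beta_j$, $\gamma_j$, $\lambda_0$, $\lambda_1$, and $\theta$ are assumptions introduced here for illustration.

```latex
% Generic spike-and-slab LASSO prior (illustrative; notation assumed, not from the abstract)
\begin{align*}
  \pi(\beta_j \mid \gamma_j) &= (1-\gamma_j)\,\psi(\beta_j \mid \lambda_0)
                               + \gamma_j\,\psi(\beta_j \mid \lambda_1), \\
  \psi(\beta \mid \lambda)   &= \frac{\lambda}{2}\exp\!\left(-\lambda\lvert\beta\rvert\right),
                               \qquad \lambda_0 \gg \lambda_1 > 0, \\
  \gamma_j \mid \theta       &\sim \mathrm{Bernoulli}(\theta).
\end{align*}
```

Here the spike component (rate $\lambda_0$) shrinks negligible coefficients strongly toward zero, while the slab component (rate $\lambda_1$) applies only mild shrinkage to coefficients that carry signal.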
Recommended Citation
Guo, Boyi, "Spike-And-Slab Additive Models and Scalable Algorithms for High Dimensional Data Analysis" (2022). All ETDs from UAB. 304.
https://digitalcommons.library.uab.edu/etd-collection/304