SMI 2022 - Recent advances in statistical learning methods for imaging data

Friday, May 27, 2022 • 4:50–6:05 pm (CT) Light Hall, Room 208

Organizer: GuanNan Wang, College of William and Mary

Chair: Andrew Chen, University of Pennsylvania

 

Sparse learning and structure identification for ultra-high-dimensional image-on-scalar regression

Lily Wang, George Mason University

Xinyi Li, Clemson University

Huixia Judy Wang, George Washington University

We consider high-dimensional image-on-scalar regression, where the spatial heterogeneity of covariate effects on imaging responses is investigated via a flexible partially linear spatially varying coefficient model. To tackle the challenges of spatial smoothing over the imaging response’s complex domain consisting of regions of interest, we approximate the spatially varying coefficient functions via bivariate spline functions over triangulation. We first study estimation when the active constant coefficients and varying coefficient functions are known in advance. We then develop a unified approach for simultaneous sparse learning and model structure identification in the presence of ultra-high-dimensional covariates. Our method can identify zero, nonzero constant, and spatially varying components correctly and efficiently. The estimators of the constant coefficients and the varying coefficient functions are consistent, and the constant coefficient estimators are asymptotically normal. The method is evaluated by Monte Carlo simulation studies and applied to a dataset provided by the Alzheimer’s Disease Neuroimaging Initiative.
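
For orientation, a partially linear spatially varying coefficient model of the kind described above is often written in the following generic form. The notation here is an assumption for illustration and is not taken from the abstract: Y_i(s) is the imaging response of subject i at location s in the domain \Omega, X_{ik} are scalar covariates, \mathcal{C} and \mathcal{V} index the covariates with constant and spatially varying effects, and \varepsilon_i(s) is an error process.

```latex
Y_i(s) = \sum_{k \in \mathcal{C}} X_{ik}\,\alpha_k
       + \sum_{l \in \mathcal{V}} X_{il}\,\beta_l(s)
       + \varepsilon_i(s), \qquad s \in \Omega .
```

Under this reading, structure identification amounts to deciding, for each covariate, whether its effect is zero, a nonzero constant \alpha_k, or a function \beta_l(s) that varies over the domain.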

 

Functional data fusion of PM2.5 observations and satellite AOD measurements

Yueying Wang, Iowa State University

Zhengyuan Zhu, Iowa State University

Li Wang, Iowa State University

Monitoring and forecasting PM2.5 is important for countries where air pollution is a serious public health issue. Current PM2.5 forecasts are mainly based on observations from monitoring stations with high temporal frequency but sparse and uneven spatial distribution. In contrast, Aerosol Optical Depth (AOD) data from satellites such as MODIS has better spatial coverage but low temporal frequency. The fusion of monitoring stations’ PM2.5 observations and the AOD data from satellites can provide hourly high-resolution PM2.5 concentration information, which can be beneficial to epidemiological studies of the effect of PM2.5. In this talk, we introduce a novel data fusion framework using functional data analysis tools to incorporate information from AOD images. Efficient algorithms are developed to estimate the non-stationary mean and covariance structure by borrowing strength from bivariate spline smoothing and the principal analysis by conditional expectation (PACE) algorithm. The estimates from the AOD data are used to improve the spatial prediction of PM2.5. A point-wise prediction interval is also provided to quantify prediction uncertainty. The proposed method is applied to data from the Beijing area in China, and our analysis shows that the proposed approach outperforms several existing data fusion methods.

 

Reduced-rank tensor-on-tensor regression and tensor-variate analysis of variance

Carlos Llosa-Vite, Iowa State University

Ranjan Maitra, Iowa State University

Fitting regression models with many multivariate responses and covariates can be challenging, but such responses and covariates sometimes have tensor-variate structure. We extend the classical multivariate regression model to exploit such structure in two ways. First, we impose four types of low-rank tensor formats on the regression coefficients. Second, we model the errors using the tensor-variate normal distribution, which imposes a Kronecker separable format on the covariance matrix. We obtain maximum likelihood estimators via block-relaxation algorithms, and derive their computational complexity and asymptotic distributions. Our framework enables us to formulate tensor-variate analysis of variance (TANOVA) methodology. This methodology, when applied in a one-way TANOVA layout, enables us to identify cerebral regions significantly associated with the interaction of suicide attempters or non-attempter ideators and positive-, negative- or death-connoting words in a functional Magnetic Resonance Imaging study. Another application uses three-way TANOVA on the Labeled Faces in the Wild image dataset to distinguish facial characteristics related to ethnic origin, age group and gender.
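
To make the Kronecker separable covariance format concrete, the following minimal sketch covers the two-way (matrix-variate) case only. It is an illustration under stated assumptions, not the authors' implementation: the function names, the example covariance factors, and the column-major vectorization convention are all hypothetical choices made here.

```python
import numpy as np


def sample_matrix_normal(rng, sigma_row, sigma_col):
    """Draw one zero-mean matrix-variate normal sample X = A Z B^T.

    A and B are Cholesky factors of the (assumed) row and column
    covariance matrices, and Z has i.i.d. standard normal entries.
    """
    a = np.linalg.cholesky(sigma_row)
    b = np.linalg.cholesky(sigma_col)
    z = rng.standard_normal((sigma_row.shape[0], sigma_col.shape[0]))
    return a @ z @ b.T


def separable_covariance(sigma_row, sigma_col):
    """Covariance of the column-major vec(X): sigma_col Kronecker sigma_row."""
    return np.kron(sigma_col, sigma_row)


# Hypothetical 2 x 3 example: a 2 x 2 row factor and a 3 x 3 column factor.
sigma_row = np.array([[2.0, 0.5],
                      [0.5, 1.0]])
sigma_col = np.array([[1.0, 0.3, 0.0],
                      [0.3, 1.0, 0.3],
                      [0.0, 0.3, 1.0]])

full_cov = separable_covariance(sigma_row, sigma_col)  # 6 x 6 matrix
sample = sample_matrix_normal(np.random.default_rng(0), sigma_row, sigma_col)
```

In this small example the two symmetric factors have 3 + 6 = 9 distinct entries, whereas an unstructured symmetric 6 x 6 covariance has 21; the gap grows quickly with the tensor dimensions. The factors are identifiable only up to a scale constant, which this sketch does not address.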

 
