Statistical Learning for the Spectral Analysis of Time Series Data
Tuft, Marie (2020) Statistical Learning for the Spectral Analysis of Time Series Data. Doctoral Dissertation, University of Pittsburgh. (Unpublished)
Abstract
Spectral analysis of biological processes poses a wide variety of complications. Statistical learning techniques in both the frequentist and Bayesian frameworks are required to overcome the unique and varied challenges of analyzing these data in a meaningful way. This dissertation presents new methodologies to address problems in multivariate stationary and univariate nonstationary time series analysis. The first method is motivated by the analysis of heart rate variability (HRV) time series. Because these series are nonstationary, they pose a unique challenge: localized, accurate, and interpretable descriptions of both frequency and time are required. By reframing this question in a reduced-rank regression setting, we propose a novel approach that produces a low-dimensional, empirical basis that is localized in bands of time and frequency. To estimate this frequency-time basis, we apply penalized reduced-rank regression with singular value decomposition to the localized discrete Fourier transform. An adaptive sparse fused lasso penalty is applied to the left and right singular vectors, resulting in low-dimensional measures that are interpretable as localized bands in time and frequency. We then apply this method to interpret the power spectrum of HRV measured on a single person over the course of a night. The second method considers the analysis of high-dimensional resting-state electroencephalography recorded on a group of first-episode psychosis subjects compared to a group of healthy controls. This analysis poses two challenges: first, estimating the spectral density matrix in a high-dimensional setting, and second, incorporating covariates into the estimate of the spectral density. To address these, we use a Bayesian factor model which decomposes the Fourier transform of the time series into a matrix of factors and vector of factor loadings. The factor model is then embedded into a mixture model with covariate-dependent mixture weights.
The method is then applied to examine differences in the power spectrum between first-episode psychosis subjects and healthy controls.
Public health significance: As collection methods for time series data become ubiquitous in biomedical research, there is an increasing need for statistical methodology that is robust enough to handle the complexity and potentially high dimensionality of the data while retaining the flexibility needed to answer real-world questions of interest.
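As a rough illustration of the first method's starting point, the sketch below computes a localized periodogram on non-overlapping time blocks and takes a plain truncated singular value decomposition of the resulting time-by-frequency matrix. It is a simplified stand-in, not the dissertation's estimator: the adaptive sparse fused lasso penalty and the reduced-rank regression formulation are omitted, and the function names (`local_periodogram`, `low_rank_basis`), the block length, the rank, and the simulated series are all assumptions made for this example.

```python
import numpy as np


def local_periodogram(x, block_len):
    """Periodogram of each non-overlapping block of x.

    Returns an array of shape (n_blocks, block_len // 2): rows index time
    blocks, columns index Fourier frequencies k / block_len, k >= 1.
    """
    n_blocks = len(x) // block_len
    blocks = np.reshape(x[: n_blocks * block_len], (n_blocks, block_len))
    blocks = blocks - blocks.mean(axis=1, keepdims=True)  # remove local mean
    dft = np.fft.rfft(blocks, axis=1)[:, 1:]  # drop frequency zero
    return np.abs(dft) ** 2 / block_len


def low_rank_basis(power, rank):
    """Unpenalized truncated SVD of the time-by-frequency power matrix."""
    u, s, vt = np.linalg.svd(power, full_matrices=False)
    return u[:, :rank], s[:rank], vt[:rank, :]


# Hypothetical nonstationary series: the dominant frequency switches
# from 0.0625 to 0.25 cycles per sample halfway through.
rng = np.random.default_rng(0)
n = 4096
t = np.arange(n)
signal = np.where(
    t < n // 2,
    np.sin(2 * np.pi * 0.0625 * t),
    np.sin(2 * np.pi * 0.25 * t),
)
x = signal + 0.5 * rng.standard_normal(n)

power = local_periodogram(x, block_len=128)
time_vecs, sing_vals, freq_vecs = low_rank_basis(power, rank=2)
```

In this toy setting the columns of `time_vecs` (left singular vectors) describe how power varies across time blocks, and the rows of `freq_vecs` (right singular vectors) describe how it is distributed over frequency. Without a sparsity or fusion penalty these vectors are not guaranteed to be localized in contiguous bands, which is the gap the penalized approach described in the abstract is meant to close.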