
Statistical Learning for the Spectral Analysis of Time Series Data

Tuft, Marie (2020) Statistical Learning for the Spectral Analysis of Time Series Data. Doctoral Dissertation, University of Pittsburgh. (Unpublished)

Submitted Version



Spectral analysis of biological processes poses a wide variety of complications. Statistical learning techniques in both the frequentist and Bayesian frameworks are required to overcome the unique and varied challenges that exist in analyzing these data in a meaningful way. This dissertation presents new methodologies to address problems in multivariate stationary and univariate nonstationary time series analysis.

The first method is motivated by the analysis of heart rate variability (HRV) time series. Because such series are nonstationary, their analysis poses a unique challenge: localized, accurate and interpretable descriptions of both frequency and time are required. By reframing this question in a reduced-rank regression setting, we propose a novel approach that produces a low-dimensional, empirical basis that is localized in bands of time and frequency. To estimate this frequency-time basis, we apply penalized reduced-rank regression with singular value decomposition to the localized discrete Fourier transform. An adaptive sparse fused lasso penalty is applied to the left and right singular vectors, resulting in low-dimensional measures that are interpretable as localized bands in time and frequency. We then apply this method to interpret the power spectrum of HRV measured on a single person over the course of a night.
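As a rough illustration of the general idea of summarizing a localized Fourier transform with a low-rank basis, the Python sketch below computes a windowed periodogram of a toy nonstationary signal and takes a truncated singular value decomposition of it. It is not the dissertation's estimator: it uses a plain, unpenalized SVD with no adaptive sparse fused lasso penalty, and the toy signal, the Hann taper, the window length, the step size, and the function names `local_periodogram` and `low_rank_basis` are all assumptions made for this example.

```python
import numpy as np

def local_periodogram(x, window_len, step):
    """Tapered periodogram of successive windows (hypothetical helper).

    Rows index time blocks, columns index Fourier frequencies.
    """
    taper = np.hanning(window_len)
    rows = []
    for start in range(0, len(x) - window_len + 1, step):
        seg = x[start:start + window_len]
        seg = (seg - seg.mean()) * taper          # demean, then taper
        rows.append(np.abs(np.fft.rfft(seg)) ** 2 / np.sum(taper ** 2))
    return np.array(rows)

def low_rank_basis(P, rank):
    """Plain truncated SVD (no penalty): time vectors, singular values, frequency vectors."""
    U, s, Vt = np.linalg.svd(P, full_matrices=False)
    return U[:, :rank], s[:rank], Vt[:rank]

# Toy nonstationary signal (an assumption, not HRV data): the dominant
# frequency switches from 0.05 to 0.20 cycles per sample halfway through.
rng = np.random.default_rng(0)
n = 2048
t = np.arange(n)
x = np.where(t < n // 2,
             np.sin(2 * np.pi * 0.05 * t),
             np.sin(2 * np.pi * 0.20 * t)) + 0.1 * rng.standard_normal(n)

P = local_periodogram(x, window_len=128, step=64)   # shape (31, 65)
U, s, Vt = low_rank_basis(P, rank=2)

# Rank-2 reconstruction and its relative Frobenius error.
approx = (U * s) @ Vt
rel_err = np.linalg.norm(P - approx) / np.linalg.norm(P)

# Frequencies (cycles per sample) where the first and last windows peak.
freqs = np.fft.rfftfreq(128)
peak_first = freqs[np.argmax(P[0])]
peak_last = freqs[np.argmax(P[-1])]
```

In this toy setting the rows of `Vt` play the role of frequency profiles and the columns of `U` the role of time profiles. An unpenalized SVD gives no guarantee that these vectors are sparse or piecewise constant, which is the kind of band-like structure the penalty described above is meant to encourage.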

The second method considers the analysis of high-dimensional resting-state electroencephalography recorded on a group of first-episode psychosis subjects compared to a group of healthy controls. This analysis poses two challenges: first, estimating the spectral density matrix in a high-dimensional setting, and second, incorporating covariates into the estimate of the spectral density. To address these, we use a Bayesian factor model which decomposes the Fourier transform of the time series into a matrix of factors and vector of factor loadings. The factor model is then embedded into a mixture model with covariate-dependent mixture weights. The method is then applied to examine differences in the power spectrum for first-episode psychosis subjects versus healthy controls.
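To make the first of these challenges concrete, the Python sketch below shows why a spectral density matrix is hard to estimate when the number of channels is large: the raw periodogram matrix at a single frequency is an outer product and therefore has rank one, and averaging over a few neighboring frequencies raises the rank only to the number of frequencies averaged. This is a standard smoothed-periodogram calculation, not the Bayesian factor model described above; the simulated white-noise data, the dimensions, the smoothing span, and the function names are assumptions made for this example.

```python
import numpy as np

def periodogram_matrices(X):
    """Raw periodogram matrices for data X of shape (n_time, p).

    Returns an array of shape (n_freq, p, p); entry f is d_f d_f^H,
    where d_f is the scaled DFT of the p channels at frequency f.
    """
    n = X.shape[0]
    D = np.fft.rfft(X - X.mean(axis=0), axis=0) / np.sqrt(n)
    return np.einsum('fi,fj->fij', D, D.conj())

def smooth_over_frequency(I, half_width):
    """Average periodogram matrices over a window of neighboring frequencies."""
    n_freq = I.shape[0]
    out = np.empty_like(I)
    for k in range(n_freq):
        lo = max(0, k - half_width)
        hi = min(n_freq, k + half_width + 1)
        out[k] = I[lo:hi].mean(axis=0)
    return out

# Simulated data (an assumption, not EEG): 256 time points, 16 channels.
rng = np.random.default_rng(1)
X = rng.standard_normal((256, 16))

I = periodogram_matrices(X)               # shape (129, 16, 16)
S = smooth_over_frequency(I, half_width=3)

k = 40                                    # an interior frequency index
rank_raw = np.linalg.matrix_rank(I[k])    # outer product, so rank 1
rank_smooth = np.linalg.matrix_rank(S[k]) # at most 2 * half_width + 1 = 7
```

With 16 channels, the 7 frequencies averaged here cannot produce a full-rank 16 by 16 estimate, so some additional structure is needed. A factor model is one way to supply that structure.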

Public health significance: As collection methods for time series data become ubiquitous in biomedical research, there is an increasing need for statistical methodology that is robust enough to handle the complexity and potentially high dimensionality of the data while retaining the flexibility needed to answer real-world questions of interest.




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators: Tuft, Marie (Email: marie.tuft@pitt.edu; Pitt Username: MAT199)
ETD Committee:
Committee Chair: Krafty, Robert
Committee Member: Anderson, Stewart
Committee Member: Youk, Ada
Committee Member: Rothenberger, Scott
Date: 10 September 2020
Date Type: Publication
Defense Date: 23 July 2020
Approval Date: 10 September 2020
Submission Date: 13 July 2020
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 94
Institution: University of Pittsburgh
Schools and Programs: School of Public Health > Biostatistics
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: spectral analysis, high dimensional, nonstationary time series, Bayesian mixture model, penalized regression, time series, multivariate time series
Date Deposited: 11 Sep 2020 02:37
Last Modified: 11 Sep 2020 02:37

