Experimental Design for Unbalanced Data Involving a Two level Logistic Model

Chen, Huanyu (2007) Experimental Design for Unbalanced Data Involving a Two level Logistic Model. Doctoral Dissertation, University of Pittsburgh. (Unpublished)

Abstract

The multilevel logistic model is used to analyze hierarchical data with binary outcomes and to detect variation both between and within clusters. I extended explicit variance formulae for a fixed effect in a two level model for balanced binary data to account for imbalance both between and within clusters. The derivation of the variance is based on a linearization of the two level logistic model using first order marginal quasilikelihood (MQL1) estimation. In a simulation study, I used second order penalized quasilikelihood (PQL2) estimation to corroborate the accuracy of the analytic variance formula based on the observed racial distribution in a multi-center study of racial disparities. Using the site-specific racial distributions, I simulated the log odds ratio for black race that could be detected with 80% power. These methods are illustrated in the context of a multi-center study of racial disparities in 30-day mortality in the Veterans Affairs (VA) Healthcare System, where the racial distributions are dramatically unbalanced across the 149 sites. I also consider a subset of 42 sites that include a majority of the black hospitalizations. The same analytic variance is obtained when one has equal numbers of observations per site and/or a constant proportion of black veterans across sites. The observed racial imbalance both within and across sites increases the variance of the race coefficient more in the random coefficient (RC) model than in the random intercept (RI) model. Compared to PQL2, the analytic variances using MQL1 are severely downwardly biased with smaller variance components. The simulation variances are virtually identical to the analytic variances for these data. For a given power, somewhat smaller log odds ratios can be detected in the RI model than in the RC model. The derived formulas provide a basis for planning multi-center studies when a predictor of primary importance is highly imbalanced both between and within sites.
In studies of racial disparities in health care, the site-specific population distributions are often known from administrative data. The public health relevance of this work is that these methods for unbalanced data may facilitate more effective planning of multi-center studies of racial disparities.
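
The link between the variance of a coefficient and the log odds ratio detectable at a given power can be illustrated with a minimal sketch. It is not the dissertation's procedure: it assumes a two-sided Wald test at a significance level of 0.05 (the abstract states only the 80% power), and the function name `min_detectable_log_or` and the example variance are hypothetical, chosen for illustration.

```python
from statistics import NormalDist


def min_detectable_log_or(variance, power=0.80, alpha=0.05):
    """Approximate smallest |log odds ratio| detectable by a two-sided Wald test.

    variance: variance of the estimated coefficient (for example, from an
              analytic variance formula). Assumed known here.
    power:    target power (0.80 mirrors the abstract).
    alpha:    two-sided significance level (0.05 is an assumption).
    """
    if variance <= 0:
        raise ValueError("variance must be positive")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = std_normal.inv_cdf(power)          # quantile for the target power
    return (z_alpha + z_power) * variance ** 0.5


# Hypothetical variance of the race coefficient, not a value from the study.
example_variance = 0.01
delta = min_detectable_log_or(example_variance)
print(f"Minimum detectable log odds ratio: {delta:.4f}")
```

Under this approximation the detectable log odds ratio scales with the square root of the variance, so a model with a larger coefficient variance can only detect a larger effect at the same power.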


Details

Item Type:

University of Pittsburgh ETD

Status:

Unpublished

Creators/Authors:

Creators	Email	Pitt Username	ORCID
Chen, Huanyu	huc6@pitt.edu, chenhy98@hotmail.com	HUC6

ETD Committee:

Title	Member	Email Address	Pitt Username
Committee Chair	Stone, Roslyn A	roslyn@pitt.edu	ROSLYN
Committee Member	Jeong, Jong-Hyeon	jeong@nsabp.pitt.edu	JJEONG
Committee Member	Fine, Michael J	Michael.Fine@va.gov
Committee Member	Sharma, Ravi K	rks1946@pitt.edu	RKS1946
Committee Member	Mazumdar, Sati	maz1@pitt.edu	MAZ1

Date:

21 June 2007

Date Type:

Completion

Defense Date:

23 April 2007

Approval Date:

21 June 2007

Submission Date:

13 April 2007

Access Restriction:

5 year -- Restrict access to University of Pittsburgh for a period of 5 years.

Institution:

University of Pittsburgh

Schools and Programs:

School of Public Health > Biostatistics

Degree:

PhD - Doctor of Philosophy

Thesis Type:

Doctoral Dissertation

Refereed:

Yes

Uncontrolled Keywords:

random coefficient model; first order marginal quasi-likelihood estimation; health service research; random intercept model; second order penalized quasi-likelihood estimation; racial disparities

Other ID:

http://etd.library.pitt.edu/ETD/available/etd-04132007-121242/, etd-04132007-121242

Date Deposited:

10 Nov 2011 19:37

Last Modified:

15 Nov 2016 13:40

URI:

http://d-scholarship.pitt.edu/id/eprint/7108
