Principal investigator: Aiyi Liu, Ph.D.
The classical approach to statistical inference on a distribution is based on a random sample from that distribution. Practical concerns, such as the cost of a study, often mean that only the sum (or average) of a group of observations sampled from the target distribution is observed. This strategy has been used frequently in microarray designs and in studies of biomarkers of exposure in epidemiology (Design and Analysis of Pooled Blood Samples). These pooled data, i.e., the sums, follow a distribution that is a convolution of the target distribution with itself. In regression models, the sums are treated either as an independent variable (covariate) or as a dependent variable.
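To make the pooling idea concrete, the following is a minimal simulation sketch, not the estimating-equation method developed in this project. It assumes independent observations, equal pool sizes, and a normal target distribution chosen purely for illustration; the function names `pool_averages` and `moment_estimates` are hypothetical. It relies only on the fact that the average of k independent observations has the same mean as a single observation and variance equal to the individual variance divided by k.

```python
import random
import statistics


def pool_averages(values, pool_size):
    """Average consecutive groups of `pool_size` values.

    Any incomplete final group is dropped.
    """
    n_pools = len(values) // pool_size
    return [
        statistics.fmean(values[i * pool_size:(i + 1) * pool_size])
        for i in range(n_pools)
    ]


def moment_estimates(pooled, pool_size):
    """Estimate the individual-level mean and variance from pool averages."""
    mean_hat = statistics.fmean(pooled)
    # Var(pool average) = sigma^2 / pool_size, so rescale the sample variance.
    var_hat = pool_size * statistics.variance(pooled)
    return mean_hat, var_hat


# Illustrative data: normal with mean 5 and standard deviation 2 (assumed).
rng = random.Random(0)
individual = [rng.gauss(5.0, 2.0) for _ in range(40000)]

pooled = pool_averages(individual, pool_size=4)
mean_hat, var_hat = moment_estimates(pooled, pool_size=4)
print(f"pools: {len(pooled)}, mean: {mean_hat:.3f}, variance: {var_hat:.3f}")
```

Only the first two moments are recovered this way. Features of the target distribution beyond the mean and variance are not identified by this simple rescaling, which is part of the motivation for the semi- and non-parametric methods described here.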
To date, methods for inference on the target distribution (or the regression model) based on pooled data are scarce. This research focuses on developing semi- and non-parametric inference methods based on pooled data, on inference for general regression models (generalized linear models, survival models, etc.) with pooled dependent or independent variables, and on applications of these methods to gene microarray analysis and biomarkers of exposure in epidemiology.
DESPR Collaborators
· Enrique F. Schisterman, Ph.D.
· Mi-Xia Wu, Ph.D.
Selected Publications
Bondell H, Liu A, & Schisterman EF. (2007). Statistical inference based on pooled data: a moment-based estimating equation approach. Journal of Applied Statistics, 34:129-140.
Schisterman EF, Perkins N, Liu A, & Bondell H. (2005). Optimal cut-point and its corresponding Youden Index to discriminate individuals using pooled blood samples. Epidemiology, 16(1):73-81.
Liu A, Schisterman EF, & Teoh E. (2004). Sample size and power calculation in comparing diagnostic accuracy of biomarkers with pooled assessments. Journal of Applied Statistics, 31:49-59.
Liu A & Schisterman EF. (2003). Comparison of diagnostic accuracy of biomarkers with pooled assessments. Biometrical Journal, 45:631-644.