PMM functions: LR{nlme} and LRx{nlme}

SRE functions: SRE{nlme} and SREx{nlme}

R Documentation

 

Pattern Mixture Model and Shared Random Effects Model

Description

The pattern mixture model (PMM), proposed by Liu and Albert (Biostatistics (2014) 15 (4): 706-718), predicts a binary disease status from a longitudinal sequence of biomarkers. The shared random effects model (SRE), proposed by Albert (Statistics in Medicine 31(2), 143-154, 2012), fits a linear mixed-effects model for predicting binary events from longitudinal data under random effects misspecification. The PMM functions only work for balanced designs, that is, designs in which every subject has the same number of observations taken at the same fixed time points.
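The balanced-design requirement can be checked before calling LR or LRx. The following is a minimal sketch in base R; the helper name is_balanced is hypothetical and is not part of these functions.

```r
# Hypothetical helper (not part of the package): returns TRUE when every
# subject is observed at exactly the same set of time points.
is_balanced <- function(id, time) {
  times_by_id <- lapply(split(time, id), sort)   # sorted times per subject
  ref <- as.numeric(times_by_id[[1]])            # first subject as reference
  all(vapply(times_by_id,
             function(x) identical(as.numeric(x), ref),
             logical(1)))
}

id <- rep(1:3, each = 4)
tt <- rep(1:4, 3)
is_balanced(id, tt)             # balanced: 4 visits for each subject
is_balanced(id[-12], tt[-12])   # unbalanced: subject 3 lacks time 4
```

If the check fails, the data would need to be restricted to a common set of time points before the PMM functions are used.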

Usage

LR(Y.long,TIMEF,TIMER,DD,ID,TR)

LRx(Y.long,TIMEF,TIMER,DD,ID,TR,Xmat)

 

SRE(Y.long,TIMEF,TIMER,DD,ID,TR,Kp,subs)

SREx(Y.long,TIMEF,TIMER,DD,ID,TR,Kp,subs,Xmat)

Arguments

Y.long

Vector of outcome (long format).

TIMEF

Matrix of fixed effects design with an intercept (rows should match the length of Y.long).

 

TIMER

Matrix of random effects design with an intercept (rows should match the length of Y.long).

 

DD

Gold standard of binary events (short format, same length as the total number of subjects, N).

 

ID

Clustering variable (length should match Y.long).

 

TR

Vector indicating the training set (1 = training observation, 0 = test observation; length should match Y.long).

 

Kp

Design vector for the prediction time point, e.g., c(1,4.5,4.5^2) for time 4.5 under a quadratic time trend (only used for the SRE model).

 

subs

Subset of time points used for prediction, e.g., c(2,3,4) for the last three of four time points (only used for the SRE model).

 

Xmat

Matrix of adjusted factors (short format, rows should match the length of DD).
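Because the arguments mix long format (one row per observation) and short format (one row per subject), it can help to verify their dimensions before fitting. The sketch below is only an illustration of the length rules stated above; check_inputs is a hypothetical helper, not part of these functions.

```r
# Hypothetical helper (not part of the package): checks the long/short
# format rules described in the Arguments section.
check_inputs <- function(Y.long, TIMEF, TIMER, DD, ID, TR, Xmat = NULL) {
  n_obs <- length(Y.long)          # number of observations (long format)
  n_sub <- length(unique(ID))      # number of subjects (short format)
  stopifnot(
    nrow(TIMEF) == n_obs,
    nrow(TIMER) == n_obs,
    length(ID) == n_obs,
    length(TR) == n_obs,
    length(DD) == n_sub,
    all(DD %in% c(0, 1))
  )
  if (!is.null(Xmat)) stopifnot(nrow(Xmat) == n_sub)
  invisible(TRUE)
}

# Toy input: 3 subjects, 4 time points each
N <- 3; K <- 4
ID <- rep(1:N, each = K)
tt <- rep(1:K, N)
Y.long <- rnorm(N * K)
TIMEF <- TIMER <- cbind(1, tt, tt^2)
DD <- c(0, 1, 1)
TR <- rep(c(1, 1, 0), each = K)
check_inputs(Y.long, TIMEF, TIMER, DD, ID, TR, Xmat = cbind(1, rnorm(N)))
```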

 

Details

SRE and SREx do not provide standard errors. Users can obtain the standard errors of the predicted disease probabilities with a nonparametric bootstrap.
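One possible way to set up such a bootstrap is sketched below. It resamples training subjects (whole clusters) with replacement, keeps the test subjects fixed, recodes ID as 1 up to N, refits, and takes the standard deviation of the predicted risks across replicates. The names fit_fun and boot_se are hypothetical, and fit_fun is a stand-in logistic regression used only so that the sketch runs without the package; in practice its body would be replaced by a call such as SRE(...)$risk. The sketch also assumes a balanced design with rows sorted by ID.

```r
set.seed(1)

# Stand-in predictor (an assumption, NOT the SRE model): logistic regression
# of DD on each subject's mean biomarker, fitted on the training subjects.
# Returns one predicted risk per subject, in ID order.
fit_fun <- function(dat, DD) {
  ybar  <- as.numeric(tapply(dat$Y, dat$ID, mean))
  train <- as.numeric(tapply(dat$TR, dat$ID, max)) == 1
  fit   <- glm(DD ~ ybar, family = binomial, subset = train)
  as.numeric(predict(fit, newdata = data.frame(ybar = ybar), type = "response"))
}

# Hypothetical helper: cluster bootstrap over training subjects.
boot_se <- function(dat, DD, B = 50) {
  K <- nrow(dat) / length(DD)                  # observations per subject
  is_train  <- as.numeric(tapply(dat$TR, dat$ID, max)) == 1
  train_ids <- which(is_train)
  test_ids  <- which(!is_train)
  risks <- replicate(B, {
    ids  <- c(sample(train_ids, replace = TRUE), test_ids)
    rows <- unlist(lapply(ids, function(i) which(dat$ID == i)))
    d    <- dat[rows, ]
    d$ID <- rep(seq_along(ids), each = K)      # recode IDs as 1 up to N
    fit_fun(d, DD[ids])[-seq_along(train_ids)] # keep test-subject risks only
  })
  apply(risks, 1, sd)                          # one SE per test subject
}

# Toy data: 60 subjects, 4 visits, first 40 subjects form the training set
N <- 60; K <- 4; M <- 40
DD  <- rbinom(N, 1, 0.5)
dat <- data.frame(ID = rep(1:N, each = K), tt = rep(1:K, N))
dat$Y  <- rnorm(N * K, mean = dat$tt + rep(DD, each = K))
dat$TR <- as.numeric(dat$ID <= M)
se <- boot_se(dat, DD, B = 50)
length(se)   # one standard error per test subject (N - M)
```

Resampling subjects rather than individual observations preserves the within-subject correlation; a larger B than the 50 used here would normally be preferred.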

LR and SRE do not allow subject-level covariates, while LRx and SREx do.

Value

LR and LRx return the predicted probabilities p for all subjects in the test sample using the PMM approach, together with the standard errors of the log odds, log(p/(1-p)).
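Because the standard errors are reported on the log-odds scale, a Wald-type confidence interval for p can be formed on that scale and then transformed back. The sketch below uses made-up values for p and for the standard errors; the object names p and se_logit are illustrative and do not refer to the actual names of the returned components.

```r
# Illustrative values only, not output from LR or LRx
p        <- c(0.10, 0.50, 0.90)   # predicted probabilities
se_logit <- c(0.30, 0.20, 0.40)   # SEs of log(p / (1 - p))

z     <- qnorm(0.975)             # 95% two-sided normal quantile
logit <- qlogis(p)                # log(p / (1 - p))

# Interval on the log-odds scale, mapped back to the probability scale
ci <- cbind(lower    = plogis(logit - z * se_logit),
            estimate = p,
            upper    = plogis(logit + z * se_logit))
ci
```

Working on the log-odds scale keeps the interval limits inside (0, 1).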

SRE and SREx return the predicted probabilities p for all subjects in both the training and test samples, using the SRE approach.

Author(s)

Danping Liu

danping.liu@nih.gov

Example(s)

# An example on a simulated data set

 

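## rmvnorm() comes from the mvtnorm package and roc.area() from the verification package

library(mvtnorm)

library(verification)

set.seed(100)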

N=2000

 

### first M subjects as training set

M=1000

 

### cluster size

K=4

Kp=4.5

pD=0.5

 

## observation time points

tt=rep(1:K,N)

 

## note that ID should be coded to be 1 up to N

ID=rep(1:N,each=K)

tt2=tt^2

 

## design matrix for fixed effects and random effects

X=Z=cbind(1,tt,tt2)

tempZ=Z[1:K,]

tempX=X[1:K,]

bt1=c(0,3,-0.4)

bt0=c(0,2.5,-0.4)

V1=diag(c(0.2,0.05,0.012))

rho12=rho13=rho23=0.4

V1[1,2]=V1[2,1]=rho12*sqrt(V1[1,1]*V1[2,2])

V1[1,3]=V1[3,1]=rho13*sqrt(V1[1,1]*V1[3,3])

V1[3,2]=V1[2,3]=rho23*sqrt(V1[3,3]*V1[2,2])

V0=diag(c(0.15,0.04,0.01))

rho12=rho13=rho23=0

V0[1,2]=V0[2,1]=rho12*sqrt(V0[1,1]*V0[2,2])

V0[1,3]=V0[3,1]=rho13*sqrt(V0[1,1]*V0[3,3])

V0[3,2]=V0[2,3]=rho23*sqrt(V0[3,3]*V0[2,2])

 

## generate data according to the PMM model

mu0=X%*%bt0

mu1=X%*%bt1

DD=rbinom(N,1,pD)

longD=rep(DD,each=K)

 

bb0=rmvnorm(N,sigma=V0)

bb1=rmvnorm(N,sigma=V1)

r.ef0=c(tempZ%*%t(bb0))

r.ef1=c(tempZ%*%t(bb1))

 

eps0=rnorm(N*K,mean=0,sd=sqrt(0.5))

eps1=rnorm(N*K,mean=0,sd=sqrt(0.45))

Y0=mu0+r.ef0+eps0

Y1=mu1+r.ef1+eps1

Y=Y0

Y[longD==0]=Y0[longD==0]

Y[longD==1]=Y1[longD==1]

wideY=t(matrix(Y,K,N))

CC=(rnorm(N))

CC.long=rep(CC,each=K)

 

## wide format and long format of the data set

mydat.long=data.frame(Y,tt,CC.long,longD,ID)

mydat.wide=data.frame(wideY,CC,DD)

 

## 1 indicates observations in the training set

TR=rep(0,N*K)

TR[1:(M*K)]=1

 

## note that DD needs to be in short format

LR.nocov=LR(Y.long=mydat.long$Y,TIMEF=cbind(1,mydat.long$tt,mydat.long$tt^2),

TIMER=cbind(1,mydat.long$tt,mydat.long$tt^2),DD=mydat.wide$DD,ID=mydat.long$ID,TR=TR)

 

## note that the covariates CC and Xmat need to be in short format

LR.cov=LRx(Y.long=mydat.long$Y,TIMEF=cbind(1,mydat.long$tt,mydat.long$tt^2),

TIMER=cbind(1,mydat.long$tt,mydat.long$tt^2),DD=mydat.wide$DD,ID=mydat.long$ID,TR=TR,

Xmat=cbind(1,mydat.wide$CC))

 

SRE.nocov=SRE(Y.long=mydat.long$Y,TIMEF=cbind(1,mydat.long$tt,mydat.long$tt^2),

TIMER=cbind(1,mydat.long$tt,mydat.long$tt^2),DD=mydat.wide$DD,ID=mydat.long$ID,TR=TR,Kp=c(1,Kp,Kp^2),subs=1:K)

 

SRE.cov=SREx(Y.long=mydat.long$Y,TIMEF=cbind(1,mydat.long$tt,mydat.long$tt^2),

TIMER=cbind(1,mydat.long$tt,mydat.long$tt^2),DD=mydat.wide$DD,ID=mydat.long$ID,TR=TR,Kp=c(1,Kp,Kp^2),subs=1:K,

Xmat=cbind(1,mydat.wide$CC))

 

roc.area(mydat.wide$DD[(M+1):N],LR.nocov$risk)

roc.area(mydat.wide$DD[(M+1):N],LR.cov$risk)

roc.area(mydat.wide$DD[(M+1):N],SRE.nocov$risk[(M+1):N])

roc.area(mydat.wide$DD[(M+1):N],SRE.cov$risk[(M+1):N])
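If the verification package is not available, the area under the ROC curve can also be computed in base R from the Mann-Whitney rank statistic. The helper auc below is a hypothetical illustration and is not part of these functions.

```r
# Hypothetical helper: AUC via the Mann-Whitney rank statistic.
# obs is the 0/1 gold standard, pred the predicted risk.
auc <- function(obs, pred) {
  r  <- rank(pred)                 # midranks, so ties count as one half
  n1 <- sum(obs == 1)              # number of events
  n0 <- sum(obs == 0)              # number of non-events
  (sum(r[obs == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

auc(c(0, 0, 1, 1), c(0.1, 0.4, 0.35, 0.8))   # 0.75
```

With the objects from the example, a call such as auc(mydat.wide$DD[(M+1):N], LR.nocov$risk) would be the analogue of the first roc.area line.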

References

Liu, D. and Albert, P.S. (2014). Combination of longitudinal biomarkers in predicting binary events. Biostatistics 15(4), 706-718.

Albert, P.S. (2012). A linear mixed model for predicting a binary event from longitudinal data under random effects misspecification. Statistics in Medicine 31(2), 143-154.