PMM functions: LR{nlme} and LRx{nlme}; SRE functions: SRE{nlme} and SREx{nlme} | R Documentation
Description
The pattern mixture model (PMM), proposed by Liu and Albert (Biostatistics (2014) 15 (4): 706-718), predicts a binary disease status from a longitudinal sequence of biomarkers.
The shared random effects (SRE) model, proposed by Albert (Statistics in Medicine 31(2), 143-154, 2012), fits a linear mixed-effects model to predict binary events from longitudinal data under random effects misspecification.
PMM models only work for balanced designs, that is, designs in which every subject has the same number of observations taken at the same fixed time points.
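The balance requirement can be checked before calling LR or LRx. The following is a minimal sketch in base R; `is_balanced` is a hypothetical helper written for illustration and is not one of the documented functions.

```r
# Hypothetical helper (not part of the documented functions): returns TRUE
# when every subject is observed at exactly the same set of time points.
is_balanced <- function(id, time) {
  # Sorted time points for each subject
  per_subject <- lapply(split(time, id), sort)
  reference <- per_subject[[1]]
  all(vapply(per_subject,
             function(tp) length(tp) == length(reference) && all(tp == reference),
             logical(1)))
}

# Toy data: 3 subjects, 4 visits each
id   <- rep(1:3, each = 4)
time <- rep(1:4, times = 3)
balanced_ok <- is_balanced(id, time)

# Dropping one visit of subject 2 makes the design unbalanced
balanced_bad <- is_balanced(id[-5], time[-5])
```

Here `balanced_ok` is TRUE and `balanced_bad` is FALSE, because subject 2 is left with three visits instead of four.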
Usage
LR(Y.long,TIMEF,TIMER,DD,ID,TR)
LRx(Y.long,TIMEF,TIMER,DD,ID,TR,Xmat)
SRE(Y.long,TIMEF,TIMER,DD,ID,TR,Kp,subs)
SREx(Y.long,TIMEF,TIMER,DD,ID,TR,Kp,subs,Xmat)
Arguments
Y.long | Vector of outcomes (long format).
TIMEF | Design matrix for the fixed effects, including an intercept column (the number of rows should match the length of Y.long).
TIMER | Design matrix for the random effects, including an intercept column (the number of rows should match the length of Y.long).
DD | Gold standard of the binary events (short format, same length as the total number of subjects, N).
ID | Clustering variable (length should match Y.long).
TR | Vector indicating the training set (length should match Y.long).
Kp | Time point for prediction, expressed in the same basis as the design matrix, e.g., c(1,4.5,4.5^2) (only used for the SRE model).
subs | Subset of time points used for prediction, e.g., c(2,3,4) for the last three of four time points (only used for the SRE model).
Xmat | Matrix of adjustment covariates (short format, the number of rows should match the length of DD).
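The long-format and short-format requirements above are easy to mix up. As an illustration only, the sketch below builds each argument from a small long-format data frame; the data frame `long`, its column names, and its values are assumptions made for this example.

```r
# Toy long-format data: N subjects, K visits each (values are arbitrary)
N <- 5; K <- 4
long <- data.frame(
  ID = rep(1:N, each = K),
  tt = rep(1:K, times = N),
  Y  = seq_len(N * K) / 10,
  D  = rep(c(0, 1, 0, 1, 1), each = K)   # disease status repeated per visit
)

# Long-format arguments: one row or element per observation
TIMEF <- cbind(1, long$tt, long$tt^2)    # intercept, time, time^2
TIMER <- TIMEF                           # same design for the random effects
TR    <- as.numeric(long$ID <= 3)        # first 3 subjects form the training set

# Short-format argument: one element per subject
DD <- long$D[!duplicated(long$ID)]

# Prediction time 4.5 written in the same basis as the TIMEF columns
Kp <- c(1, 4.5, 4.5^2)

# Length checks implied by the argument descriptions
stopifnot(nrow(TIMEF) == length(long$Y),
          length(TR) == length(long$Y),
          length(DD) == N)
```

Because TIMEF has the columns intercept, time, and time squared, Kp carries the same three terms evaluated at the prediction time.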
Details
SRE and SREx do not provide standard errors. Users can run a nonparametric bootstrap to obtain standard errors for the predicted disease probabilities.
LR and SRE do not allow subject-level covariates, while LRx and SREx do.
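One way to set up such a bootstrap is to resample whole training subjects with replacement, refit, and take the standard deviation of each test subject's predicted probability across replicates. The sketch below shows only this resampling logic. The function `fit_predict` is a hypothetical stand-in (a logistic regression on each subject's mean outcome) used so that the example runs on its own; in practice it would be replaced by a call to SRE or SREx. The toy data and the number of replicates B are also assumptions.

```r
set.seed(1)
# Toy data: N subjects, K visits; the first M subjects are the training set
N <- 60; M <- 40; K <- 4
DD <- rbinom(N, 1, 0.5)
dat <- data.frame(ID = rep(1:N, each = K), tt = rep(1:K, times = N))
dat$Y <- 0.5 * dat$tt + rep(DD, each = K) + rnorm(N * K)

# Stand-in for SRE()/SREx(): logistic regression on each subject's mean outcome.
# Returns predicted probabilities for the test subjects only.
fit_predict <- function(train_long, train_DD, test_long) {
  xbar_tr <- as.numeric(tapply(train_long$Y, train_long$ID, mean))
  xbar_te <- as.numeric(tapply(test_long$Y, test_long$ID, mean))
  fit <- glm(d ~ x, family = binomial,
             data = data.frame(d = train_DD, x = xbar_tr))
  as.numeric(predict(fit, newdata = data.frame(x = xbar_te), type = "response"))
}

test_long <- dat[dat$ID > M, ]
B <- 100                                     # number of bootstrap replicates
boot_p <- replicate(B, {
  picked <- sample(1:M, M, replace = TRUE)   # resample training subjects
  # Stack the rows of each picked subject and relabel IDs as 1..M so that
  # a subject drawn twice is treated as two distinct clusters
  rows <- unlist(lapply(picked, function(i) which(dat$ID == i)))
  boot_long <- dat[rows, ]
  boot_long$ID <- rep(1:M, each = K)
  fit_predict(boot_long, DD[picked], test_long)
})

# One bootstrap standard error per test subject
se_p <- apply(boot_p, 1, sd)
```

Resampling subjects rather than individual observations keeps each subject's longitudinal sequence intact, and relabelling the IDs keeps them coded as 1 up to the number of subjects.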
Value
LR and LRx return the predicted probabilities p for all subjects in the test sample using the PMM approach, as well as the standard errors of the log odds (SE of log(p/(1-p))).
SRE and SREx return the predicted probabilities p for all subjects in both the training and test samples, using the SRE approach.
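Because the standard errors are reported on the log-odds scale, an interval for p is usually built on that scale and then transformed back. The sketch below assumes a normal approximation on the log-odds scale; the vectors `p` and `se_logit` are made-up values standing in for the LR/LRx output, whose element names are not assumed here.

```r
# Hypothetical values standing in for the LR/LRx output
p        <- c(0.10, 0.50, 0.90)   # predicted probabilities
se_logit <- c(0.30, 0.20, 0.40)   # standard errors of log(p/(1-p))

logit <- qlogis(p)                        # log(p / (1 - p))
z     <- qnorm(0.975)                     # 95% two-sided normal quantile
lower <- plogis(logit - z * se_logit)     # back-transform to the probability scale
upper <- plogis(logit + z * se_logit)

# Delta-method approximation of the SE on the probability scale
se_p <- p * (1 - p) * se_logit
```

Building the interval on the log-odds scale keeps both limits inside (0, 1), which a symmetric interval around p would not guarantee.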
Author(s)
Danping Liu
Example(s)
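# An example on a simulated data set
# rmvnorm() comes from the mvtnorm package and roc.area() from the verification package
library(mvtnorm)
library(verification)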
set.seed(100)
N=2000
### first M subjects as training set
M=1000
### cluster size
K=4
Kp=4.5
pD=0.5
## observation time points
tt=rep(1:K,N)
## note that ID should be coded as 1 up to N
ID=rep(1:N,each=K)
tt2=tt^2
## design matrix for fixed effects and random effects
X=Z=cbind(1,tt,tt2)
tempZ=Z[1:K,]
tempX=X[1:K,]
bt1=c(0,3,-0.4)
bt0=c(0,2.5,-0.4)
V1=diag(c(0.2,0.05,0.012))
rho12=rho13=rho23=0.4
V1[1,2]=V1[2,1]=rho12*sqrt(V1[1,1]*V1[2,2])
V1[1,3]=V1[3,1]=rho13*sqrt(V1[1,1]*V1[3,3])
V1[3,2]=V1[2,3]=rho23*sqrt(V1[3,3]*V1[2,2])
V0=diag(c(0.15,0.04,0.01))
rho12=rho13=rho23=0
V0[1,2]=V0[2,1]=rho12*sqrt(V0[1,1]*V0[2,2])
V0[1,3]=V0[3,1]=rho13*sqrt(V0[1,1]*V0[3,3])
V0[3,2]=V0[2,3]=rho23*sqrt(V0[3,3]*V0[2,2])
## generate data according to the PMM model
mu0=X%*%bt0
mu1=X%*%bt1
DD=rbinom(N,1,pD)
longD=rep(DD,each=K)
bb0=rmvnorm(N,sigma=V0)
bb1=rmvnorm(N,sigma=V1)
r.ef0=c(tempZ%*%t(bb0))
r.ef1=c(tempZ%*%t(bb1))
eps0=rnorm(N*K,mean=0,sd=sqrt(0.5))
eps1=rnorm(N*K,mean=0,sd=sqrt(0.45))
Y0=mu0+r.ef0+eps0
Y1=mu1+r.ef1+eps1
Y=Y0
Y[longD==0]=Y0[longD==0]
Y[longD==1]=Y1[longD==1]
wideY=t(matrix(Y,K,N))
CC=(rnorm(N))
CC.long=rep(CC,each=K)
## wide format and long format of the data set
mydat.long=data.frame(Y,tt,CC.long,longD,ID)
mydat.wide=data.frame(wideY,CC,DD)
## 1 indicates observations in the training set
TR=rep(0,N*K)
TR[1:(M*K)]=1
## note that DD needs to be in short format
LR.nocov=LR(Y.long=mydat.long$Y,TIMEF=cbind(1,mydat.long$tt,mydat.long$tt^2),
TIMER=cbind(1,mydat.long$tt,mydat.long$tt^2),DD=mydat.wide$DD,ID=mydat.long$ID,TR=TR)
## note that the covariates CC and Xmat need to be in short format
LR.cov=LRx(Y.long=mydat.long$Y,TIMEF=cbind(1,mydat.long$tt,mydat.long$tt^2),
TIMER=cbind(1,mydat.long$tt,mydat.long$tt^2),DD=mydat.wide$DD,ID=mydat.long$ID,TR=TR,
Xmat=cbind(1,mydat.wide$CC))
SRE.nocov=SRE(Y.long=mydat.long$Y,TIMEF=cbind(1,mydat.long$tt,mydat.long$tt^2),
TIMER=cbind(1,mydat.long$tt,mydat.long$tt^2),DD=mydat.wide$DD,ID=mydat.long$ID,TR=TR,Kp=c(1,Kp,Kp^2),subs=1:K)
SRE.cov=SREx(Y.long=mydat.long$Y,TIMEF=cbind(1,mydat.long$tt,mydat.long$tt^2),
TIMER=cbind(1,mydat.long$tt,mydat.long$tt^2),DD=mydat.wide$DD,ID=mydat.long$ID,TR=TR,Kp=c(1,Kp,Kp^2),subs=1:K,
Xmat=cbind(1,mydat.wide$CC))
roc.area(mydat.wide$DD[(M+1):N],LR.nocov$risk)
roc.area(mydat.wide$DD[(M+1):N],LR.cov$risk)
roc.area(mydat.wide$DD[(M+1):N],SRE.nocov$risk[(M+1):N])
roc.area(mydat.wide$DD[(M+1):N],SRE.cov$risk[(M+1):N])
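The calls above compare the models by the area under the ROC curve in the test sample. As a cross-check that does not depend on any package, the same quantity can be computed from ranks using the Mann-Whitney identity. The sketch below uses base R only; `auc_rank` is a hypothetical helper and the four data points are made up.

```r
# Hypothetical helper: area under the ROC curve via the Mann-Whitney identity
auc_rank <- function(obs, pred) {
  n1 <- sum(obs == 1)                    # number of events
  n0 <- sum(obs == 0)                    # number of non-events
  r  <- rank(pred)                       # midranks handle ties
  (sum(r[obs == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Toy check: 3 of the 4 (event, non-event) pairs are ordered correctly
obs  <- c(0, 0, 1, 1)
pred <- c(0.1, 0.4, 0.35, 0.8)
auc_toy <- auc_rank(obs, pred)           # 0.75
```

The value is the proportion of (event, non-event) pairs in which the event receives the higher predicted risk, with ties counted as one half.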
References
Liu, D. and Albert, P.S. (2014). Combination of longitudinal biomarkers in predicting binary events. Biostatistics 15(4), 706-718.
Albert, P.S. (2012). A linear mixed model for predicting a binary event from longitudinal data under random effects misspecification. Statistics in Medicine 31(2), 143-154.