Marginal analysis of measurement agreement among multiple raters with non-ignorable missing

Documentation

Description

The following programs and data were used in the article "Marginal analysis of measurement agreement among multiple raters with non-ignorable missing" by Zhen Chen and Yunlong Xie, published in Statistics and Its Interface (2014). The programs were written by Y. Xie. If you have any questions regarding the programs, please email Yunlong.xie@nih.gov.

The programs in this file are:

visualization.R, the R code that generates the clean data for analysis;
R_function_file.R, the R code that contains all of the necessary R functions;
AnalysisProj.R, the R code that analyzes the data.

Abstract of the article

In diagnostic medicine, several measurements have been developed to evaluate the agreements among raters when the data are complete. In practice, raters may not be able to give definitive ratings to some participants because symptoms may not be clear-cut. Simply removing subjects with missing ratings may produce biased estimates and result in loss of efficiency. In this article, we propose a within-cluster resampling (WCR) procedure and a marginal approach to handle non-ignorable missing data in measurement agreement. Simulation studies show that both WCR and marginal approach provide unbiased estimates and have coverage probabilities close to the nominal level. The proposed methods are applied to data set from the Physician Reliability Study in diagnosing endometriosis.

Important functions in R_code: R_function_file.R

FLSkp	computes Fleiss' kappa for a given data set; this function works for binary data only
sim	generates simulated data
resample	resamples from the data and computes kappa
resampleMN	resamples from the data and computes kappa
WGEE	computes estimates based on weighted GEE
EMkp	computes the empirical kappa
Delete1	creates missing ratings completely at random (MCAR)
Delete2	creates the first type of non-ignorable missingness, depending on P(Y=1)
Delete3	creates the second type of non-ignorable missingness, depending on the variability of the ratings within each subject
EMmissingKP	computes the empirical kappa
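
For orientation, the standard Fleiss' kappa for complete binary ratings can be sketched as follows. This is a minimal illustration of the textbook formula, not the code of FLSkp{} itself; the function name fleiss_kappa_binary and the small example matrix are hypothetical and do not come from R_function_file.R.

```r
# Hypothetical helper (not part of R_function_file.R): textbook Fleiss' kappa
# for a complete 0/1 rating matrix, rows = subjects, columns = raters.
fleiss_kappa_binary <- function(B) {
  B <- as.matrix(B)
  stopifnot(ncol(B) >= 2, all(B %in% c(0, 1)))  # complete binary data only

  n_subjects <- nrow(B)
  n_raters   <- ncol(B)

  ones  <- rowSums(B)        # raters giving rating 1, per subject
  zeros <- n_raters - ones   # raters giving rating 0, per subject

  # Observed agreement per subject, then averaged over subjects
  P_i   <- (ones^2 + zeros^2 - n_raters) / (n_raters * (n_raters - 1))
  P_bar <- mean(P_i)

  # Chance agreement from the pooled proportion of rating 1
  p1  <- sum(ones) / (n_subjects * n_raters)
  P_e <- p1^2 + (1 - p1)^2

  (P_bar - P_e) / (1 - P_e)
}

# Example: 4 subjects rated by 4 raters
B_example <- rbind(c(1, 1, 1, 1),
                   c(0, 0, 0, 0),
                   c(1, 1, 0, 0),
                   c(1, 0, 0, 0))
kappa_example <- fleiss_kappa_binary(B_example)
```

Note that the sketch stops on missing ratings, which matches the complete-data setting in which Fleiss' kappa is defined.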

Usage

FLSkp{B}
sim{p, r, N, p2_1, p3_1, p4_1, p5_1, p6_1, p7_1, p8_1}
resample{A}
resampleMN{A, J}
WGEE{A}
EMkp{p, r, N, p2_1, p3_1, p4_1, R, J}
Delete1{X, Pmiss}
Delete2{X, Pmiss}
Delete3{X, a, b}
EMmissingKP{p, r, N, p2_1, p3_1, p4_1, p5_1, p6_1, p7_1, p8_1, R, J, Pmiss, a, b}

Arguments

B	the matrix of binary ratings (0,1), with rows representing subjects and columns representing raters
A	A=B+1; this changes the binary labelling from (0,1) to (1,2)
r	the number of raters, default r=4
p	the marginal probability
N	the sample size
p2_1	the conditional probability P(Y2=1|Y1=1); p3_1, p4_1, and so on are defined similarly
J	the number of times of resampling
R	the number of replicates of the simulated data generated in order to compute the true Fleiss Kappa
Pmiss	the probability of assigning missing
X	simulated data
a	the location parameter in the second non-ignorable missingness mechanism, used in Delete3{} and EMmissingKP{}, where Pmiss=pnorm(a + b*Var(Y))
b	the scale parameter in the second non-ignorable missingness mechanism, used in Delete3{} and EMmissingKP{}, where Pmiss=pnorm(a + b*Var(Y))
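
The two missingness mechanisms that are fully specified above (MCAR with probability Pmiss, and Pmiss=pnorm(a + b*Var(Y))) might be sketched as follows. This is an illustration under stated assumptions, not the code of Delete1{} or Delete3{}: the function names delete_mcar and delete_by_variance are hypothetical, Var(Y) is assumed to be the sample variance of a subject's ratings across raters, and each individual rating is assumed to be set to missing independently.

```r
# Hypothetical sketches (not part of R_function_file.R).

# MCAR: every rating is set to NA independently with probability Pmiss.
delete_mcar <- function(X, Pmiss) {
  X <- as.matrix(X)
  miss <- matrix(runif(length(X)) < Pmiss, nrow = nrow(X))
  X[miss] <- NA
  X
}

# Non-ignorable missingness driven by within-subject variability:
# Pmiss_i = pnorm(a + b * Var(Y_i)), with Var(Y_i) assumed to be the
# sample variance of subject i's ratings across raters.
delete_by_variance <- function(X, a, b) {
  X <- as.matrix(X)
  stopifnot(ncol(X) >= 2)
  v <- apply(X, 1, var)          # one variance per subject (row)
  p_miss <- pnorm(a + b * v)     # one missingness probability per subject
  # p_miss has length nrow(X), so it is recycled down each column and
  # entry [i, j] is compared with p_miss[i].
  miss <- matrix(runif(length(X)), nrow = nrow(X)) < p_miss
  X[miss] <- NA
  X
}

# Example: 10 subjects, 4 raters; a = -4 as in the example script, b chosen arbitrarily
set.seed(1)
X_demo <- matrix(rbinom(40, 1, 0.7), nrow = 10)
X_miss <- delete_by_variance(X_demo, a = -4, b = 10)
```

With a negative a and a positive b, subjects whose raters disagree are more likely to lose ratings than subjects with unanimous ratings, which is what makes this mechanism non-ignorable.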

Authors

Zhen Chen and Yunlong Xie
chenzhe@mail.nih.gov

Examples

Example with simulated data:

Function sim{}:

set.seed(1)
A1=sim(0.7,4, 10000,0.8,0.9)
A1$cor12
A1$EMcor12
A1$cor13
A1$EMcor13
A1$cor23
A1$EMcor23

colSums(A1$Data)/nrow(A1$Data)
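
One way to generate correlated binary ratings from a marginal probability p and the conditional probabilities P(Yj=1|Y1=1) is sketched below. It is only an illustration and is not the code of sim{}: the function name sim_sketch is hypothetical, all raters are assumed to share the same marginal probability p, and raters 2, 3, ... are assumed to be conditionally independent given rater 1.

```r
# Hypothetical sketch (not part of R_function_file.R).
# p          : marginal probability P(Yj = 1), assumed equal for all raters
# cond_probs : vector of P(Yj = 1 | Y1 = 1) for raters j = 2, 3, ...
# N          : number of subjects
sim_sketch <- function(p, cond_probs, N) {
  stopifnot(p > 0, p < 1, all(cond_probs >= 0 & cond_probs <= 1))

  # P(Yj = 1 | Y1 = 0), solved from p = p * cond_probs + (1 - p) * q
  # so that every rater keeps marginal probability p
  q <- p * (1 - cond_probs) / (1 - p)
  stopifnot(all(q <= 1))

  y1 <- rbinom(N, 1, p)
  others <- sapply(seq_along(cond_probs), function(j) {
    rbinom(N, 1, ifelse(y1 == 1, cond_probs[j], q[j]))
  })
  unname(cbind(y1, others))  # rows = subjects, columns = raters
}

set.seed(1)
D <- sim_sketch(0.7, c(0.8, 0.9), 10000)
colSums(D) / nrow(D)   # each column should be close to p = 0.7
```

Under these assumptions the correlation between rater 1 and rater j is (pj_1 - p)/(1 - p), which can be compared with the empirical value from cor(D[, 1], D[, j]).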

Function EMkp{}:

completeKP=EMkp(0.7,4, 100,0.8,0.9,0.85,1000, 1000)

Function EMmissingKP{}:

p=as.double(commandArgs()[3])#marginal probability
r=as.double(commandArgs()[4])#number of raters
N=as.double(commandArgs()[5])#sample size
p2_1=as.double(commandArgs()[6])#P(Y2=1|Y1=1)
p3_1=as.double(commandArgs()[7])#P(Y3=1|Y1=1)
p4_1=as.double(commandArgs()[8])#P(Y4=1|Y1=1)
p5_1=as.double(commandArgs()[9])#P(Y5=1|Y1=1)
p6_1=as.double(commandArgs()[10])#P(Y6=1|Y1=1)
p7_1=as.double(commandArgs()[11])#P(Y7=1|Y1=1)
p8_1=as.double(commandArgs()[12])#P(Y8=1|Y1=1)

R=as.double(commandArgs()[13])#the number of replicates of the simulated data generated
J=as.double(commandArgs()[14])#the number of times of resampling
Pmiss=as.double(commandArgs()[15])#probability of assigning missingness
b=as.double(commandArgs()[16])#scale parameter b in Pmiss=pnorm(a + b*Var(Y))
a=-4#location parameter a in Pmiss=pnorm(a + b*Var(Y)), fixed here

set.seed(1)
SimResult=EMmissingKP(p,r, N, p2_1,p3_1,p4_1,p5_1,p6_1,p7_1,p8_1,R,J,Pmiss,a,b)
FLS=round(sim(p,r, N, p2_1,p3_1,p4_1,p5_1,p6_1,p7_1,p8_1)$FLS,2)

Example with analysis data set:

(see AnalysisProj.R)
