Because of a lapse in government funding, the information on this website may not be up to date, transactions submitted via the website may not be processed, and the agency may not be able to respond to inquiries until appropriations are enacted.

The NIH Clinical Center (the research hospital of NIH) is open. For more details about its operating status, please visit cc.nih.gov.

Updates regarding government operating status and resumption of normal operations can be found at OPM.gov.

Joint analysis of longitudinal and survival data measured on nested time-scales using shared parameter models: an application to fecundity data

Documentation

Description

The following programs and data were utilized in the manuscript "Joint analysis of longitudinal and survival data measured on nested time-scales using shared parameter models: an application to fecundity data" by Alexander C. McLain, Rajeshwari Sundaram, and Germaine M. Buck Louis.  The programs were written by A McLain.  If you have any questions regarding the programs please contact A McLain (mclaina@mailbox.sc.edu).  The analysis was originally completed in May of 2012 with R v2.15.0.

The data in this file is:

  • Oxford_train_day_level.csv and Oxford_test_day_level.csv: Day level data used to estimate parameters and prediction error, respectively.  Each row of this file corresponds to one day for one women.  The variables included are:
    • Y = Intercourse indicator.
    • ind = Woman ID number.        
    • Intercept = 1 for all rows.         
    • FemaleAge = the woman's age minus 31.         
    • Cigarettes/ Alcohol = indicator that the woman consumed at least one cigarette/alcoholic beverage on a given day.          
    • Period = indicator that menstrual bleeding occurred on a given day.
    • Weekend = indicator that the given day was a Fri/Sat/Sun.      
    • Parity = Indicator of one previous live birth.    
    • Parity*Age = interaction between parity and age.        
    • Lag = Lag 1 intercourse value.
    • Cyc-1 = cycle number minus 1.              
    • Day = day relative to ovulation (if known)        
    • day_miss = indicator that the day relative to ovulation was known.
    • RE_ind = indicator of which rows will be used for the empirical bayes estimates (for prediction only).
  • Oxford_train_women_level.csv and Oxford_test_women_level.csv: Woman level data used to estimate parameters and prediction error, respectively.  Each row of this file corresponds to one woman.  The variables included are:
    • Age = standardized value for female age.
    • Parity = Indicator of one previous live birth.
    • TTP = minimum of the woman’s time-to-pregnancy and censoring.
    • CEN = indicator of censoring.
    • Pre_ovJ = length of the follicular phase for cycle J. Set to ‘0’ if unknown and ‘-1’ if unobserved.
    • ind_po = indicator that the woman had a missing follicular phase length.

This file contains 3 programs:

  • R_programs.R = contains all of the necessary R functions.
  • Oxford.Anal.R = used to estimate the parameters (will take ~ 3 weeks to run). 
  • Oxford_prediction.R = used to implement the prediction procedure.

functions in R_code 

R_programs.R
mjm.like.aq {},  mjm.EB {}

Usage

mjm.like.aq(par,Dat,BK)  in conjunction with an optimization program, such as optim or nls.
mjm.EB(par,Dat,nodes,knots,BK): used to get empirical Bayes estimates of the random effects.

Arguments

par: The values of the following parameters
beta: a vector of length q (= dim(X)[2]) of regression coefficients to be used in the intercourse model.
sigma: log standard deviation of the baseline random effect in the intercourse model. 
sigK: log standard deviation of the peakedness random effect in the intercourse model.
sigY: log standard deviation of the baseline random effect in the pre-ovular model.
xi: vector of 2 parameters for the effect the intercourse random effects have on the TTP model.
lambda: vector of max(T) baseline intercepts (i.e., an intercept for each cycle)
betaT: a vector of length pT (i.e., dim(X_T)[2]) to be used for regression coefficients in the TTP model
phi: coefficients for the cubic B-splines function.
rho: a vector coefficients of length pk for the peakedness function.
cov12: correlation between the baseline and peakedness random effects.
mu_po,sigma_po: coefficients for the location and scale of the lognormal preovular model.
Dat: This is a list of all data.  Which includes:

$Y: a vector of intercourse indicators (length L= total number of days observed),
$ind: a vector of indicators of person number, should be consecutively numbered 1:n (length L),
$ind_po: a vector of indicators of all cycles being observed (length n=number of subjects),
$X: intercourse covariate matrix (dimension L x q),
$X_P: intercourse peakedness covariate matrix (dimension L x pk),
$day: vector of day relative to ovulation (length L, set to the day of cycle if unknown),
$day_miss: vector of indicators that the day relative to ovulation known (length L),
$T: a vector of TTP or Censoring values (length n),
$CEN: a vector of censoring indicators (length n),
$X_T: a matrix of covariates for the TTP model (dimension n x pT),
$nij: a matrix of pre-ovular lengths (dimension n x max(T), set to -1 if unobserved).
nodes: number of quadrature nodes.
knots: the values of the knots for the cubic b-spline function.
BK: boundary knots for the cubic b-spline function.

Output

The output of function mjm.like.aq(par,Dat,BK)  is  a sum(log(likelihood))

The output of function mjm.EB(par,Dat,nodes,knots,BK) is  a   n x 3 matrix of empirical Bayes Random Effects.

Details

The function mjm.like.aq(par,Dat,BK)  evaluates a joint model of intercourse and TTP.  Pre-ovular length is also modeled, since the intercourse model is a function of pre-ovular length and pre-ovular length presents informative missingness.  The likelihood is evaluated using adaptive Gaussian quadrature.  First the people with full pre-ovular information are evaluated.  Then those with missing pre-ovular information are evaluated.

The function mjm.EB(par,Dat,nodes,knots,BK) produces empirical Bayes estimates of the Random Effects.

Authors:

Alexander C. McLain, Rajeshwari Sundaram, and Germaine M. Buck Louis

mclaina@mailbox.sc.edu

Examples:

See following R-programs:

  • Oxford.Anal.R :  used to estimate the parameters (will take ~ 3 weeks to run). 
  • Oxford_prediction.R :  used to implement the prediction procedure.

Back to Joint analysis of longitudinal and survival data measured on nested time-scales using shared parameter models: an application to fecundity data