Biostatistics and Bioinformatics Branch (BBB)

Generalized Functional Linear Models for Gene-based Case-Control Association Studies

Ruzong Fan and Yifan Wang, NICHD/NIH, April 2014

1 Overview

This document describes an R package that implements generalized functional linear models for testing association between a dichotomous trait and multiple genetic variants in a genetic region while adjusting for covariates (Fan et al. 2014). Section 2 briefly describes how to install the program. Section 3 explains how to run the program using one example. Section 4 explains the results and gives cautions for using the program.

The theoretical basis for this program is given in our research papers listed in the References. Please cite these papers if you use the program in any published work. For suggestions, questions, or problems, contact us by e-mail (fanr@mail.nih.gov).

2 Download and Installation

Download the R code files “GFLM_beta_smooth_only.R” and “GFLM_fixed_model.R” and the example file “GFLM_Example_from_SKAT.R” from GFLM_web.zip. Also download “FLM_FPCA.R” and “FLM_FPCA_no_position.R” from FLM_web.zip, an R package that implements functional linear models for association analysis of quantitative traits (Fan et al. 2013). Put the files in a directory you can access.
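Before running anything, it can help to confirm that the downloaded files and the R packages listed in Section 3 are in place. The following is a minimal sketch, not part of the distributed code: the directory name `code_dir` and the helper `missing_pkgs` are hypothetical, and the installation commands are left commented out. To the best of our knowledge, globaltest is distributed through Bioconductor rather than CRAN.

```r
# Packages named in Section 3 of this document.
required <- c("fda", "MASS", "SKAT", "Matrix", "globaltest")

# Hypothetical helper: return the packages that are not yet installed.
# Nothing is installed by this function.
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_pkgs(required)
if (length(to_install) > 0) {
  message("Not installed: ", paste(to_install, collapse = ", "))
  # One way to install them (assumption: CRAN for most, Bioconductor for globaltest):
  # install.packages(setdiff(to_install, "globaltest"))
  # BiocManager::install("globaltest")
}

# Hypothetical location of the unzipped files; change this to your own directory.
code_dir <- "path/to/GFLM"
files <- c("GFLM_beta_smooth_only.R", "GFLM_fixed_model.R",
           "FLM_FPCA.R", "FLM_FPCA_no_position.R")
paths <- file.path(code_dir, files)

# Source the files that exist and report any that are missing.
found <- file.exists(paths)
if (!all(found)) {
  message("Not found in ", code_dir, ": ", paste(files[!found], collapse = ", "))
}
for (p in paths[found]) source(p)
```

With the real directory substituted for `code_dir`, the final loop loads the functions used in Section 3.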

3 How to Run the Program

The package requires the R packages fda, MASS, SKAT, Matrix, and globaltest. Make sure to install them before running our code. Open the file “GFLM_Example_from_SKAT.R” in an R console on a PC. Change the paths in it so that they point to the directory on your computer that contains “GFLM_beta_smooth_only.R”, “GFLM_fixed_model.R”, “FLM_FPCA.R”, and “FLM_FPCA_no_position.R”. Then you may run the program. The following results are based on the example dataset in SKAT.

> data(SKAT.example)
> names(SKAT.example)
[1] "Z" "X" "y.c" "y.b"
> attach(SKAT.example)
> pheno = y.b
> geno = Z
> covariate = X
> pos = c(1:67)
> order = 4
> bbasis = 10
> gbasis = 10
> fbasis = 11
> gfasis = 11

> gflm_fixed_model(pheno, mode = "Additive", geno, pos, order, bbasis, fbasis,
gbasis, covariate, base = "bspline", interaction = FALSE)

$LRT
[1] 0.6468418
$Chisq
[1] 0.6468418
$Rao
[1] 0.648487
$gt
[1] 0.6142503

> gflm_fixed_model(pheno, mode = "Additive", geno, pos, order, bbasis, fbasis, gfasis, covariate, base = "fspline", interaction = FALSE)

$LRT
[1] 0.5484012
$Chisq
[1] 0.5484012
$Rao
[1] 0.5617545
$gt
[1] 0.8418333

> gflm_beta_smooth_only(pheno, mode = "Additive", geno, pos, order, bbasis, covariate, base = "bspline", interaction = FALSE)

$LRT
[1] 0.6468418
$Chisq
[1] 0.6468418
$Rao
[1] 0.648487
$gt
[1] 0.579115

> gflm_beta_smooth_only(pheno, mode = "Additive", geno, pos, order, fbasis,
covariate, base = "fspline", interaction = FALSE)

$LRT
[1] 0.5484012
$Chisq
[1] 0.5484012
$Rao
[1] 0.5617545
$gt
[1] 0.843888

> flm_fpca_no_position(pheno, mode = "Additive", geno, covariates = SKAT.example$X, kz = 20, kb = 10, smooth.cov=FALSE, family = "binomial")

$LRT
[1] 0.578574
$Chisq
[1] 0.578574
$Rao
[1] 0.5844966
$gt
[1] 0.6085071

> fpca = flm_fpca(pheno, mode = "Additive", geno, covariates = SKAT.example$X, pos,
kz = 20, kb = 10, smooth.cov=FALSE, family = "binomial")

> fpca

$LRT
[1] 0.5350435
$Chisq
[1] 0.5350435
$Rao
[1] 0.5467956
$gt
[1] 0.4799877
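In the example above, pos = c(1:67) supplies 67 positions, presumably one per column of the genotype matrix Z. When substituting your own data, a quick dimension check before calling the functions can catch mismatched inputs early. The sketch below is an illustration only: `check_inputs` is a hypothetical helper that is not part of the package, the 0/1 coding of the trait is an assumption, and the data are simulated toy values rather than the SKAT example.

```r
# Hypothetical helper: basic consistency checks on the inputs.
check_inputs <- function(pheno, geno, pos, covariate = NULL) {
  geno <- as.matrix(geno)
  if (length(pheno) != nrow(geno)) {
    stop("pheno and geno disagree on the number of subjects")
  }
  if (length(pos) != ncol(geno)) {
    stop("pos needs one entry per variant (column of geno)")
  }
  # Assumption: the dichotomous trait is coded 0/1 with no missing values.
  if (!all(pheno %in% c(0, 1))) {
    stop("pheno is expected to be coded 0/1")
  }
  if (!is.null(covariate) && nrow(as.matrix(covariate)) != nrow(geno)) {
    stop("covariate and geno disagree on the number of subjects")
  }
  invisible(TRUE)
}

# Toy data: 20 subjects, 5 variants, 2 covariates (simulated, for illustration).
set.seed(1)
geno      <- matrix(rbinom(20 * 5, size = 2, prob = 0.3), nrow = 20, ncol = 5)
pheno     <- rbinom(20, size = 1, prob = 0.5)
pos       <- 1:5
covariate <- matrix(rnorm(20 * 2), nrow = 20, ncol = 2)

check_inputs(pheno, geno, pos, covariate)  # passes silently
```

If any check fails, the function stops with a message naming the mismatch.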

4 Explanation of the Results and Warnings

As shown in Section 3, the program outputs four p-values, based on the likelihood ratio test (LRT), the χ² test (Chisq), Rao’s efficient score test (Rao), and the global test (gt). The LRT p-value is the same as the χ² p-value, and these tests inflate type I error rates (Fan et al. 2014). Rao and gt have conservative and accurate type I error rates (Fan et al. 2014). If you use the R code to analyze your data, we recommend reporting the p-values of Rao and gt.
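Following this recommendation, the two p-values can be pulled out of the result by name. The sketch below assumes the functions return a plain named list with elements LRT, Chisq, Rao, and gt, as the printed output in Section 3 suggests. To stay self-contained, it rebuilds the first result shown there by hand instead of calling the package.

```r
# Values copied from the first gflm_fixed_model() output in Section 3.
# Assumption: the real return value is a named list of this shape.
res <- list(LRT = 0.6468418, Chisq = 0.6468418, Rao = 0.648487, gt = 0.6142503)

# Keep only the two recommended p-values.
recommended <- unlist(res[c("Rao", "gt")])
print(recommended)

# The LRT and chi-squared p-values coincide, as noted above.
isTRUE(all.equal(res$LRT, res$Chisq))
```

With the package loaded, `res` would instead be the value returned by a call such as gflm_fixed_model(...).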

5 References

  1. Fan RZ, Wang YF, Mills JL, Wilson AF, Bailey-Wilson JE, and Xiong MM (2013) Functional linear models for association analysis of quantitative traits. Genetic Epidemiology, 37, 726-742.
  2. Fan RZ, Wang YF, Mills JL, Carter TC, Lobach I, Wilson AF, Bailey-Wilson JE, Weeks DE, and Xiong MM (2014) Generalized functional linear models for case-control association studies. Genetic Epidemiology, in revision.
Last Updated Date: 04/25/2014
Last Reviewed Date: 04/25/2014

Contact Information

Name: Dr Paul Albert
Chief and Senior Investigator
Biostatistics and Bioinformatics Branch
Phone: 301-496-5582
E-mail: albertp@mail.nih.gov
