trainingCovariates {AffiXcan} | R Documentation
Toy data used in the examples that illustrate the affiXcanTrain() function.
data(trainingCovariates)
An object of class data.frame
This object is a data.frame in which the columns are the first three principal components of the population genetic structure and each row corresponds to an individual, identified by ID. These are the same individuals whose expression values are stored in the expression matrix (see help(exprMatrix)).
Genotypes of the individuals were downloaded from the GEUVADIS public dataset (https://www.ebi.ac.uk/arrayexpress/files/E-GEUV-1/) in VCF format. Following L. Price et al. (https://www.sciencedirect.com/science/article/pii/S0002929708003534), long-range linkage disequilibrium (LRLD) regions were first filtered out with vcftools. Then, following J. Novembre et al. (www.nature.com/articles/nature07331), non-common alleles (MAF < 0.05) were filtered out with vcftools and LD pruning was performed with plink. Finally, principal components were computed with eigenstrat.
data(trainingCovariates)
head(trainingCovariates)
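The correspondence between the rows of the covariates and the individuals in an expression matrix can be illustrated with a minimal sketch. It does not use the packaged data: `toyCovariates`, `toyExpr`, the column names `PC1` to `PC3`, and the IDs are all hypothetical stand-ins invented for illustration, and the layout of the expression matrix (genes in rows, individuals in columns) is an assumption.

```r
# Hypothetical stand-in for trainingCovariates: three principal components
# for four made-up individuals (IDs, column names and values are invented).
set.seed(1)
toyCovariates <- data.frame(
  PC1 = rnorm(4),
  PC2 = rnorm(4),
  PC3 = rnorm(4),
  row.names = c("ID_A", "ID_B", "ID_C", "ID_D")
)

# Hypothetical expression matrix: genes in rows, individuals in columns,
# deliberately stored in a different order than the covariates.
toyExpr <- matrix(
  rnorm(8),
  nrow = 2,
  dimnames = list(c("gene1", "gene2"), c("ID_C", "ID_A", "ID_D", "ID_B"))
)

# Check that both objects describe the same set of individuals.
sameIndividuals <- setequal(rownames(toyCovariates), colnames(toyExpr))

# Reorder the covariate rows to follow the column order of the expression matrix.
alignedCovariates <- toyCovariates[colnames(toyExpr), , drop = FALSE]
```

With real data, the same check could be applied to the row names of trainingCovariates and the individual IDs of the expression data, provided the IDs are stored as assumed here.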