computeBs {AffiXcan} | R Documentation |
Fit a linear model to impute a GReX for a certain gene
computeBs(assocRegions, pca, expr, covariates = NULL)
assocRegions |
A data.frame with the associations between regulatory regions and one expressed gene, and with colnames = c("REGULATORY_REGION", "EXPRESSED_REGION") |
pca |
The returningObject$pca from affiXcanTrain() |
expr |
A matrix containing the real total expression values, where the columns are genes and the rows are individual IIDs |
covariates |
Optrional; a data.frame with covariates values for the population structure where the columns are the PCs and the rows are the individual IIDs; default is NULL |
A list containing three objects:
coefficients: An object containing the coefficients of the principal components used in the model, completely similar to the "coefficients" from the results of lm()
p.val: The uncorrected anova pvalue of the model
r.sq: The coefficient of determination between the real total expression values and the imputed GReX, retrived from summary(model)$r.squared
if (interactive()) { data(exprMatrix) data(trainingCovariates) data(regionAssoc) tbaPaths <- system.file("extdata","training.tba.toydata.rds", package="AffiXcan") regionsCount <- overlookRegions(tbaPaths) assay <- "values" sampleNames <- colnames(exprMatrix) nSamples <- length(sampleNames) sampGroups <- subsetKFold(k=5, n=nSamples) for (i in seq(length(sampGroups))) { sampGroups[[i]] <- colnames(exprMatrix)[sampGroups[[i]]] } testingSamples <- sampGroups[[1]] trainingSamples <- sampleNames[!sampleNames %in% testingSamples] pca <- affiXcanPca(tbaPaths=tbaPaths, varExplained=80, scale=TRUE, regionsCount=regionsCount, trainingSamples=trainingSamples) cov <- trainingCovariates cov <- cov[rownames(cov) %in% trainingSamples,] expr <- SummarizedExperiment::assays(exprMatrix)[[assay]] expr <- expr[,colnames(expr) %in% trainingSamples] expr <- t(as.data.frame(expr)) expressedRegions <- as.vector(unique(regionAssoc$EXPRESSED_REGION)) assocList <- BiocParallel::bplapply(X=expressedRegions, FUN=assoc2list, regionAssoc) names(assocList) <- expressedRegions assocRegions <- assocList[[1]] bs <- computeBs(assocRegions=assocRegions, pca=pca, expr=expr, covariates=cov) }