getMappedEntrezIDs {missMethyl} | R Documentation |
Given a set of CpG probe names and optionally all the CpG sites tested, this function outputs a list containing the mapped Entrez Gene IDs as well as the numbers of probes per gene, and a vector indicating significance.
getMappedEntrezIDs(
  sig.cpg,
  all.cpg = NULL,
  array.type = c("450K", "EPIC"),
  anno = NULL,
  genomic.features = c("ALL", "TSS200", "TSS1500", "Body", "1stExon", "3'UTR", "5'UTR", "ExonBnd")
)
sig.cpg |
Character vector of significant CpG sites used for testing gene set enrichment. |
all.cpg |
Character vector of all CpG sites tested. Defaults to all CpG sites on the array. |
array.type |
The Illumina methylation array used. Options are "450K" or "EPIC". |
anno |
Optional. A |
genomic.features |
Character vector or scalar indicating whether the gene set enrichment analysis should be restricted to CpGs from specific genomic locations. Options are "ALL", "TSS200", "TSS1500", "Body", "1stExon", "3'UTR", "5'UTR" and "ExonBnd"; any combination may be selected. Defaults to "ALL". |
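As a rough illustration of what restricting by genomic feature means, the base R sketch below filters a toy annotation table. It assumes that each probe carries a semicolon-separated string of feature labels; the object and column names (toy_ann, group, wanted) and the probe names are hypothetical, and this is not the package's own code.

```r
# Toy annotation: one row per probe, with hypothetical feature labels
toy_ann <- data.frame(
  cpg   = c("cg01", "cg02", "cg03"),
  group = c("TSS200;Body", "Body", "5'UTR;1stExon"),
  stringsAsFactors = FALSE
)

# Features we want to restrict the analysis to
wanted <- c("TSS200", "1stExon")

# Split each label string and keep probes with at least one wanted feature
groups <- strsplit(toy_ann$group, ";", fixed = TRUE)
keep   <- vapply(groups, function(g) any(g %in% wanted), logical(1))

kept_cpgs <- toy_ann$cpg[keep]
kept_cpgs
```

With the real function, the same restriction is requested by passing a character vector such as c("TSS200", "1stExon") as genomic.features.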
This function is used by the gene set testing functions gometh and gsameth. It maps the significant CpG probe names, as well as all the CpG sites tested, to Entrez Gene IDs. It also calculates the number of probes per gene. Input CpGs can be restricted by genomic feature using the genomic.features argument.
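The mapping and counting described above can be sketched in base R on toy data. This is only an illustration under stated assumptions, not the package's implementation: toy_map, the probe names and the gene IDs are hypothetical, and the real function takes its probe-to-gene pairs from the array annotation.

```r
# Toy annotation: one row per CpG-gene pair (hypothetical names and IDs)
toy_map <- data.frame(
  cpg    = c("cg01", "cg01", "cg02", "cg03", "cg04", "cg05"),
  entrez = c("10",   "20",   "10",   "30",   "30",   "40"),
  stringsAsFactors = FALSE
)

all_cpg <- c("cg01", "cg02", "cg03", "cg04", "cg05")  # all CpGs tested
sig_cpg <- c("cg01", "cg03")                          # significant CpGs

# Keep only the tested probes, then derive the gene universe
tested   <- toy_map[toy_map$cpg %in% all_cpg, ]
universe <- unique(tested$entrez)

# Genes hit by at least one significant probe
sig_eg <- unique(tested$entrez[tested$cpg %in% sig_cpg])

# Number of probes associated with each gene
freq <- table(tested$entrez)

# 0/1 indicator over the universe: is the gene among the significant ones?
de <- as.integer(universe %in% sig_eg)
names(de) <- universe
```

Here cg01 maps to two genes, so both are counted as significant. That one-to-many mapping is the multi-gene bias that the equiv and fract.counts outputs account for.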
Genes associated with each CpG site are obtained from the annotation package IlluminaHumanMethylation450kanno.ilmn12.hg19 if the array type is "450K". For the EPIC array, the annotation package IlluminaHumanMethylationEPICanno.ilm10b4.hg19 is used. To use a different annotation package, please supply it using the anno argument.
A list with the following elements:
sig.eg |
mapped Entrez Gene IDs for the significant probes |
universe |
mapped Entrez Gene IDs for all probes on the array, or for all the CpG probes tested. |
freq |
table output with numbers of probes associated with each gene |
equiv |
table output with equivalent numbers of probes associated with each gene taking into account multi-gene bias |
de |
a vector of ones and zeroes of the same length as universe, indicating which genes in the universe are significantly differentially methylated. |
fract.counts |
a data frame with two columns: the Entrez Gene IDs for the significant probes, and the associated weight that accounts for multi-gene probes. |
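One way to picture a per-gene weight for multi-gene probes is to split each probe's contribution equally across the genes it maps to. The base R sketch below does this on toy data. The equal-split scheme is an assumption made for illustration, and the package's exact formula may differ; toy_map, the probe names and the gene IDs are hypothetical.

```r
# Toy annotation: one row per CpG-gene pair (hypothetical names and IDs)
toy_map <- data.frame(
  cpg    = c("cg01", "cg01", "cg02", "cg03", "cg04", "cg05"),
  entrez = c("10",   "20",   "10",   "30",   "30",   "40"),
  stringsAsFactors = FALSE
)
sig_cpg <- c("cg01", "cg03")

# Number of genes each probe maps to (cg01 maps to two genes)
genes_per_probe <- table(toy_map$cpg)

# Assumed scheme: a probe mapping to k genes contributes 1/k to each
toy_map$weight <- 1 / as.numeric(genes_per_probe[toy_map$cpg])

# Sum the fractional contributions of significant probes per gene
sig_rows <- toy_map[toy_map$cpg %in% sig_cpg, ]
fract <- aggregate(weight ~ entrez, data = sig_rows, FUN = sum)
fract
```

Under this scheme, a gene reached only through a probe shared with one other gene receives a weight of 0.5 instead of 1.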
Belinda Phipson
## Not run: 
# to avoid timeout on Bioconductor build
library(IlluminaHumanMethylation450kanno.ilmn12.hg19)
library(org.Hs.eg.db)
library(limma)
ann <- getAnnotation(IlluminaHumanMethylation450kanno.ilmn12.hg19)

# Randomly select 1000 CpGs to be significantly differentially methylated
sigcpgs <- sample(rownames(ann),1000,replace=FALSE)

# All CpG sites tested
allcpgs <- rownames(ann)

mappedEz <- getMappedEntrezIDs(sigcpgs,allcpgs,array.type="450K")
names(mappedEz)

# Entrez IDs of the significant genes
mappedEz$sig.eg[1:10]

# Entrez IDs for the universe
mappedEz$universe[1:10]

# Number of CpGs per gene
mappedEz$freq[1:10]

# Equivalent numbers of CpGs measured per gene
mappedEz$equiv[1:10]

# A vector of 0s and 1s indicating which genes in the universe are significant
mappedEz$de[1:10]

## End(Not run)