This step identifies enriched motifs in a set of probes and is carried out by the function get.enriched.motif.

This function uses a pre-calculated dataset, Probes.motif, which was generated using HOMER with a \(p\text{-value} \le 10^{-4}\) to scan a \(\pm 250\) bp region around each probe using HOmo sapiens COmprehensive MOdel COllection http://hocomoco.autosome.ru/ v10 (Kulakovskiy et al. 2016) position weight matrices (PWMs). For each probe set tested (i.e. the list of gene-linked hypomethylated probes in a given group), a motif enrichment Odds Ratio and a 95% confidence interval were calculated using the following formulas:

\[ p = \frac{a}{a+b} \]
\[ P = \frac{c}{c+d} \]
\[ Odds\quad Ratio = \frac{\frac{p}{1-p}}{\frac{P}{1-P}} \]
\[ SD = \sqrt{\frac{1}{a}+\frac{1}{b}+\frac{1}{c}+\frac{1}{d}} \]
\[ lower\quad boundary\quad of\quad 95\%\quad confidence\quad interval = \exp{(\ln{OR}-SD)} \]

where \(a\) is the number of probes within the selected probe set that contain one or more motif occurrences; \(b\) is the number of probes within the selected probe set that do not contain a motif occurrence; \(c\) and \(d\) are the same counts within the entire enhancer probe set.

A probe set was considered significantly enriched for a particular motif if the lower boundary of the 95% confidence interval of the Odds Ratio was greater than 1.1 (specified by the option lower.OR; 1.1 is default) and the motif occurred at least 10 times in the probe set (specified by the option min.incidence; 10 is default). As described in the text, Odds Ratios were also used for ranking candidate motifs.
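The formulas above can be sketched in a few lines of R. This is only an illustration, not the package's internal code: the helper name motif.odds.ratio and the counts passed to it are hypothetical, and the lower boundary is computed exactly as written in the formula above.

```r
# Hypothetical helper (not part of the package): motif enrichment from the
# four counts a, b, c, d defined in the text.
motif.odds.ratio <- function(a, b, c, d) {
  p <- a / (a + b)                      # motif frequency in the selected probe set
  P <- c / (c + d)                      # motif frequency in the background probe set
  or <- (p / (1 - p)) / (P / (1 - P))   # Odds Ratio
  sd <- sqrt(1 / a + 1 / b + 1 / c + 1 / d)
  lower <- exp(log(or) - sd)            # lower boundary, as in the formula above
  list(OR = or, lowerOR = lower)
}

# Made-up counts, for illustration only
a <- 30; b <- 70; c <- 1000; d <- 9000
res <- motif.odds.ratio(a, b, c, d)

# Apply the two default thresholds described above
lower.OR <- 1.1
min.incidence <- 10
is.enriched <- res$lowerOR > lower.OR && a >= min.incidence
print(res)
print(is.enriched)
```

With these made-up counts the motif is present in 30% of the selected probes versus 10% of the background probes, so it passes both default thresholds.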
Argument | Description |
---|---|
data | A MultiAssayExperiment object from the createMAE function. If set, and probes.motif or background.probes is missing, it is used to derive those two arguments. This argument is not required; you can set probes.motif and background.probes manually. |
probes | A vector of probe names defining the set of probes for which the motif enrichment OR and confidence interval will be calculated. |
lower.OR | A number specifying the smallest lower boundary of the 95% confidence interval for the Odds Ratio. Motifs whose lower boundary exceeds this number are considered significantly enriched (see the reference for details). 1.1 is default. |
min.incidence | A non-negative integer specifying the minimum incidence of the motif in the given probe set. 10 is default. |
# Load results from previous sections
mae <- get(load("mae.rda"))
sig.diff <- read.csv("result/getMethdiff.hypo.probes.significant.csv")
pair <- read.csv("result/getPair.hypo.pairs.significant.csv")
head(pair) # significantly hypomethylated probes with putative target genes
## Probe GeneID Symbol Sides Raw.p Pe
## 1 cg00255699 ENSG00000117000 RLF R7 2.526319e-05 0.00990099
## 2 cg00255699 ENSG00000183682 BMP8A L9 1.595468e-06 0.00990099
## 3 cg00340127 ENSG00000091483 FH L7 1.460821e-04 0.00990099
## 4 cg00393673 ENSG00000143157 POGK R9 1.537896e-06 0.00990099
## 5 cg00393673 ENSG00000143179 UCK2 R6 3.212894e-11 0.00990099
## 6 cg00393673 ENSG00000143183 TMCO1 R5 5.737522e-04 0.00990099
## distNearestTSS
## 1 319786
## 2 349939
## 3 567446
## 4 1314196
## 5 302283
## 6 243590
# Identify enriched motifs for significantly hypomethylated probes which
# have putative target genes.
enriched.motif <- get.enriched.motif(data = mae,
                                     probes = pair$Probe,
                                     dir.out = "result",
                                     label = "hypo",
                                     min.incidence = 10,
                                     lower.OR = 1.1)
names(enriched.motif) # enriched motifs
## [1] "HXA9_HUMAN.H11MO.0.B" "FOS_HUMAN.H11MO.0.A"
## [3] "STAT2_HUMAN.H11MO.0.A" "ZN232_HUMAN.H11MO.0.D"
## [5] "HXC6_HUMAN.H11MO.0.D" "ZSC16_HUMAN.H11MO.0.D"
## [7] "FOXJ3_HUMAN.H11MO.1.B" "IRF9_HUMAN.H11MO.0.C"
## [9] "CDX2_HUMAN.H11MO.0.A" "IRF2_HUMAN.H11MO.0.A"
## [11] "FOSB_HUMAN.H11MO.0.A" "CPEB1_HUMAN.H11MO.0.D"
## [13] "FOSL1_HUMAN.H11MO.0.A" "JUNB_HUMAN.H11MO.0.A"
## [15] "IRF1_HUMAN.H11MO.0.A" "PO5F1_HUMAN.H11MO.1.A"
## [17] "SIX1_HUMAN.H11MO.0.A" "NF2L2_HUMAN.H11MO.0.A"
## [19] "JUND_HUMAN.H11MO.0.A" "JUN_HUMAN.H11MO.0.A"
## [21] "ZN282_HUMAN.H11MO.0.D" "MNX1_HUMAN.H11MO.0.D"
## [23] "ZBED1_HUMAN.H11MO.0.D" "STAT1_HUMAN.H11MO.1.A"
head(enriched.motif[names(enriched.motif)[1]]) ## probes in the given set that have the first motif.
## $HXA9_HUMAN.H11MO.0.B
## [1] "cg13247117" "cg18345456" "cg25210796" "cg24446417" "cg11718886"
## [6] "cg13305336" "cg24593832" "cg13067635" "cg12213388" "cg26607897"
## [11] "cg24873093" "cg25362585" "cg14094320" "cg25729466" "cg17901924"
# get.enriched.motif automatically saves output files.
# getMotif.hypo.enriched.motifs.rda contains the enriched motifs and the probes with each motif.
# getMotif.hypo.motif.enrichment.csv contains a summary of the enriched motifs.
dir(path = "result", pattern = "getMotif")
## [1] "getMotif.hypo.enriched.motifs.rda"
## [2] "getMotif.hypo.motif.enrichment.csv"
# The motif enrichment figure is generated automatically.
dir(path = "result", pattern = "motif.enrichment.pdf")
## [1] "hypo.quality.A-DS.motif.enrichment.pdf"
## [2] "hypo.quality.A-DS_with_summary.motif.enrichment.pdf"
Kulakovskiy, Ivan V, Ilya E Vorontsov, Ivan S Yevshin, Anastasiia V Soboleva, Artem S Kasianov, Haitham Ashoor, Wail Ba-Alawi, et al. 2016. “HOCOMOCO: Expansion and Enhancement of the Collection of Transcription Factor Binding Sites Models.” Nucleic Acids Research 44 (D1). Oxford Univ Press: D116–D125.