Citation: if you use MAGeCKFlute in published research, please cite: Binbin Wang, Mei Wang, Wubing Zhang. “Integrative analysis of pooled CRISPR genetic screens using MAGeCKFlute.” Nature Protocols (2019), doi: 10.1038/s41596-018-0113-7.
library(MAGeCKFlute)
file1 = file.path(system.file("extdata", package = "MAGeCKFlute"),
"testdata/rra.gene_summary.txt")
gdata = ReadRRA(file1)
genelist = gdata$Score
names(genelist) = gdata$id
genelist[1:5]
## CREBBP EP300 CHD C16orf72 CACNB2
## 0.96608 1.02780 0.59265 0.82307 0.39268
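The examples that follow repeatedly select strongly negative scores with `genelist[genelist< -1]`. As a minimal sketch on a toy vector (the gene names and scores below are made up for illustration and are not part of the test data), this is what that subset returns. The space in `< -1` matters: written without it, R would parse `<-` as an assignment.

```r
# Toy named score vector; names and values are hypothetical.
toy = c(GENE_A = -2.3, GENE_B = 0.4, GENE_C = -1.2, GENE_D = 1.8)

# Keep genes with a score below -1 (note the space between `<` and `-1`).
depleted = toy[toy < -1]
depleted

# Sorting puts the most negative scores first.
sort(depleted)
```

The result is still a named numeric vector, which is the input shape the enrichment calls in this section are given.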
MAGeCKFlute incorporates three enrichment methods: the over-representation test (ORT), gene set enrichment analysis (GSEA), and the hypergeometric test (HGT). ORT and GSEA are borrowed from the R package clusterProfiler (Yu et al. 2012).
# Alternative functions EnrichAnalyzer and enrich.HGT.
hgtRes1 = EnrichAnalyzer(genelist[genelist< -1], method = "HGT")
head(hgtRes1@result)
## ID
## REACTOME_2467813 REACTOME_2467813
## REACTOME_72163 REACTOME_72163
## GO:0006364 GO:0006364
## GO:1901990 GO:1901990
## GO:0031145 GO:0031145
## REACTOME_68949 REACTOME_68949
## Description
## REACTOME_2467813 Separation of Sister Chromatids
## REACTOME_72163 mRNA Splicing - Major Pathway
## GO:0006364 rRNA processing
## GO:1901990 regulation of mitotic cell cycle phase transition
## GO:0031145 anaphase-promoting complex-dependent catabolic process
## REACTOME_68949 Orc1 removal from chromatin
## NES pvalue p.adjust GeneRatio BgRatio
## REACTOME_2467813 -10.350541 1.109937e-14 2.239853e-11 24/292 191/16544
## REACTOME_72163 -10.123151 3.683095e-14 3.716243e-11 23/292 183/16544
## GO:0006364 -8.586255 1.556246e-12 1.046835e-09 19/292 143/16544
## GO:1901990 -7.571253 1.344431e-11 6.782654e-09 14/292 80/16544
## GO:0031145 -7.591227 2.346509e-11 9.470511e-09 14/292 83/16544
## REACTOME_68949 -7.619360 3.195559e-11 1.074773e-08 13/292 71/16544
## geneID
## REACTOME_2467813 5518/5686/5688/5689/5691/5692/5695/5702/5708/5347/29945/5885/701/8243/8697/9184/1778/10403/11243/147841/25936/54820/81930/9212
## REACTOME_72163 9775/1479/29894/23517/25804/5433/5435/10262/10523/10772/1665/23020/27339/3192/51340/56949/6426/6632/8175/1660/23398/54883/7536
## GO:0006364 2091/23160/51367/55127/92856/6187/6209/9775/23517/5394/54512/56915/10969/23481/54555/81887/9136/54853/55661
## GO:1901990 2068/5686/5688/5689/5691/5692/5695/5702/5708/5347/29945/701/8697/9184
## GO:0031145 5686/5688/5689/5691/5692/5695/5702/5708/5347/29945/701/8697/9184/9212
## REACTOME_68949 4171/4175/5686/5688/5689/5691/5692/5695/5702/5708/9978/23594/890
## geneName
## REACTOME_2467813 PPP2R1A/PSMA5/PSMA7/PSMB1/PSMB3/PSMB4/PSMB7/PSMC3/PSMD2/PLK1/ANAPC4/RAD21/BUB1B/SMC1A/CDC23/BUB3/DYNC1H1/NDC80/PMF1/SPC24/NSL1/NDE/KIF18A/AURKB
## REACTOME_72163 EIF4A3/CSTF3/CPSF1/SKIV2L2/LSM4/POLR2D/POLR2F/SF3B4/CHERP/SRSF10/DHX15/SNRNP200/PRPF19/HNRNPU/CRNKL1/XAB2/SRSF1/SNRPD/SF3A2/DHX9/PPWD1/CWC25/SF1
## GO:0006364 FBL/WDR43/POP5/HEATR1/IMP4/RPS2/RPS15/EIF4A3/SKIV2L2/EXOSC10/EXOSC4/EXOSC5/EBNA1BP2/PES1/DDX49/LAS1L/RRP9/WDR55/DDX27
## GO:1901990 ERCC2/PSMA5/PSMA7/PSMB1/PSMB3/PSMB4/PSMB7/PSMC3/PSMD2/PLK1/ANAPC4/BUB1B/CDC23/BUB3
## GO:0031145 PSMA5/PSMA7/PSMB1/PSMB3/PSMB4/PSMB7/PSMC3/PSMD2/PLK1/ANAPC4/BUB1B/CDC23/BUB3/AURKB
## REACTOME_68949 MCM2/MCM6/PSMA5/PSMA7/PSMB1/PSMB3/PSMB4/PSMB7/PSMC3/PSMD2/RBX1/ORC6/CCNA2
## Count
## REACTOME_2467813 24
## REACTOME_72163 23
## GO:0006364 19
## GO:1901990 14
## GO:0031145 14
## REACTOME_68949 13
# hgtRes2 = enrich.HGT(genelist[genelist< -1])
# head(hgtRes2@result)
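To make the GeneRatio and BgRatio columns above concrete, the following sketch recomputes a hypergeometric tail probability in base R from the counts in the first row (24/292 and 191/16544). It is an illustration of the general test, not the package's internal code; which tail convention (`P(X >= k)` or `P(X > k)`) a given implementation uses is an assumption here, so both are shown.

```r
# Counts read from the first row of the output above (REACTOME_2467813).
k = 24      # selected genes that belong to the gene set (GeneRatio numerator)
n = 292     # selected genes (GeneRatio denominator)
K = 191     # genes in the gene set (BgRatio numerator)
N = 16544   # background genes (BgRatio denominator)

# Overlap expected if the 292 genes were drawn at random from the background.
expected = n * K / N

# Upper-tail probabilities of the hypergeometric distribution.
# phyper(q, m, n, k): m = genes in the set, n = genes outside it, k = draws.
p_at_least_k  = phyper(k - 1, K, N - K, n, lower.tail = FALSE)  # P(X >= 24)
p_more_than_k = phyper(k,     K, N - K, n, lower.tail = FALSE)  # P(X > 24)

c(expected = expected,
  p_at_least_k = p_at_least_k,
  p_more_than_k = p_more_than_k)
```

An observed overlap of 24 against an expectation of roughly 3 genes is what drives the very small p-values reported for this pathway.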
# Alternative functions EnrichAnalyzer and enrich.ORT.
ortRes1 = EnrichAnalyzer(genelist[genelist< -1], method = "ORT")
head(ortRes1@result)
## ID
## REACTOME_2467813 REACTOME_2467813
## REACTOME_72163 REACTOME_72163
## GO:0006364 GO:0006364
## GO:1901990 GO:1901990
## GO:0031145 GO:0031145
## REACTOME_68949 REACTOME_68949
## Description
## REACTOME_2467813 Separation of Sister Chromatids
## REACTOME_72163 mRNA Splicing - Major Pathway
## GO:0006364 rRNA processing
## GO:1901990 regulation of mitotic cell cycle phase transition
## GO:0031145 anaphase-promoting complex-dependent catabolic process
## REACTOME_68949 Orc1 removal from chromatin
## NES pvalue p.adjust GeneRatio BgRatio
## REACTOME_2467813 -10.350541 4.600475e-14 9.283758e-11 24/292 191/16544
## REACTOME_72163 -10.123151 1.575004e-13 1.589179e-10 23/292 183/16544
## GO:0006364 -8.586255 7.998992e-12 5.380656e-09 19/292 143/16544
## GO:1901990 -7.571253 1.107947e-10 5.589591e-08 14/292 80/16544
## GO:0031145 -7.591227 1.853006e-10 7.478733e-08 14/292 83/16544
## REACTOME_68949 -7.619360 2.875071e-10 9.669822e-08 13/292 71/16544
## geneID
## REACTOME_2467813 29945/5708/54820/5689/5695/5518/701/25936/5688/8697/81930/5702/5692/5347/10403/5691/147841/9212/11243/5885/1778/8243/5686/9184
## REACTOME_72163 6426/1479/56949/5435/6632/27339/23398/7536/54883/29894/5433/51340/10523/8175/10262/9775/1660/1665/3192/25804/10772/23020/23517
## GO:0006364 23481/10969/2091/54555/23160/9136/51367/6187/55661/55127/6209/56915/54512/9775/54853/81887/5394/92856/23517
## GO:1901990 2068/29945/5708/5689/5695/701/5688/8697/5702/5692/5347/5691/5686/9184
## GO:0031145 29945/5708/5689/5695/701/5688/8697/5702/5692/5347/5691/9212/5686/9184
## REACTOME_68949 5708/5689/5695/5688/4175/5702/5692/5691/4171/890/9978/5686/23594
## geneName
## REACTOME_2467813 ANAPC4/PSMD2/NDE1/PSMB1/PSMB7/PPP2R1A/BUB1B/NSL1/PSMA7/CDC23/KIF18A/PSMC3/PSMB4/PLK1/NDC80/PSMB3/SPC24/AURKB/PMF1/RAD21/DYNC1H1/SMC1A/PSMA5/BUB3
## REACTOME_72163 SRSF1/CSTF3/XAB2/POLR2F/SNRPD1/PRPF19/PPWD1/SF1/CWC25/CPSF1/POLR2D/CRNKL1/CHERP/SF3A2/SF3B4/EIF4A3/DHX9/DHX15/HNRNPU/LSM4/SRSF10/SNRNP200/MTREX
## GO:0006364 PES1/EBNA1BP2/FBL/DDX49/WDR43/RRP9/POP5/RPS2/DDX27/HEATR1/RPS15/EXOSC5/EXOSC4/EIF4A3/WDR55/LAS1L/EXOSC10/IMP4/MTREX
## GO:1901990 ERCC2/ANAPC4/PSMD2/PSMB1/PSMB7/BUB1B/PSMA7/CDC23/PSMC3/PSMB4/PLK1/PSMB3/PSMA5/BUB3
## GO:0031145 ANAPC4/PSMD2/PSMB1/PSMB7/BUB1B/PSMA7/CDC23/PSMC3/PSMB4/PLK1/PSMB3/AURKB/PSMA5/BUB3
## REACTOME_68949 PSMD2/PSMB1/PSMB7/PSMA7/MCM6/PSMC3/PSMB4/PSMB3/MCM2/CCNA2/RBX1/PSMA5/ORC6
## Count
## REACTOME_2467813 24
## REACTOME_72163 23
## GO:0006364 19
## GO:1901990 14
## GO:0031145 14
## REACTOME_68949 13
# ortRes2 = enrich.ORT(genelist[genelist< -1])
# head(ortRes2@result)
# Alternative functions EnrichAnalyzer and enrich.GSE.
gseRes1 = EnrichAnalyzer(genelist, method = "GSEA")
## Warning in preparePathwaysAndStats(pathways, stats, minSize, maxSize, gseaParam, : There are ties in the preranked stats (2.63% of the list).
## The order of those tied genes will be arbitrary, which may produce unexpected results.
head(gseRes1@result)
## ID Description
## REACTOME_2467813 REACTOME_2467813 Separation of Sister Chromatids
## REACTOME_72163 REACTOME_72163 mRNA Splicing - Major Pathway
## REACTOME_2500257 REACTOME_2500257 Resolution of Sister Chromatid Cohesion
## KEGG_hsa03013 KEGG_hsa03013 RNA transport
## GO:0006364 GO:0006364 rRNA processing
## KEGG_hsa03040 KEGG_hsa03040 Spliceosome
## NES pvalue p.adjust
## REACTOME_2467813 -2.346387 1.642690e-18 1.474808e-14
## REACTOME_72163 -2.292184 7.132318e-16 3.201697e-12
## REACTOME_2500257 -2.243006 6.751610e-13 2.020532e-09
## KEGG_hsa03013 -2.230305 1.554368e-12 3.488779e-09
## GO:0006364 -2.260766 2.482218e-12 4.457071e-09
## KEGG_hsa03040 -2.216309 4.950149e-12 7.407073e-09
## geneID
## REACTOME_2467813 5719/23244/9735/11130/54821/55746/5709/79019/6396/23047/10197/1062/1063/57405/3796/113130/10274/5690/11004/9861/5905/701/5518/5702/8697/54820/9184/5686/81930/5885/29945/1778/5692/5695/5708/147841/25936/5688/8243/9212/10403/5689/5347/11243/5691
## REACTOME_72163 8449/6432/1655/9410/9092/25949/22827/55749/79869/1994/6628/6434/10450/23524/5432/5356/9939/9129/9785/51690/1477/10465/24148/26121/10250/8175/23398/51340/54883/5435/10262/10523/6426/9775/29894/1665/23020/1660/23517/25804/27339/7536/10772/1479/3192/5433/6632/56949
## REACTOME_2500257 23244/9735/11130/54821/55746/79019/6396/23047/1062/1063/57405/3796/113130/10274/11004/5905/701/5518/54820/9184/81930/5885/1778/147841/25936/8243/9212/10403/5347/11243
## KEGG_hsa03013 9972/55746/11097/6396/2733/9669/9939/25929/1975/11218/5976/10799/8891/5905/10250/9984/79897/9775/8892/51367/80145/8890/3837/60528/79833/8661/1983/1964/4927
## GO:0006364 79863/23246/317781/51202/79039/705/28987/55759/27042/29960/88745/22984/51118/10171/79707/10521/65083/54555/56915/6209/23160/2091/55661/9775/10969/5394/6187/51367/23481/54853/23517/92856/54512/9136/55127/81887
## KEGG_hsa03040 8449/6432/1655/9410/9092/25949/22827/153527/8559/6628/6434/10450/5356/9939/9129/9785/51690/10465/24148/26121/8175/51340/10262/10523/6426/9984/9775/1665/23020/1659/25804/27339/10772/3192/6632/56949
## geneName
## REACTOME_2467813 PSMD13/PDS5A/KNTC1/ZWINT/ERCC6L/NUP133/PSMD3/CENPM/SEC13/PDS5B/PSME3/CENPE/CENPF/SPC25/KIF2A/CDCA5/STAG1/PSMB2/KIF2C/PSMD6/RANGAP1/BUB1B/PPP2R1A/PSMC3/CDC23/NDE1/BUB3/PSMA5/KIF18A/RAD21/ANAPC4/DYNC1H1/PSMB4/PSMB7/PSMD2/SPC24/NSL1/PSMA7/SMC1A/AURKB/NDC80/PSMB1/PLK1/PMF1/PSMB3
## REACTOME_72163 DHX16/SRSF7/DDX5/SNRNP40/SART1/SYF2/PUF60/CCAR1/CPSF7/ELAVL1/SNRPB/TRA2B/PPIE/SRRM2/POLR2C/PLRG1/RBM8A/PRPF3/DHX38/LSM7/CSTF1/PPIH/PRPF6/PRPF31/SRRM1/SF3A2/PPWD1/CRNKL1/CWC25/POLR2F/SF3B4/CHERP/SRSF1/EIF4A3/CPSF1/DHX15/SNRNP200/DHX9/MTREX/LSM4/PRPF19/SF1/SRSF10/CSTF3/HNRNPU/POLR2D/SNRPD1/XAB2
## REACTOME_2500257 PDS5A/KNTC1/ZWINT/ERCC6L/NUP133/CENPM/SEC13/PDS5B/CENPE/CENPF/SPC25/KIF2A/CDCA5/STAG1/KIF2C/RANGAP1/BUB1B/PPP2R1A/NDE1/BUB3/KIF18A/RAD21/DYNC1H1/SPC24/NSL1/SMC1A/AURKB/NDC80/PLK1/PMF1
## KEGG_hsa03013 NUP153/NUP133/NUP42/SEC13/GLE1/EIF5B/RBM8A/GEMIN5/EIF4B/DDX20/UPF1/RPP40/EIF2B3/RANGAP1/SRRM1/THOC1/RPP21/EIF4A3/EIF2B2/POP5/THOC7/EIF2B4/KPNB1/ELAC2/GEMIN6/EIF3A/EIF5/EIF1AX/NUP88
## GO:0006364 RBFA/BOP1/DDX51/DDX47/DDX54/BYSL/NOB1/WDR12/UTP25/MRM2/RRP36/PDCD11/UTP11/RCL1/NOL9/DDX17/NOL6/DDX49/EXOSC5/RPS15/WDR43/FBL/DDX27/EIF4A3/EBNA1BP2/EXOSC10/RPS2/POP5/PES1/WDR55/MTREX/IMP4/EXOSC4/RRP9/HEATR1/LAS1L
## KEGG_hsa03040 DHX16/SRSF7/DDX5/SNRNP40/SART1/SYF2/PUF60/ZMAT2/PRPF18/SNRPB/TRA2B/PPIE/PLRG1/RBM8A/PRPF3/DHX38/LSM7/PPIH/PRPF6/PRPF31/SF3A2/CRNKL1/SF3B4/CHERP/SRSF1/THOC1/EIF4A3/DHX15/SNRNP200/DHX8/LSM4/PRPF19/SRSF10/HNRNPU/SNRPD1/XAB2
## Count
## REACTOME_2467813 45
## REACTOME_72163 48
## REACTOME_2500257 30
## KEGG_hsa03013 29
## GO:0006364 36
## KEGG_hsa03040 36
# gseRes2 = enrich.GSE(genelist)
# head(gseRes2@result)
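The warning printed before the GSEA result reports ties in the preranked statistics. A quick way to see how many scores are affected is to count non-unique values, sketched below on a toy vector (the gene labels and scores are hypothetical). The exact percentage reported by the GSEA backend may be computed differently, so treat this as a diagnostic rather than a reproduction of that number.

```r
# Toy ranked scores with deliberate ties; values are hypothetical.
scores = c(g1 = -2.1, g2 = -0.5, g3 = -0.5, g4 = 0.0,
           g5 = 0.0, g6 = 0.0, g7 = 1.3)

# A score counts as tied if the same value occurs more than once.
is_tied = scores %in% scores[duplicated(scores)]
tied_fraction = mean(is_tied)
tied_fraction

# Inspect which score values are responsible for the ties.
table(scores[is_tied])
```

If most ties sit at a single value (for example, exactly zero), the arbitrary ordering mentioned in the warning mainly affects genes in the middle of the ranking.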
require(ggplot2)
df = hgtRes1@result
df$logFDR = -log10(df$p.adjust)
p = BarView(df[1:5,], "Description", 'logFDR')
p = p + labs(x = NULL) + coord_flip()
p
# Or use function barplot from enrichplot package
barplot(hgtRes1, showCategory = 5)
## top: up-regulated pathways;
## bottom: down-regulated pathways
EnrichedView(hgtRes1, top = 0, bottom = 5, mode = 1)
EnrichedView(hgtRes1, top = 0, bottom = 5, mode = 2)
dotplot(hgtRes1, showCategory = 5)
hgtRes1@result$geneID = hgtRes1@result$geneName
cnetplot(hgtRes1, 2)
heatplot(hgtRes1, showCategory = 3, foldChange=genelist)
emapplot(hgtRes1, layout="kk")
# gseaplot
gseaplot(gseRes1, geneSetID = 1, title = gseRes1$Description[1])
gseaplot(gseRes1, geneSetID = 1, by = "runningScore", title = gseRes1$Description[1])
gseaplot(gseRes1, geneSetID = 1, by = "preranked", title = gseRes1$Description[1])
# Or use gseaplot2 to show several gene sets in one figure
gseaplot2(gseRes1, geneSetID = 1:3)
For enrichment analysis, MAGeCKFlute incorporates publicly available gene sets, including pathways (PID, KEGG, REACTOME, BIOCARTA, C2CP), GO terms (GOBP, GOCC, GOMF), protein complexes (CORUM), and molecular signatures from MSigDB (c1, c2, c3, c4, c5, c6, c7, HALLMARK).
Analysis of high-throughput data increasingly relies on pathway annotation and functional information derived from Gene Ontology, which is also useful in the analysis of CRISPR screens.
## KEGG and REACTOME pathways
enrich = EnrichAnalyzer(geneList = genelist[genelist< -1], type = "KEGG+REACTOME")
EnrichedView(enrich, bottom = 5)
## Only KEGG pathways
enrich = EnrichAnalyzer(geneList = genelist[genelist< -1], type = "KEGG")
EnrichedView(enrich, bottom = 5)
## Gene ontology
enrichGo = EnrichAnalyzer(genelist[genelist< -1], type = "GOBP+GOMF")
EnrichedView(enrichGo, bottom = 5)
Functional annotations from pathways and GO are powerful in the context of network dynamics. However, this approach has limitations, particularly for the analysis of CRISPR screens, in which the members of a protein complex, rather than complete pathways, might be under strong selection. We therefore incorporate protein complex annotations from the CORUM database, which enables identification of essential protein complexes from CRISPR screens.
enrichPro = EnrichAnalyzer(genelist[genelist< -1], type = "CORUM")
EnrichedView(enrichPro, bottom = 5)
enrichComb = EnrichAnalyzer(genelist[genelist< -1], type = "GOBP+KEGG")
EnrichedView(enrichComb, bottom = 5)
enrich = EnrichAnalyzer(genelist[genelist< -1], type = "GOBP", limit = c(50, 500))
EnrichedView(enrich, bottom = 5)
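The `limit = c(50, 500)` argument in the call above restricts the analysis by gene set size. The base R sketch below shows the idea on made-up gene sets; the set names and gene IDs are placeholders, and treating both bounds as inclusive is an assumption rather than a statement about the package's behavior.

```r
# Hypothetical gene sets of different sizes (gene IDs are placeholders).
gene_sets = list(
  small_set  = paste0("g", 1:10),
  medium_set = paste0("g", 1:120),
  large_set  = paste0("g", 1:900)
)

limit = c(50, 500)              # lower and upper bound on gene set size
set_sizes = lengths(gene_sets)

# Keep sets whose size falls inside the bounds (inclusive here, by assumption).
kept = gene_sets[set_sizes >= limit[1] & set_sizes <= limit[2]]
names(kept)
```

Excluding very small and very large sets in this way reduces the number of tests and removes terms that are too specific or too broad to interpret.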
EnrichedFilter
enrich1 = EnrichAnalyzer(genelist[genelist< -1], type = "GOMF+GOBP")
enrich2 = EnrichAnalyzer(genelist[genelist< -1], type = "GOMF+GOBP", filter = TRUE)
enrich3 = EnrichedFilter(enrich1)
EnrichedView(enrich1, bottom = 15)
EnrichedView(enrich2, bottom = 15)
EnrichedView(enrich3, bottom = 15)
sessionInfo()
## R version 4.0.3 (2020-10-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.5 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.12-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.12-bioc/R/lib/libRlapack.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] ggplot2_3.3.2 MAGeCKFlute_1.10.0 BiocStyle_2.18.0
##
## loaded via a namespace (and not attached):
## [1] nlme_3.1-150 matrixStats_0.57.0 enrichplot_1.10.0
## [4] bit64_4.0.5 httr_1.4.2 RColorBrewer_1.1-2
## [7] tools_4.0.3 R6_2.4.1 mgcv_1.8-33
## [10] DBI_1.1.0 BiocGenerics_0.36.0 colorspace_1.4-1
## [13] withr_2.3.0 tidyselect_1.1.0 gridExtra_2.3
## [16] bit_4.0.4 compiler_4.0.3 Biobase_2.50.0
## [19] scatterpie_0.1.5 labeling_0.4.2 bookdown_0.21
## [22] shadowtext_0.0.7 scales_1.1.1 genefilter_1.72.0
## [25] stringr_1.4.0 digest_0.6.27 rmarkdown_2.5
## [28] DOSE_3.16.0 pkgconfig_2.0.3 htmltools_0.5.0
## [31] limma_3.46.0 rlang_0.4.8 RSQLite_2.2.1
## [34] farver_2.0.3 generics_0.0.2 BiocParallel_1.24.0
## [37] GOSemSim_2.16.0 dplyr_1.0.2 magrittr_1.5
## [40] GO.db_3.12.0 Matrix_1.2-18 Rcpp_1.0.5
## [43] munsell_0.5.0 S4Vectors_0.28.0 viridis_0.5.1
## [46] lifecycle_0.2.0 edgeR_3.32.0 stringi_1.5.3
## [49] yaml_2.2.1 ggraph_2.0.3 MASS_7.3-53
## [52] plyr_1.8.6 qvalue_2.22.0 grid_4.0.3
## [55] blob_1.2.1 parallel_4.0.3 ggrepel_0.8.2
## [58] DO.db_2.9 crayon_1.3.4 lattice_0.20-41
## [61] msigdbr_7.2.1 graphlayouts_0.7.1 cowplot_1.1.0
## [64] splines_4.0.3 annotate_1.68.0 locfit_1.5-9.4
## [67] magick_2.5.0 knitr_1.30 pillar_1.4.6
## [70] fgsea_1.16.0 igraph_1.2.6 reshape2_1.4.4
## [73] codetools_0.2-16 stats4_4.0.3 fastmatch_1.1-0
## [76] XML_3.99-0.5 glue_1.4.2 evaluate_0.14
## [79] downloader_0.4 data.table_1.13.2 BiocManager_1.30.10
## [82] vctrs_0.3.4 tweenr_1.0.1 gtable_0.3.0
## [85] purrr_0.3.4 polyclip_1.10-0 tidyr_1.1.2
## [88] xfun_0.18 ggforce_0.3.2 xtable_1.8-4
## [91] tidygraph_1.2.0 survival_3.2-7 viridisLite_0.3.0
## [94] tibble_3.0.4 pheatmap_1.0.12 clusterProfiler_3.18.0
## [97] rvcheck_0.1.8 AnnotationDbi_1.52.0 memoise_1.1.0
## [100] IRanges_2.24.0 sva_3.38.0 ellipsis_0.3.1
Wei Li, Han Xu, Johannes Köster, and X. Shirley Liu. 2015. “Quality control, modeling, and visualization of CRISPR screens with MAGeCK-VISPR.” https://genomebiology.biomedcentral.com/articles/10.1186/s13059-015-0843-6.
Wei Li, Tengfei Xiao, Han Xu, and X Shirley Liu. 2014. “MAGeCK enables robust identification of essential genes from genome-scale CRISPR/Cas9 knockout screens.” https://genomebiology.biomedcentral.com/articles/10.1186/s13059-014-0554-4.
Yu, Guangchuang. 2018. Enrichplot: Visualization of Functional Enrichment Result. https://github.com/GuangchuangYu/enrichplot.
Yu, Guangchuang, Li-Gen Wang, Yanyan Han, and Qing-Yu He. 2012. “ClusterProfiler: An R Package for Comparing Biological Themes Among Gene Clusters.” OMICS: A Journal of Integrative Biology 16 (5):284–87. https://doi.org/10.1089/omi.2011.0118.