test_gene_enrichment {tidybulk} | R Documentation |
test_gene_enrichment() takes as input a 'tbl' (with at least three columns for sample, feature and transcript abundance) or a 'SummarizedExperiment' (more convenient if abstracted to a tibble with library(tidySummarizedExperiment)) and returns a 'tbl' of gene set information.
test_gene_enrichment(
  .data,
  .formula,
  .sample = NULL,
  .entrez,
  .abundance = NULL,
  .contrasts = NULL,
  methods = c("camera", "roast", "safe", "gage", "padog", "globaltest", "ora"),
  gene_sets = c("h", "c1", "c2", "c3", "c4", "c5", "c6", "c7",
    "kegg_disease", "kegg_metabolism", "kegg_signaling"),
  species,
  cores = 10,
  method = NULL
)

## S4 method for signature 'spec_tbl_df'
test_gene_enrichment(
  .data,
  .formula,
  .sample = NULL,
  .entrez,
  .abundance = NULL,
  .contrasts = NULL,
  methods = c("camera", "roast", "safe", "gage", "padog", "globaltest", "ora"),
  gene_sets = c("h", "c1", "c2", "c3", "c4", "c5", "c6", "c7",
    "kegg_disease", "kegg_metabolism", "kegg_signaling"),
  species,
  cores = 10,
  method = NULL
)

## S4 method for signature 'tbl_df'
test_gene_enrichment(
  .data,
  .formula,
  .sample = NULL,
  .entrez,
  .abundance = NULL,
  .contrasts = NULL,
  methods = c("camera", "roast", "safe", "gage", "padog", "globaltest", "ora"),
  gene_sets = c("h", "c1", "c2", "c3", "c4", "c5", "c6", "c7",
    "kegg_disease", "kegg_metabolism", "kegg_signaling"),
  species,
  cores = 10,
  method = NULL
)

## S4 method for signature 'tidybulk'
test_gene_enrichment(
  .data,
  .formula,
  .sample = NULL,
  .entrez,
  .abundance = NULL,
  .contrasts = NULL,
  methods = c("camera", "roast", "safe", "gage", "padog", "globaltest", "ora"),
  gene_sets = c("h", "c1", "c2", "c3", "c4", "c5", "c6", "c7",
    "kegg_disease", "kegg_metabolism", "kegg_signaling"),
  species,
  cores = 10,
  method = NULL
)

## S4 method for signature 'SummarizedExperiment'
test_gene_enrichment(
  .data,
  .formula,
  .sample = NULL,
  .entrez,
  .abundance = NULL,
  .contrasts = NULL,
  methods = c("camera", "roast", "safe", "gage", "padog", "globaltest", "ora"),
  gene_sets = c("h", "c1", "c2", "c3", "c4", "c5", "c6", "c7",
    "kegg_disease", "kegg_metabolism", "kegg_signaling"),
  species,
  cores = 10,
  method = NULL
)

## S4 method for signature 'RangedSummarizedExperiment'
test_gene_enrichment(
  .data,
  .formula,
  .sample = NULL,
  .entrez,
  .abundance = NULL,
  .contrasts = NULL,
  methods = c("camera", "roast", "safe", "gage", "padog", "globaltest", "ora"),
  gene_sets = c("h", "c1", "c2", "c3", "c4", "c5", "c6", "c7",
    "kegg_disease", "kegg_metabolism", "kegg_signaling"),
  species,
  cores = 10,
  method = NULL
)
.data | A 'tbl' (with at least three columns for sample, feature and transcript abundance) or a 'SummarizedExperiment' (more convenient if abstracted to a tibble with library(tidySummarizedExperiment)). |
.formula | A formula with no response variable, representing the desired linear model. |
.sample | The name of the sample column. |
.entrez | The name of the column containing the Entrez IDs of the transcripts/genes. |
.abundance | The name of the transcript/gene abundance column. |
.contrasts | The contrasts to test, passed on to EGSEA. Defaults to NULL. |
methods | A character vector. Either one method, or three or more (EGSEA currently errors if exactly two are used). Type EGSEA::egsea.base() to see the supported GSE methods. |
gene_sets | A character vector or a list. As a character vector, it can take one or more of the following built-in collections: c("h", "c1", "c2", "c3", "c4", "c5", "c6", "c7", "kegg_disease", "kegg_metabolism", "kegg_signaling"), to be used with EGSEA buildIdx. c1 is human specific. Alternatively, a list of user-supplied gene sets can be provided, to be used with EGSEA buildCustomIdx. In that case, each gene set is a character vector of Entrez IDs and the names of the list are the gene set names. |
species | A character. It can be human, mouse or rat. |
cores | An integer. The number of cores available. |
method | DEPRECATED. Please use methods. |
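The constraints on methods and gene_sets can be checked before calling the function. The sketch below uses only base R. check_methods() is a hypothetical helper, not part of tidybulk or EGSEA, and the gene set names and Entrez IDs are placeholders chosen for illustration.

```r
# Hypothetical helper (not part of tidybulk or EGSEA): enforce the documented
# rule that `methods` must contain one method, or three or more.
check_methods <- function(methods) {
  stopifnot(is.character(methods), length(methods) >= 1)
  if (length(methods) == 2) {
    stop("Provide one method, or three or more; exactly two is not supported.")
  }
  invisible(methods)
}

check_methods("camera")                     # one method: accepted
check_methods(c("camera", "roast", "ora"))  # three methods: accepted

# Exactly two methods is rejected by the helper; capture the error message
two_methods <- tryCatch(
  check_methods(c("camera", "roast")),
  error = function(e) conditionMessage(e)
)

# User-supplied gene sets: a named list in which each element is a character
# vector of Entrez IDs. Set names and IDs are illustrative placeholders.
my_gene_sets <- list(
  example_set_a = c("7157", "672", "675"),
  example_set_b = c("1956", "2064", "3845")
)

# Basic shape checks that mirror the documented format
stopifnot(
  is.list(my_gene_sets),
  !is.null(names(my_gene_sets)),
  all(vapply(my_gene_sets, is.character, logical(1)))
)
```

A list shaped like my_gene_sets would then be supplied as gene_sets = my_gene_sets in place of the built-in collection names.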
Lifecycle: maturing
This wrapper executes ensemble gene enrichment analyses of the dataset using EGSEA (DOI: 10.12688/f1000research.12544.1).
dge =
  data %>%
  keep_abundant(
    factor_of_interest = !!as.symbol(parse_formula(.formula)[[1]]),
    !!.sample, !!.entrez, !!.abundance
  ) %>%

  # Make sure transcript names are adjacent
  [...] %>%
  as_matrix(rownames = !!.entrez) %>%
  edgeR::DGEList(counts = .)

idx = buildIdx(
  entrezIDs = rownames(dge),
  species = species,
  msigdb.gsets = msigdb.gsets,
  kegg.exclude = kegg.exclude
)

dge %>%

  # Calculate weights
  limma::voom(design, plot = FALSE) %>%

  # Execute EGSEA
  egsea(
    contrasts = my_contrasts,
    baseGSEAs = methods,
    gs.annots = idx,
    sort.by = "med.rank",
    num.threads = cores,
    report = FALSE
  )
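The voom step above takes a design matrix that this excerpt does not define. As a rough illustration, and assuming the design is derived from .formula with model.matrix() (an assumption, since that step is not shown here), a response-free formula expands as follows. The sample_info data frame is invented for the example.

```r
# Invented sample annotation, for illustration only
sample_info <- data.frame(
  sample = c("s1", "s2", "s3", "s4"),
  condition = factor(c("control", "control", "treated", "treated"))
)

# A formula with no response variable, as expected by `.formula`
my_formula <- ~ condition

# Assumption: the design matrix is built from the formula with model.matrix()
design <- model.matrix(my_formula, data = sample_info)

colnames(design)  # "(Intercept)" "conditiontreated"
dim(design)       # 4 2 (four samples, two coefficients)
```

With a two-level factor, the second column encodes the treated-versus-control difference, which is the kind of coefficient a contrast would typically target.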
A consistent object (to the input)
## Not run:
df_entrez =
  tidybulk::se_mini |>
  tidybulk() |>
  as_tibble() |>
  symbol_to_entrez(.transcript = feature, .sample = sample)

df_entrez = aggregate_duplicates(
  df_entrez,
  aggregation_function = sum,
  .sample = sample,
  .transcript = entrez,
  .abundance = count
)

library("EGSEA")

test_gene_enrichment(
  df_entrez,
  ~ condition,
  .sample = sample,
  .entrez = entrez,
  .abundance = count,
  methods = c("roast", "safe", "gage", "padog", "globaltest", "ora"),
  gene_sets = c("h", "c1", "c2", "c3", "c4", "c5", "c6", "c7",
    "kegg_disease", "kegg_metabolism", "kegg_signaling"),
  species = "human",
  cores = 2
)
## End(Not run)