test_gene_rank {tidybulk} | R Documentation |
test_gene_rank() takes as input a 'tbl' (with at least three columns for sample, feature and transcript abundance) or a 'SummarizedExperiment' (more convenient if abstracted to a tibble with library(tidySummarizedExperiment)) and returns a 'tbl' with the GSEA statistics.
test_gene_rank(
  .data,
  .entrez,
  .arrange_desc,
  species,
  .sample = NULL,
  gene_sets = NULL,
  gene_set = NULL
)

## S4 method for signature 'spec_tbl_df'
test_gene_rank(
  .data,
  .entrez,
  .arrange_desc,
  species,
  .sample = NULL,
  gene_sets = c("h", "c1", "c2", "c3", "c4", "c5", "c6", "c7"),
  gene_set = NULL
)

## S4 method for signature 'tbl_df'
test_gene_rank(
  .data,
  .entrez,
  .arrange_desc,
  species,
  .sample = NULL,
  gene_sets = c("h", "c1", "c2", "c3", "c4", "c5", "c6", "c7"),
  gene_set = NULL
)

## S4 method for signature 'tidybulk'
test_gene_rank(
  .data,
  .entrez,
  .arrange_desc,
  species,
  .sample = NULL,
  gene_sets = c("h", "c1", "c2", "c3", "c4", "c5", "c6", "c7"),
  gene_set = NULL
)

## S4 method for signature 'SummarizedExperiment'
test_gene_rank(
  .data,
  .entrez,
  .arrange_desc,
  species,
  .sample = NULL,
  gene_sets = NULL,
  gene_set = NULL
)

## S4 method for signature 'RangedSummarizedExperiment'
test_gene_rank(
  .data,
  .entrez,
  .arrange_desc,
  species,
  .sample = NULL,
  gene_sets = NULL,
  gene_set = NULL
)
.data |
A 'tbl' (with at least three columns for sample, feature and transcript abundance) or a 'SummarizedExperiment' (more convenient if abstracted to a tibble with library(tidySummarizedExperiment)) |
.entrez |
The ENTREZ ID of the transcripts/genes |
.arrange_desc |
The name of the column by which to arrange the genes in decreasing order |
species |
A character. For example, human or mouse. MSigDB uses the Latin species names (e.g., "Mus musculus", "Homo sapiens") |
.sample |
The name of the sample column |
gene_sets |
A character vector or a list. It can take one or more of the following built-in collections as a character vector: c("h", "c1", "c2", "c3", "c4", "c5", "c6", "c7", "kegg_disease", "kegg_metabolism", "kegg_signaling"), to be used with EGSEA buildIdx. c1 is human specific. Alternatively, a list of user-supplied gene sets can be provided, to be used with EGSEA buildCustomIdx. In that case, each gene set is a character vector of Entrez IDs and the names of the list are the gene set names. |
gene_set |
DEPRECATED. Use gene_sets instead. |
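As an illustration of the list form of gene_sets described above, the following minimal sketch builds a user-supplied collection: a named list in which each element is a character vector of Entrez IDs. The set names (my_dna_repair_set, my_egfr_family_set) are hypothetical and exist only for this example; the IDs are the Entrez identifiers of TP53, BRCA1, BRCA2, EGFR and ERBB2.

```r
# A hypothetical user-supplied collection of gene sets.
# The names of the list are the gene set names; each element is a
# character vector of Entrez IDs (stored as strings, not numbers).
custom_gene_sets <- list(
  my_dna_repair_set  = c("7157", "672", "675"),  # TP53, BRCA1, BRCA2
  my_egfr_family_set = c("1956", "2064")         # EGFR, ERBB2
)

# Basic sanity checks on the structure before passing it on
all_character <- all(vapply(custom_gene_sets, is.character, logical(1)))
all_named     <- !is.null(names(custom_gene_sets)) &&
  all(nzchar(names(custom_gene_sets)))
```

Keeping the IDs as character strings avoids accidental numeric coercion (for example, loss of leading characters or scientific notation) when the sets are later matched against the .entrez column.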
This wrapper executes gene enrichment analyses of the dataset using a ranked list of transcripts and GSEA. It uses clusterProfiler (DOI: doi.org/10.1089/omi.2011.0118) on the back-end.
Underlying method:

# Get gene set signatures
msigdbr::msigdbr(species = species) %>%

  # Filter specific gene_sets if specified. This was introduced to speed up the execution of examples
  when(
    !is.null(gene_sets) ~ filter(., gs_cat %in% gene_sets),
    ~ (.)
  ) %>%

  # Execute calculation
  nest(data = -gs_cat) %>%
  mutate(fit = map(
    data,
    ~ clusterProfiler::GSEA(
      my_entrez_rank,
      TERM2GENE = .x,
      pvalueCutoff = 1
    )
  ))
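The my_entrez_rank object passed to clusterProfiler::GSEA above is a ranked gene list, which clusterProfiler expects as a named numeric vector sorted in decreasing order. How the wrapper builds it internally is not shown on this page; the following base R sketch only illustrates the expected shape, assuming a hypothetical data frame de_results with an entrez column and a logFC column (the column that .arrange_desc would point to).

```r
# Hypothetical differential-abundance results; values are made up
# purely to illustrate the shape of the ranked vector.
de_results <- data.frame(
  entrez = c("7157", "672", "1956", "2064"),
  logFC  = c(-1.2, 2.5, 0.3, -0.4),
  stringsAsFactors = FALSE
)

# Arrange genes by the ranking statistic in decreasing order
ranked <- de_results[order(de_results$logFC, decreasing = TRUE), ]

# Named numeric vector: values are the statistic, names are Entrez IDs
my_entrez_rank <- setNames(ranked$logFC, ranked$entrez)
```

A vector of this form can be supplied as the first argument of clusterProfiler::GSEA. Duplicated Entrez IDs should be resolved beforehand (for example with aggregate_duplicates), because the ranked list needs unique names.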
An object consistent with the input
A 'spec_tbl_df' object
A 'tbl_df' object
A 'tidybulk' object
A 'SummarizedExperiment' object
A 'RangedSummarizedExperiment' object
## Not run:

df_entrez = tidybulk::se_mini |>
  tidybulk() |>
  as_tibble() |>
  symbol_to_entrez(.transcript = feature, .sample = sample)

df_entrez = aggregate_duplicates(
  df_entrez,
  aggregation_function = sum,
  .sample = sample,
  .transcript = entrez,
  .abundance = count
)

df_entrez = mutate(df_entrez, do_test = feature %in% c("TNFRSF4", "PLCH2", "PADI4", "PAX7"))

df_entrez = df_entrez %>% test_differential_abundance(~ condition)

test_gene_rank(
  df_entrez,
  .sample = sample,
  .entrez = entrez,
  species = "Homo sapiens",
  gene_sets = c("C2"),
  .arrange_desc = logFC
)

## End(Not run)