calculate_motif_enrichment {transite} | R Documentation |
Calculates binding site enrichment or depletion scores between predefined foreground and background sequence sets. Significance levels of the enrichment values are obtained by Monte Carlo tests.
calculate_motif_enrichment(
  foreground_scores_df,
  background_scores_df,
  background_total_sites,
  background_absolute_hits,
  n_transcripts_foreground,
  max_fg_permutations = 1e+06,
  min_fg_permutations = 1000,
  e = 5,
  p_adjust_method = "BH"
)
foreground_scores_df |
result of |
background_scores_df |
result of |
background_total_sites |
number of potential binding sites per sequence (returned by |
background_absolute_hits |
number of putative binding sites per sequence (returned by |
n_transcripts_foreground |
number of sequences in the foreground set |
max_fg_permutations |
maximum number of foreground permutations performed in Monte Carlo test for enrichment score |
min_fg_permutations |
minimum number of foreground permutations performed in Monte Carlo test for enrichment score |
e |
integer-valued stop criterion for enrichment score Monte Carlo test: aborting permutation process after observing |
p_adjust_method |
adjustment of p-values from Monte Carlo tests to avoid alpha error accumulation, see |
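The interplay of min_fg_permutations, max_fg_permutations, and e can be illustrated with a small base R sketch of a Monte Carlo test with early stopping. This is not transite's internal implementation. It assumes that enrichment is the ratio of the foreground hit rate to the background hit rate, and that the permutation loop stops once at least e random scores are as extreme as the observed one (after the minimum number of permutations has been reached). The function names enrichment_score and mc_enrichment_test, as well as the toy data, are hypothetical.

```r
# Illustrative sketch only; not the transite implementation.
# Assumption: enrichment = (foreground hits / foreground sites) /
#                          (background hits / background sites).
enrichment_score <- function(hits, sites, idx) {
  fg_rate <- sum(hits[idx]) / sum(sites[idx])
  bg_rate <- sum(hits) / sum(sites)
  fg_rate / bg_rate
}

# Monte Carlo test with an early-stopping rule (assumed semantics of `e`).
# Requires at least one hit in the background, otherwise the score is undefined.
mc_enrichment_test <- function(hits, sites, fg_idx,
                               min_perm = 1000, max_perm = 1e6, e = 5) {
  observed <- enrichment_score(hits, sites, fg_idx)
  n_fg <- length(fg_idx)
  n_extreme <- 0
  n_perm <- 0
  while (n_perm < max_perm) {
    # Draw a random foreground of the same size from the background set.
    random_idx <- sample.int(length(hits), n_fg)
    random_score <- enrichment_score(hits, sites, random_idx)
    n_perm <- n_perm + 1
    # Two-sided on the log scale: at least as far from 1 as the observed score.
    if (abs(log(random_score)) >= abs(log(observed))) {
      n_extreme <- n_extreme + 1
    }
    # Stop early once enough extreme values have been seen.
    if (n_perm >= min_perm && n_extreme >= e) break
  }
  list(
    enrichment = observed,
    p_value = (n_extreme + 1) / (n_perm + 1),
    p_value_n = n_perm
  )
}

# Toy data (made up): 50 background sequences with 20 potential sites each;
# the first five sequences carry most of the hits.
set.seed(1)
sites <- rep(20, 50)
hits <- c(rep(6, 5), rep(1, 45))

enriched <- mc_enrichment_test(hits, sites, fg_idx = 1:5, max_perm = 2000)
unremarkable <- mc_enrichment_test(hits, sites, fg_idx = 6:10, max_perm = 2000)
```

In this sketch, a clearly enriched foreground rarely produces extreme random scores, so the loop tends to run until max_perm, whereas an unremarkable foreground meets the stop criterion as soon as min_perm is reached. This is why p_value_n can differ between motifs.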
A data frame with the following columns:
motif_id | the motif identifier that is used in the original motif library |
motif_rbps | the gene symbol of the RNA-binding protein(s) |
enrichment | binding site enrichment between foreground and background sequences |
p_value | unadjusted p-value from Monte Carlo test |
p_value_n | number of Monte Carlo test permutations |
adj_p_value | adjusted p-value from Monte Carlo test (usually FDR) |
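As a sketch of how the p_value and adj_p_value columns relate, the following base R snippet applies p.adjust with the "BH" method to a small, made-up table that mimics the documented columns, then keeps the rows below a chosen threshold. The motif identifiers, gene symbols, and numbers are placeholders, not real output, and the 0.1 cutoff is an arbitrary choice for illustration.

```r
# Hypothetical table with the documented columns; all values are made up.
enrichments_df <- data.frame(
  motif_id   = c("m1", "m2", "m3", "m4", "m5"),
  motif_rbps = c("RBP1", "RBP2", "RBP3", "RBP4", "RBP5"),
  enrichment = c(2.1, 0.4, 1.3, 1.1, 0.9),
  p_value    = c(0.001, 0.01, 0.02, 0.04, 0.5),
  p_value_n  = rep(1000, 5)
)

# Benjamini-Hochberg adjustment, corresponding to p_adjust_method = "BH".
enrichments_df$adj_p_value <- p.adjust(enrichments_df$p_value, method = "BH")

# Keep motifs below the (arbitrary) threshold, most significant first.
significant <- enrichments_df[enrichments_df$adj_p_value < 0.1, ]
significant <- significant[order(significant$adj_p_value), ]
```

Any method accepted by p.adjust (for example "bonferroni" or "holm") could be substituted in the same way.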
Other matrix functions: run_matrix_spma(), run_matrix_tsma(), score_transcripts_single_motif(), score_transcripts()
# foreground set (a subset of the background set below)
foreground_seqs <- c(
  "CAGUCAAGACUCC", "AAUUGGUGUCUGGAUACUUCCCUGUACAU", "AGAU", "CCAGUAA"
)
# background set: the foreground sequences plus additional sequences
background_seqs <- c(
  foreground_seqs,
  "CAACAGCCUUAAUU", "CUUUGGGGAAU", "UCAUUUUAUUAAA", "AUCAAAUUA",
  "GACACUUAAAGAUCCU", "UAGCAUUAACUUAAUG", "AUGGA", "GAAGAGUGCUCA",
  "AUAGAC", "AGUUC"
)

# score both sets
foreground_scores <- score_transcripts(foreground_seqs, cache = FALSE)
background_scores <- score_transcripts(background_seqs, cache = FALSE)

# calculate enrichment with a reduced permutation limit
enrichments_df <- calculate_motif_enrichment(
  foreground_scores$df,
  background_scores$df,
  background_scores$total_sites,
  background_scores$absolute_hits,
  length(foreground_seqs),
  max_fg_permutations = 1000
)