normalize {SigsPack} | R Documentation
Normalizes the catalogues to a target distribution (e.g. to match the distribution of the reference signatures).
normalize(mut_cat, source_context, target_context = get(utils::data("hg19context_freq", package = "SigsPack")))
mut_cat
Mutational catalogues to be normalized, given as a 96 by n matrix, where n is the number of catalogues. The tri-nucleotide contexts are expected to be in the default lexicographical order (see the simulated data or cosmicSigs).
source_context
Distribution of tri-nucleotides in the source region.
target_context
Distribution of tri-nucleotides in the target region. Defaults to the context frequencies of BSgenome.Hsapiens.UCSC.hg19, since that genome corresponds to the COSMIC signatures.
Mutational catalogues (96 by n, where n is the number of catalogues) normalized to match the target distribution (context).
The output of get_context_freq() can be passed to this function as source_context or target_context.
# this is a toy example:
# create mutational catalogue:
sim_data <- create_mut_catalogues(1, 500)[['catalogues']]

# get trinucleotide frequencies for the genome:
genome_context <- get_context_freq(BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19)

# get trinucleotide frequencies for a specific region:
gr <- GenomicRanges::GRanges(seqnames = c("chr1"),
                             ranges = IRanges::IRanges(start = c(100000), end = c(1000000)),
                             strand = c("+"))
region_context <- get_context_freq(BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19, gr)

# normalize data:
normalized_mut_cat <- normalize(sim_data, region_context, genome_context)

## Not run: 
# get the tri-nucleotide distribution of an exome region
exome_contexts <- get_context_freq(BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19, 'example_exome.bed')

# normalize the mutational catalogue to match the COSMIC signatures
normalized_mut_cat <- normalize(mut_cat, exome_contexts, hg19context_freq)

## End(Not run)
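The general idea of normalizing a catalogue from one context distribution to another can be illustrated in base R. The following is a minimal sketch only and is not the SigsPack implementation: `reweight_catalogue()` is a hypothetical helper, the four contexts and their frequencies are made up, and the sketch assumes that each context vector has one entry per catalogue row and that column totals should be preserved. How normalize() itself matches contexts to the 96 rows and scales its output may differ.

```r
# Hypothetical helper (not part of SigsPack): reweight counts observed under
# `source_freq` so they resemble counts drawn under `target_freq`.
reweight_catalogue <- function(counts, source_freq, target_freq) {
  stopifnot(length(source_freq) == nrow(counts),
            length(target_freq) == nrow(counts),
            all(source_freq > 0))
  # Convert both context vectors to proportions; only relative abundance matters.
  source_prop <- source_freq / sum(source_freq)
  target_prop <- target_freq / sum(target_freq)
  # Scale each row by how much more (or less) common its context is in the target.
  # A vector of length nrow(counts) is recycled down the columns, i.e. per row.
  weighted <- counts * (target_prop / source_prop)
  # Assumption of this sketch: rescale each column back to its original total.
  sweep(weighted, 2, colSums(counts) / colSums(weighted), "*")
}

# Toy data: 4 made-up contexts (rows) and 2 catalogues (columns).
counts <- matrix(c(10, 20, 30, 40,
                    5,  5,  5,  5),
                 nrow = 4,
                 dimnames = list(c("ctx1", "ctx2", "ctx3", "ctx4"),
                                 c("s1", "s2")))
source_freq <- c(1, 1, 2, 4)  # contexts 3 and 4 are over-represented in the source
target_freq <- c(1, 1, 1, 1)  # uniform target

reweighted <- reweight_catalogue(counts, source_freq, target_freq)
```

In this toy case the flat catalogue `s2` ends up with fewer counts in the contexts that are over-represented in the source region, which is the effect a context normalization is meant to have before comparing catalogues against reference signatures.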