mut_matrix_stranded {MutationalPatterns}    R Documentation

Make mutation count matrix of 96 trinucleotides with strand information

Description

Make a mutation count matrix with 192 features: 96 trinucleotide contexts, each split over 2 strands. The strands can be either transcription strands or replication strands.

Usage

mut_matrix_stranded(vcf_list, ref_genome, ranges, mode = "transcription")

Arguments

vcf_list

List of collapsed vcf objects

ref_genome

BSgenome reference genome object

ranges

GRanges object with the genomic ranges of: 1. (transcription mode) the gene bodies, with strand (+/-) information, or 2. (replication mode) the replication strand regions, with a 'strand_info' metadata column

mode

"transcription" or "replication", default = "transcription"

Value

Mutation count matrix with 192 features (96 trinucleotide contexts x 2 strands)

See Also

read_vcfs_as_granges, mut_matrix, mut_strand

Examples

## See the 'read_vcfs_as_granges()' example for how we obtained the
## following data:
vcfs <- readRDS(system.file("states/read_vcfs_as_granges_output.rds",
                package="MutationalPatterns"))

## Load the corresponding reference genome.
ref_genome = "BSgenome.Hsapiens.UCSC.hg19"
library(ref_genome, character.only = TRUE)

## Transcription strand analysis:
## You can obtain the known genes from the UCSC hg19 dataset using
## Bioconductor:
# if (!requireNamespace("BiocManager", quietly=TRUE))
#     install.packages("BiocManager")
# BiocManager::install("TxDb.Hsapiens.UCSC.hg19.knownGene")
# library("TxDb.Hsapiens.UCSC.hg19.knownGene")

## For this example, we preloaded the data for you:
genes_hg19 <- readRDS(system.file("states/genes_hg19.rds",
                        package="MutationalPatterns"))

mut_mat_s = mut_matrix_stranded(vcfs, ref_genome, genes_hg19, 
                                mode = "transcription")

## Replication strand analysis:
## Read example bed file with replication direction annotation
repli_file = system.file("extdata/ReplicationDirectionRegions.bed", 
                          package = "MutationalPatterns")
repli_strand = read.table(repli_file, header = TRUE)
repli_strand_granges = GRanges(seqnames = repli_strand$Chr,
                               ranges = IRanges(start = repli_strand$Start + 1,
                                                end = repli_strand$Stop),
                               strand_info = repli_strand$Class)
## Use the UCSC seqlevelsStyle (chromosome names such as "chr1")
seqlevelsStyle(repli_strand_granges) = "UCSC"
# The levels determine the order in which the features
# will be counted and plotted in the downstream analyses.
# You can specify your preferred order of the levels:
repli_strand_granges$strand_info = factor(repli_strand_granges$strand_info,
                                          levels = c("left", "right"))
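
## Toy base-R illustration (made-up vector, not package data) of how
## explicit factor levels override the default alphabetical order:
toy_classes <- c("right", "left", "right")
levels(factor(toy_classes))
levels(factor(toy_classes, levels = c("right", "left")))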

mut_mat_s_rep = mut_matrix_stranded(vcfs, ref_genome, repli_strand_granges,
                                mode = "replication")
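
## Where the 192 features come from, sketched in base R only. This is not
## the package's implementation, and the "context-strand" label format
## used below is an assumption made for illustration.
substitutions <- c("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")
bases <- c("A", "C", "G", "T")
## 6 substitution types x 4 upstream bases x 4 downstream bases = 96 contexts
contexts <- expand.grid(downstream = bases, upstream = bases,
                        substitution = substitutions,
                        stringsAsFactors = FALSE)
trinucleotides <- paste0(contexts$upstream, "[", contexts$substitution, "]",
                         contexts$downstream)
## Pair every context with each of the two strand labels: 96 x 2 = 192
strand_labels <- c("transcribed", "untranscribed")
features <- as.vector(outer(trinucleotides, strand_labels, paste, sep = "-"))
length(features)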


[Package MutationalPatterns version 2.0.0 Index]