process_vcf {supersigs} | R Documentation |
Transform a VCF object into a data frame of trinucleotide mutations with flanking bases in a wide matrix format. The function assumes that the VCF object contains only one sample and that each row in rowRanges represents an observed mutation in the sample.
process_vcf(vcf)
vcf |
a VCF object (from |
process_vcf returns a data frame of mutations, one row per mutation.
# Use example vcf from VariantAnnotation
suppressPackageStartupMessages({library(VariantAnnotation)})
fl <- system.file("extdata", "chr22.vcf.gz", package="VariantAnnotation")
vcf <- VariantAnnotation::readVcf(fl, "hg19")

# Subset to first sample
vcf <- vcf[, 1]

# Subset to row positions with homozygous or heterozygous alt
positions <- geno(vcf)$GT != "0|0"
vcf <- vcf[positions[, 1],]
colData(vcf)$age <- 50 # Add patient age to colData (optional)

# Run function
dt <- process_vcf(vcf)
head(dt)
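
The description refers to trinucleotide mutations with flanking bases in a wide format. As a rough illustration of that idea only, the base-R sketch below labels each substitution with its 5' and 3' neighbours and spreads the counts into one column per mutation type. The helper `trinucleotide_label`, the label style `A[C>T]G`, and the toy input are all assumptions made for this example; they are not taken from supersigs, and the actual columns returned by process_vcf may differ.

```r
# Hypothetical helper (not part of supersigs): label a single-base substitution
# with its flanking bases. `context` is the 3-base reference sequence centred
# on the mutated position, `alt` is the alternate base.
trinucleotide_label <- function(context, alt) {
  stopifnot(nchar(context) == 3, nchar(alt) == 1)
  five_prime  <- substr(context, 1, 1)
  ref         <- substr(context, 2, 2)
  three_prime <- substr(context, 3, 3)
  paste0(five_prime, "[", ref, ">", alt, "]", three_prime)
}

# Toy input in long format: one row per observed mutation (made-up values).
mutations <- data.frame(
  context = c("ACG", "ACG", "TTA"),
  alt     = c("T", "T", "G"),
  stringsAsFactors = FALSE
)
mutations$label <- mapply(trinucleotide_label, mutations$context,
                          mutations$alt, USE.NAMES = FALSE)

# Wide format: a single row with one count column per mutation type.
counts    <- table(mutations$label)
count_vec <- setNames(as.integer(counts), names(counts))
wide      <- data.frame(t(count_vec), check.names = FALSE)
print(wide)
```

Setting `check.names = FALSE` keeps the bracketed labels intact as column names; without it, `data.frame` would rewrite them into syntactically valid R names.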