SimFFPE-package {SimFFPE}    R Documentation

NGS Read Simulator for FFPE Tissue

Description

This package simulates the artifact chimeric reads that arise specifically during next-generation sequencing (NGS) of formalin-fixed paraffin-embedded (FFPE) tissue.

Details

Package: SimFFPE
Type: Package
Title: NGS Read Simulator for FFPE Tissue
Version: 1.2.0
Author: Lanying Wei
Maintainer: Lanying Wei <lanying.wei@uni-muenster.de>
Description: This package simulates the artifact chimeric reads that arise specifically during next-generation sequencing (NGS) of formalin-fixed paraffin-embedded (FFPE) tissue.
License: LGPL-3
Encoding: UTF-8
Depends: Biostrings
Imports: dplyr, foreach, doParallel, truncnorm, GenomicRanges, IRanges, Rsamtools, parallel, graphics, stats, utils, methods
Suggests: BiocStyle
biocViews: Sequencing, Alignment, MultipleComparison, SequenceMatching, DataImport
git_url: https://git.bioconductor.org/packages/SimFFPE
git_branch: RELEASE_3_12
git_last_commit: 3d90ef3
git_last_commit_date: 2020-10-27
Date/Publication: 2020-10-28

NGS (next-generation sequencing) reads from FFPE (formalin-fixed paraffin-embedded) samples contain numerous artificial chimeric reads. These reads arise when two single-stranded DNA (ss-DNA) fragments that share short reverse-complementary sequences combine. The combined ss-DNA fragments may come from adjacent or distant regions. This package simulates these artifacts as well as normal reads for FFPE samples. The simulation can cover a whole genome, several chromosomes, large regions, a whole exome, or targeted regions. It also supports enzymatic or random fragmentation and paired-end or single-end sequencing. Parameters can be fine-tuned to obtain the desired simulation results, and multi-threading can reduce the runtime. See the package vignette for guidance on fine-tuning.

Index of help topics:

SimFFPE-package         NGS Read Simulator for FFPE Tissue
calcPhredScoreProfile   Estimate Phred score profile for FFPE read
                        simulation
readSimFFPE             Simulate noisy NGS reads of FFPE samples for
                        whole genome / several chromosomes / large
                        regions
targetReadSimFFPE       Simulate noisy NGS reads of FFPE samples in
                        exonic / targeted regions

There are three available functions for NGS read simulation of FFPE samples:

1. calcPhredScoreProfile: Calculate a positional Phred score profile from a BAM file for read simulation.

2. readSimFFPE: Simulate noisy NGS reads of FFPE samples for a whole genome, several chromosomes, or large regions.

3. targetReadSimFFPE: Simulate noisy NGS reads of FFPE samples in exonic / targeted regions.
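To make the idea of a positional Phred score profile more concrete, the sketch below converts Phred scores to base-calling error probabilities with the standard relation P = 10^(-Q/10) and computes an expected error rate per read position. The toy matrix and its layout (rows as read positions, columns as Phred scores, entries as proportions) are assumptions for illustration; how SimFFPE uses the profile internally is not shown here.

```r
# Made-up Phred scores and a toy positional profile (assumed layout:
# one row per read position, one column per Phred score, rows sum to 1).
scores <- c(10, 20, 30, 40)
profile <- matrix(c(0.05, 0.15, 0.30, 0.50,
                    0.10, 0.20, 0.30, 0.40),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(c("pos1", "pos2"), as.character(scores)))

# Standard Phred relation: error probability P = 10^(-Q / 10).
errorProb <- 10^(-scores / 10)

# Expected base-calling error rate at each read position under the toy profile.
expectedError <- as.vector(profile %*% errorProb)
```

A profile weighted toward low scores at a given position yields a higher expected error rate there, which is why the second toy position comes out noisier than the first.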

Author(s)

Lanying Wei

Maintainer: Lanying Wei <lanying.wei@uni-muenster.de>

See Also

calcPhredScoreProfile, readSimFFPE, targetReadSimFFPE

Examples


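library(SimFFPE)

## Load the example Phred score profile shipped with the package.
## The first line of the file holds the column names; the rest is the matrix.
PhredScoreProfilePath <- system.file("extdata", "PhredScoreProfile2.txt",
                                     package = "SimFFPE")
PhredScoreProfile <- as.matrix(read.table(PhredScoreProfilePath, skip = 1))
colnames(PhredScoreProfile) <- read.table(PhredScoreProfilePath, 
                                          nrows = 1, 
                                          colClasses = "character")

## Load the example reference sequences
referencePath <- system.file("extdata", "example.fasta", package = "SimFFPE")
reference <- readDNAStringSet(referencePath)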

## Simulate reads of the first three sequences of the reference genome

sourceSeq <- reference[1:3]
outFile1 <- file.path(tempdir(), "sim1")
readSimFFPE(sourceSeq, referencePath, PhredScoreProfile, outFile1, 
            coverage = 80, enzymeCut = TRUE, threads = 4)

## Simulate reads for targeted regions

bamFilePath <- system.file("extdata", "example.bam", package = "SimFFPE")
regionPath <- system.file("extdata", "regionsBam.txt", package = "SimFFPE")
regions <- read.table(regionPath)
PhredScoreProfile <- calcPhredScoreProfile(bamFilePath, targetRegions = regions)

regionPath <- system.file("extdata", "regionsSim.txt", package = "SimFFPE")
targetRegions <- read.table(regionPath)

outFile <- file.path(tempdir(), "sim2")
targetReadSimFFPE(referencePath, PhredScoreProfile, targetRegions, outFile,
                  coverage = 120, readLen = 100, meanInsertLen = 150, 
                  sdInsertLen = 40, enzymeCut = FALSE)
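
As a rough guide to what a coverage value implies, the sketch below applies the usual relation between coverage, region length, and read length to estimate how many reads are needed. The helper readsForCoverage and the 10,000 bp region length are hypothetical, and SimFFPE may compute its read counts differently.

```r
# Hypothetical helper (not part of SimFFPE): approximate number of reads
# needed so that coverage * regionLength bases are sequenced in total.
readsForCoverage <- function(coverage, regionLength, readLen) {
  ceiling(coverage * regionLength / readLen)
}

# Assumed example: 120x coverage over 10,000 bp of target with 100 bp reads.
nReads <- readsForCoverage(coverage = 120, regionLength = 10000, readLen = 100)

# For paired-end sequencing, each fragment contributes two reads.
nPairs <- ceiling(nReads / 2)
```

Doubling the coverage or halving the read length doubles the estimated read count, which helps when budgeting runtime and output size.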

[Package SimFFPE version 1.2.0 Index]