SimFFPE-package {SimFFPE} | R Documentation |
This package simulates the artifact chimeric reads that arise specifically in next-generation sequencing (NGS) of formalin-fixed paraffin-embedded (FFPE) tissue.
Package: | SimFFPE |
Type: | Package |
Title: | NGS Read Simulator for FFPE Tissue |
Version: | 1.2.0 |
Author: | Lanying Wei |
Maintainer: | Lanying Wei <lanying.wei@uni-muenster.de> |
Description: | This package simulates artifact chimeric reads specifically generated in next-generation sequencing (NGS) process of formalin-fixed paraffin-embedded (FFPE) tissue. |
License: | LGPL-3 |
Encoding: | UTF-8 |
Depends: | Biostrings |
Imports: | dplyr, foreach, doParallel, truncnorm, GenomicRanges, IRanges, Rsamtools, parallel, graphics, stats, utils, methods |
Suggests: | BiocStyle |
biocViews: | Sequencing, Alignment, MultipleComparison, SequenceMatching, DataImport |
git_url: | https://git.bioconductor.org/packages/SimFFPE |
git_branch: | RELEASE_3_12 |
git_last_commit: | 3d90ef3 |
git_last_commit_date: | 2020-10-27 |
Date/Publication: | 2020-10-28 |
NGS (next-generation sequencing) reads from FFPE (formalin-fixed paraffin-embedded) samples contain numerous artificial chimeric reads. These reads are derived from the combination of two single-stranded DNA (ss-DNA) fragments with short reverse complementary sequences. The combined ss-DNA fragments may come from adjacent or distant regions. This package simulates these artifacts as well as normal reads for FFPE samples. The simulation can cover the whole genome, several chromosomes, large regions, the whole exome, or targeted regions. It also supports enzymatic / random fragmentation and paired-end / single-end sequencing simulations. Simulation parameters can be fine-tuned to obtain the desired results, and multi-threading can help reduce the runtime. Please check the package vignette for guidance on fine-tuning.

Index of help topics:
SimFFPE-package          NGS Read Simulator for FFPE Tissue
calcPhredScoreProfile    Estimate Phred score profile for FFPE read simulation
readSimFFPE              Simulate noisy NGS reads of FFPE samples for whole genome / several chromosomes / large regions
targetReadSimFFPE        Simulate noisy NGS reads of FFPE samples in exonic / targeted regions
There are three available functions for NGS read simulation of FFPE samples:
1. calcPhredScoreProfile: Calculate positional Phred score profile from BAM file for read simulation.
2. readSimFFPE: Simulate noisy NGS reads of FFPE samples on whole genome, or several chromosomes, or large regions.
3. targetReadSimFFPE: Simulate noisy NGS reads of FFPE samples in exonic / targeted regions.
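
The chimeric-read mechanism described above (two ss-DNA fragments joined through short reverse complementary sequences) can be illustrated with a small base-R sketch. This is a simplified illustration only, not the algorithm SimFFPE implements: the helper `revComp`, the fragment sequences, and the overlap length `k` are all hypothetical and chosen for the example.

```r
# Hypothetical helper (not part of SimFFPE): reverse complement of a DNA
# string, using base R only.
revComp <- function(x) {
  paste(rev(strsplit(chartr("ACGT", "TGCA", x), "")[[1]]), collapse = "")
}

# Two made-up single-stranded fragments. The last 7 bases of fragA
# ("GATTACA") are the reverse complement of the last 7 bases of fragB
# ("TGTAATC"), so the two 3' ends could anneal to each other.
fragA <- "CCGTTAGGCAGATTACA"
fragB <- "AACCGTCTTGTAATC"
k <- 7  # assumed length of the short complementary stretch

endA <- substr(fragA, nchar(fragA) - k + 1, nchar(fragA))
endB <- substr(fragB, nchar(fragB) - k + 1, nchar(fragB))
canAnneal <- identical(endA, revComp(endB))

# If the ends can anneal, model the chimera as fragA extended with the
# reverse complement of the remaining (non-overlapping) part of fragB.
chimera <- if (canAnneal) {
  paste0(fragA, revComp(substr(fragB, 1, nchar(fragB) - k)))
} else {
  NA_character_
}
print(chimera)
```

In this toy model the resulting sequence joins material from two unrelated fragments, which is why such reads map to two different locations, adjacent or distant, on the reference.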
Lanying Wei
Maintainer: Lanying Wei <lanying.wei@uni-muenster.de>
calcPhredScoreProfile, readSimFFPE, targetReadSimFFPE
PhredScoreProfilePath <- system.file("extdata", "PhredScoreProfile2.txt",
                                     package = "SimFFPE")
PhredScoreProfile <- as.matrix(read.table(PhredScoreProfilePath, skip = 1))
colnames(PhredScoreProfile) <- read.table(PhredScoreProfilePath, nrows = 1,
                                          colClasses = "character")

referencePath <- system.file("extdata", "example.fasta", package = "SimFFPE")
reference <- readDNAStringSet(referencePath)

## Simulate reads of the first three sequences of the reference genome

sourceSeq <- reference[1:3]
outFile1 <- paste0(tempdir(), "/sim1")
readSimFFPE(sourceSeq, referencePath, PhredScoreProfile, outFile1,
            coverage = 80, enzymeCut = TRUE, threads = 4)

## Simulate reads for targeted regions

bamFilePath <- system.file("extdata", "example.bam", package = "SimFFPE")
regionPath <- system.file("extdata", "regionsBam.txt", package = "SimFFPE")
regions <- read.table(regionPath)
PhredScoreProfile <- calcPhredScoreProfile(bamFilePath, targetRegions = regions)

regionPath <- system.file("extdata", "regionsSim.txt", package = "SimFFPE")
targetRegions <- read.table(regionPath)

outFile <- paste0(tempdir(), "/sim2")
targetReadSimFFPE(referencePath, PhredScoreProfile, targetRegions, outFile,
                  coverage = 120, readLen = 100, meanInsertLen = 150,
                  sdInsertLen = 40, enzymeCut = FALSE)