derepFastq {dada2} | R Documentation |
Read in and dereplicate a fastq file.
Description
A custom interface to FastqStreamer
for dereplicating amplicon sequences from fastq or compressed fastq files,
while also controlling peak memory requirement to support large files.
Usage
derepFastq(fls, n = 1e+06, verbose = FALSE, qualityType = "Auto")
Arguments
fls |
(Required). character .
The file path(s) to the fastq file(s), or a directory containing fastq file(s).
Compressed file formats such as .fastq.gz and .fastq.bz2 are supported.
|
n |
(Optional). numeric(1) .
The maximum number of records (reads) to parse and dereplicate
at any one time. This controls the peak memory requirement
so that large fastq files are supported.
Default is 1e6 , one-million reads.
See FastqStreamer for details on this parameter,
which is passed on.
|
verbose |
(Optional). Default FALSE.
If TRUE, throw standard R message s
on the intermittent and final status of the dereplication.
|
qualityType |
(Optional). character(1) .
The quality encoding of the fastq file(s). "Auto" (the default) means to
attempt to auto-detect the encoding. This may fail for PacBio files with
uniformly high quality scores, in which case use "FastqQuality". This
parameter is passed on to readFastq ; see
information there for details.
|
Value
A derep-class
object or list of such objects.
Examples
# Test that chunk-size, `n`, does not affect the result.
testFastq = system.file("extdata", "sam1F.fastq.gz", package="dada2")
derep1 = derepFastq(testFastq, verbose = TRUE)
derep1.35 = derepFastq(testFastq, n = 35, verbose = TRUE)
all.equal(getUniques(derep1), getUniques(derep1.35)[names(getUniques(derep1))])
[Package
dada2 version 1.20.0
Index]