taxonomyFromRefpkg {clstutils}    R Documentation

Extract taxonomic information from a refpkg.

Description

Construct a data.frame providing the lineage of each sequence represented in the reference package.

Usage

taxonomyFromRefpkg(path, seqnames, lowest_rank = NA)

Arguments

path

path to a refpkg directory

seqnames

optional character vector of sequence names. If provided, it determines the order of rows in the taxTab element of the returned list.

lowest_rank

name of the most specific (i.e., rightmost) rank to include. The default (NA) uses the name of the rightmost column in refpkg_contents$taxonomy.

Value

A list with the following elements:

taxNames

a named character vector of taxonomic names (names are tax_ids)

taxTab

a data.frame in which each row corresponds to a reference sequence and contains a tax_id followed by the corresponding lineage (columns are "root"...lowest_rank)
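
The relationship between the two elements can be sketched with a toy stand-in for the returned list. This is only an illustration under stated assumptions: the list below is built by hand (it is not output from taxonomyFromRefpkg), the sequence names and rank columns are hypothetical, and it assumes that the lineage columns of taxTab hold tax_ids and that rows are labelled by sequence name. Under those assumptions, taxNames can be used to translate each lineage cell into a taxonomic name:

```r
# Hand-built stand-in for the list described above (hypothetical data,
# not produced by taxonomyFromRefpkg).
reftax <- list(
  taxNames = c("1" = "root", "2" = "Bacteria",
               "1578" = "Lactobacillus", "2701" = "Gardnerella"),
  taxTab = data.frame(
    tax_id       = c("1578", "2701"),
    root         = c("1", "1"),
    superkingdom = c("2", "2"),
    genus        = c("1578", "2701"),
    row.names    = c("seqA", "seqB"),   # assumed: rows labelled by sequence name
    stringsAsFactors = FALSE
  )
)

# Lineage columns are everything except the leading tax_id column.
ranks <- setdiff(colnames(reftax$taxTab), "tax_id")

# Replace each tax_id in the lineage columns with its name from taxNames.
named <- reftax$taxTab
named[ranks] <- lapply(named[ranks], function(ids) {
  unname(reftax$taxNames[as.character(ids)])
})

print(named)
```

If the real taxTab stores lineage values differently (for example as factors or as names rather than tax_ids), the lookup step would need to be adapted accordingly.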

Author(s)

Noah Hoffman

References

The description and specification for a reference package can be found in the project repository on GitHub: https://github.com/fhcrc/taxtastic

Scripts and tools for creating reference packages are provided in the Python package taxonomy, also available from the taxtastic project site.

See Also

refpkgContents

Examples

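library(clstutils)
archive <- 'vaginal_16s.refpkg.tar.gz'
destdir <- tempdir()
## unpack the example refpkg shipped with clstutils into a temporary directory
untar(system.file('extdata', archive, package='clstutils'), exdir=destdir)
## remove the literal '.tar.gz' suffix to get the refpkg directory name
refpkg <- file.path(destdir, sub('.tar.gz', '', archive, fixed=TRUE))
reftax <- taxonomyFromRefpkg(refpkg)
str(reftax)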

[Package clstutils version 1.38.0 Index]