getCoreGene {PhyloProfile} | R Documentation |
Identify core genes for a list of selected (super)taxa. A core gene must be present in at least a given proportion of the species in each selected (super)taxon (set via percentCutoff), and this criterion must be fulfilled for a given percentage of the selected taxa, or for all of them (set via coreCoverage).
getCoreGene(rankName, taxaCore = c("none"), profileDt, taxaCount, var1Cutoff = c(0, 1), var2Cutoff = c(0, 1), percentCutoff = c(0, 1), coreCoverage = 100)
rankName |
working taxonomy rank (e.g. "species", "genus", "family") |
taxaCore |
list of selected taxon names |
profileDt |
dataframe containing the full processed phylogenetic profiles (see ?fullProcessedProfile or ?parseInfoProfile) |
taxaCount |
dataframe counting present taxa in each supertaxon |
var1Cutoff |
cutoff for var1. Default = c(0, 1). |
var2Cutoff |
cutoff for var2. Default = c(0, 1). |
percentCutoff |
cutoff for percentage of species present in each supertaxon. Default = c(0, 1). |
coreCoverage |
minimum percentage of the selected taxa that must fulfill the presence criterion. Default = 100. |
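The interplay of percentCutoff and coreCoverage can be illustrated with a minimal base-R sketch on toy data. This is not PhyloProfile's implementation: the function findCoreGenesSketch, the toy table hits and its column names are hypothetical, the var1/var2 filters are left out, and the sketch assumes that a gene absent from a supertaxon never counts as passing there.

```r
# Toy presence table: one row per (gene, species) hit. All names are hypothetical.
hits <- data.frame(
    geneID     = c("g1", "g1", "g1", "g1", "g2", "g2", "g3"),
    supertaxon = c("A",  "A",  "B",  "B",  "A",  "B",  "A"),
    species    = c("a1", "a2", "b1", "b2", "a1", "b1", "a2"),
    stringsAsFactors = FALSE
)

# Number of input species per supertaxon (same shape as a count table
# with columns "supertaxon" and "freq").
taxaCount <- data.frame(
    supertaxon = c("A", "B"), freq = c(2, 2), stringsAsFactors = FALSE
)

findCoreGenesSketch <- function(
    hits, taxaCount, taxaCore, percentCutoff = c(0, 1), coreCoverage = 100
) {
    # Keep only hits in the selected supertaxa and drop duplicate rows
    cols <- c("geneID", "supertaxon", "species")
    hits <- unique(hits[hits$supertaxon %in% taxaCore, cols])

    # Species with a hit, per gene (rows) and selected supertaxon (columns)
    counts <- unclass(
        table(hits$geneID, factor(hits$supertaxon, levels = taxaCore))
    )

    # Fraction of each supertaxon's species in which the gene is present
    total <- taxaCount$freq[match(taxaCore, taxaCount$supertaxon)]
    fraction <- sweep(counts, 2, total, "/")

    # A gene passes in a supertaxon if it is present there and its
    # fraction lies inside the percentCutoff range
    pass <- counts > 0 &
        fraction >= percentCutoff[1] & fraction <= percentCutoff[2]

    # Keep genes that pass in at least coreCoverage percent of taxaCore
    nPass <- rowSums(pass)
    needed <- length(taxaCore) * coreCoverage / 100
    names(nPass)[nPass >= needed]
}

findCoreGenesSketch(hits, taxaCount, c("A", "B"), c(0.75, 1), 100)
findCoreGenesSketch(hits, taxaCount, c("A", "B"), c(0.5, 1), 50)
```

In this toy data only g1 is found in every species of both supertaxa, so it is the only gene kept by the first call. Relaxing percentCutoff to c(0.5, 1) and coreCoverage to 50 also keeps g2 and g3, which are present in half of the species of at least one selected supertaxon.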
A list of identified core genes.
Vinh Tran tran@bio.uni-frankfurt.de
parseInfoProfile for creating a full processed profile dataframe
data("fullProcessedProfile", package="PhyloProfile")
rankName <- "class"
refTaxon <- "Mammalia"
taxaCore <- c("Mammalia", "Saccharomycetes", "Insecta")
profileDt <- fullProcessedProfile
taxonIDs <- levels(as.factor(fullProcessedProfile$ncbiID))
sortedInputTaxa <- sortInputTaxa(
    taxonIDs, rankName, refTaxon, NULL
)
taxaCount <- plyr::count(sortedInputTaxa, "supertaxon")
var1Cutoff <- c(0.75, 1.0)
var2Cutoff <- c(0.75, 1.0)
percentCutoff <- c(0.0, 1.0)
coreCoverage <- 100
getCoreGene(
    rankName,
    taxaCore,
    profileDt,
    taxaCount,
    var1Cutoff, var2Cutoff,
    percentCutoff, coreCoverage
)