agglomerate-methods {mia} | R Documentation |
agglomerateByRank sums up data based on the association of features with a taxonomic rank given in rowData. Only the ranks reported by taxonomyRanks(x) can be used.
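The idea of summing rows that share a rank can be sketched in base R. This is only an illustration of the concept, not the mia implementation: the count matrix, the sample names and the `family` vector are made-up stand-ins for an assay and a rowData column.

```r
# Toy count matrix: 4 taxa (rows) x 2 samples (columns). All values are made up.
counts <- matrix(
  c(10, 0,
     5, 3,
     2, 8,
     1, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("taxon", 1:4), c("sampleA", "sampleB"))
)

# Hypothetical Family assignment for each row, standing in for a rowData column.
family <- c("Fam1", "Fam1", "Fam2", "Fam2")

# rowsum() adds up the rows that share a group label.
agglomerated <- rowsum(counts, group = family)
agglomerated
```

The result has one row per family, and each cell holds the sum of the counts of the taxa assigned to that family.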
## S4 method for signature 'SummarizedExperiment'
agglomerateByRank(
  x,
  rank = taxonomyRanks(x)[1],
  onRankOnly = FALSE,
  na.rm = FALSE,
  empty.fields = c(NA, "", " ", "\t", "-", "_"),
  ...
)

## S4 method for signature 'SingleCellExperiment'
agglomerateByRank(x, ..., altexp = NULL, strip_altexp = TRUE)

## S4 method for signature 'TreeSummarizedExperiment'
agglomerateByRank(x, ..., agglomerateTree = FALSE)
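x | a SummarizedExperiment object |
rank | a single character defining a taxonomic rank. Must be a value of taxonomyRanks(x). |
onRankOnly | TRUE or FALSE: whether to use information from the specified rank only, or from that rank and the ranks above it (default: FALSE) |
na.rm | TRUE or FALSE: whether taxa with an empty value at the specified rank are removed (default: FALSE) |
empty.fields | a character vector of values to be regarded as empty (default: c(NA, "", " ", "\t", "-", "_")) |
... | arguments passed to |
altexp | String or integer scalar specifying an alternative experiment containing the input data. |
strip_altexp | TRUE or FALSE: whether alternative experiments are removed before agglomeration (default: TRUE) |
agglomerateTree | TRUE or FALSE: whether rowTree() is also agglomerated (default: FALSE) |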
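How empty.fields and na.rm relate can be sketched in base R. This is an assumption-laden illustration rather than the package's internal code: the `family` vector is hypothetical, and only the list of empty markers is taken from the default shown in the usage.

```r
# Values treated as "empty", copied from the default of 'empty.fields'.
empty_fields <- c(NA, "", " ", "\t", "-", "_")

# Hypothetical Family column with two missing labels.
family <- c("Fam1", "", "Fam2", NA, "Fam1")

# %in% matches NA against NA, so a single test covers all empty markers.
is_empty <- family %in% empty_fields
is_empty

# Sketch of the effect of na.rm = TRUE: rows with an empty label are dropped.
kept <- family[!is_empty]
kept
```

Only the three rows with a real family label remain in `kept`.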
Depending on the available taxonomic data and its structure, setting onRankOnly = TRUE has certain implications for the interpretability of your results. If no loops exist (a loop meaning two higher ranks containing the same lower rank), the results should be comparable. You can check for loops using detectLoop.
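What a loop looks like, and why it matters for grouping, can be sketched in base R. The taxonomy table below is hypothetical, and the check is a hand-written illustration of the definition given above, not the detectLoop implementation. Reading "Family alone" as the onRankOnly = TRUE case is likewise an interpretation of the argument description.

```r
# Hypothetical taxonomy table; "FamX" appears under two different orders.
tax <- data.frame(
  Order  = c("Ord1", "Ord1", "Ord2", "Ord2"),
  Family = c("FamA", "FamX", "FamX", "FamB"),
  stringsAsFactors = FALSE
)

# For each family, count the distinct orders it is listed under.
orders_per_family <- tapply(tax$Order, tax$Family,
                            function(o) length(unique(o)))

# Families that occur under more than one order form a loop.
looped <- names(orders_per_family)[orders_per_family > 1]
looped

# Grouping on Family alone merges both "FamX" rows into one group ...
n_groups_family_only <- length(unique(tax$Family))
# ... while grouping on Order and Family together keeps them apart.
n_groups_with_order <- nrow(unique(tax[, c("Order", "Family")]))
c(n_groups_family_only, n_groups_with_order)
```

When `looped` is empty, both ways of grouping give the same number of groups, which is the sense in which the results are comparable.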
Agglomeration sums up the values of assays at the specified taxonomic level. Summing certain assays, e.g. those that include binary or negative values, can produce meaningless values. In those cases, consider agglomerating first and transforming afterwards.
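The problem with summing binary values can be seen in a small base-R sketch. The presence/absence matrix and the `genus` labels are made up, and rowsum() stands in for the summation step; this is not how mia computes the result.

```r
# Toy presence/absence matrix (1 = detected, 0 = not detected); values are made up.
pa <- matrix(
  c(1, 0,
    1, 1,
    0, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("taxon", 1:3), c("sampleA", "sampleB"))
)
genus <- c("GenA", "GenA", "GenB")  # hypothetical genus labels

# Summing binary values yields counts of detected taxa, not presence/absence.
summed <- rowsum(pa, group = genus)
summed

# Re-deriving presence/absence after agglomeration restores a binary assay.
pa_again <- (summed > 0) * 1
pa_again
```

Here `summed` contains a 2 for GenA in sampleA, which is no longer a valid presence/absence value, whereas `pa_again` holds only 0 and 1.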
A taxonomically-agglomerated, optionally-pruned object of the same class as x.
mergeRows, sumCountsAcrossFeatures
library(mia)
data(GlobalPatterns)

# print the available taxonomic ranks
colnames(rowData(GlobalPatterns))
taxonomyRanks(GlobalPatterns)

# agglomerate at the Family taxonomic rank
x1 <- agglomerateByRank(GlobalPatterns, rank="Family")
## How many taxa before/after agglomeration?
nrow(GlobalPatterns)
nrow(x1)

# with agglomeration of the tree
x2 <- agglomerateByRank(GlobalPatterns, rank="Family",
                        agglomerateTree = TRUE)
nrow(x2) # same number of rows, but
rowTree(x1) # ... different
rowTree(x2) # ... tree

# If assay contains binary or negative values, summing might lead to meaningless
# values, and you will get a warning. In these cases, you might want to apply
# the transformation again after agglomerating at the chosen taxonomic level.
tse <- transformSamples(GlobalPatterns, method = "pa")
tse <- agglomerateByRank(tse, rank = "Genus")
tse <- transformSamples(tse, method = "pa")

# removing empty labels by setting na.rm = TRUE
sum(is.na(rowData(GlobalPatterns)$Family))
x3 <- agglomerateByRank(GlobalPatterns, rank="Family", na.rm = TRUE)
nrow(x3) # different from x2

# Because all the rownames are from the same rank, rownames do not include
# prefixes, in this case "Family:".
print(rownames(x3[1:3,]))

# To add them, use getTaxonomyLabels function.
rownames(x3) <- getTaxonomyLabels(x3, with_rank = TRUE)
print(rownames(x3[1:3,]))

# use 'remove_empty_ranks' to remove columns that include only NAs
x4 <- agglomerateByRank(GlobalPatterns, rank="Phylum", remove_empty_ranks = TRUE)
head(rowData(x4))

## Look at enterotype dataset...
data(enterotype)
## print the available taxonomic ranks. Shows only 1 rank available
## not useful for agglomerateByRank
taxonomyRanks(enterotype)