DA_Seurat {benchdamic}    R Documentation

DA_Seurat

Description

Fast run of the Seurat differential abundance detection method.

Usage

DA_Seurat(
  object,
  pseudo_count = FALSE,
  test.use = "wilcox",
  contrast,
  norm = c("TMM", "TMMwsp", "RLE", "upperquartile", "posupperquartile", "none",
    "ratio", "poscounts", "iterate", "TSS", "CSSmedian", "CSSdefault"),
  verbose = TRUE
)

Arguments

object

phyloseq object.

pseudo_count

add 1 to all counts if TRUE (default pseudo_count = FALSE).

test.use

Denotes which test to use. Available options are:

  • "wilcox" : Identifies differentially expressed genes between two groups of cells using a Wilcoxon Rank Sum test (default).

  • "bimod" : Likelihood-ratio test for single cell gene expression (McDavid et al., Bioinformatics, 2013).

  • "roc" : Identifies 'markers' of gene expression using ROC analysis. For each gene, evaluates (using AUC) a classifier built on that gene alone, to classify between two groups of cells. An AUC value of 1 means that expression values for this gene alone can perfectly classify the two groupings (i.e., each of the cells in cells.1 exhibits a higher level than each of the cells in cells.2). An AUC value of 0 also means there is perfect classification, but in the other direction. A value of 0.5 implies that the gene has no predictive power to classify the two groups. Returns a 'predictive power' (abs(AUC-0.5) * 2) ranked matrix of putative differentially expressed genes.

  • "t" : Identify differentially expressed genes between two groups of cells using the Student's t-test.

  • "negbinom" : Identifies differentially expressed genes between two groups of cells using a negative binomial generalized linear model. Use only for UMI-based datasets.

  • "poisson" : Identifies differentially expressed genes between two groups of cells using a Poisson generalized linear model. Use only for UMI-based datasets.

  • "LR" : Uses a logistic regression framework to determine differentially expressed genes. Constructs a logistic regression model predicting group membership based on each feature individually and compares this to a null model with a likelihood ratio test.

  • "MAST" : Identifies differentially expressed genes between two groups of cells using a hurdle model tailored to scRNA-seq data. Utilizes the MAST package to run the DE testing.

  • "DESeq2" : Identifies differentially expressed genes between two groups of cells based on a model using DESeq2, which uses a negative binomial distribution (Love et al., Genome Biology, 2014). This test does not support pre-filtering of genes based on average difference (or percent detection rate) between cell groups. However, genes may be pre-filtered based on their minimum detection rate (min.pct) across both cell groups. To use this method, please install DESeq2, using the instructions at https://bioconductor.org/packages/release/bioc/html/DESeq2.html
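
The AUC and 'predictive power' quantities described for "roc" can be illustrated in base R. This is a minimal sketch, not the code Seurat runs internally: the helper names auc_two_groups and predictive_power and the toy vectors are hypothetical, and ties are assumed to count as one half.

```r
# AUC for a single feature: the probability that a randomly chosen value
# from group 1 exceeds a randomly chosen value from group 2.
# Assumption: ties contribute one half.
auc_two_groups <- function(x, y) {
  greater <- outer(x, y, ">")
  ties <- outer(x, y, "==")
  mean(greater + 0.5 * ties)
}

# Predictive power as given in the text: abs(AUC - 0.5) * 2
predictive_power <- function(auc) abs(auc - 0.5) * 2

g1 <- c(5, 7, 9)  # hypothetical values for cells.1
g2 <- c(1, 2, 3)  # hypothetical values for cells.2

auc_up   <- auc_two_groups(g1, g2)              # every g1 value above every g2 value
auc_down <- auc_two_groups(g2, g1)              # same separation, other direction
auc_none <- auc_two_groups(c(1, 3), c(1, 3))    # identical groups

power_up   <- predictive_power(auc_up)
power_down <- predictive_power(auc_down)
power_none <- predictive_power(auc_none)
```

Both directions of perfect separation map to the same predictive power, which is why the ranking is based on abs(AUC-0.5) * 2 and not on the AUC itself.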

contrast

character vector with exactly three elements: the name of a factor in the design formula, the name of the numerator level for the fold change, and the name of the denominator level for the fold change.
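
A contrast of this shape can be checked against the sample metadata before running the analysis. The sketch below uses base R only; check_contrast is a hypothetical helper, not part of benchdamic, and it assumes the factor is a column of the sample metadata, as in the Examples section.

```r
# Toy metadata mirroring the Examples section
metadata <- data.frame(
  Sample = c("S1", "S2", "S3", "S4", "S5", "S6"),
  group = factor(c("A", "A", "A", "B", "B", "B"))
)

# Factor name, numerator level, denominator level
contrast <- c("group", "B", "A")

# Hypothetical helper: stops with a message if the contrast is malformed
check_contrast <- function(contrast, metadata) {
  stopifnot(is.character(contrast), length(contrast) == 3)
  variable <- contrast[1]
  if (!variable %in% colnames(metadata)) {
    stop("variable '", variable, "' not found in metadata")
  }
  available <- levels(as.factor(metadata[[variable]]))
  absent <- setdiff(contrast[2:3], available)
  if (length(absent) > 0) {
    stop("levels not found in '", variable, "': ",
         paste(absent, collapse = ", "))
  }
  invisible(TRUE)
}

ok <- check_contrast(contrast, metadata)
```

With c("group", "B", "A"), fold changes are expressed as B relative to A.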

norm

name of the normalization method used to compute the normalization factors to use in the differential abundance analysis. If norm is equal to "TMM", "TMMwsp", "RLE", "upperquartile", "posupperquartile", "CSSmedian", "CSSdefault", or "TSS", the scaling factors are automatically transformed into normalization factors.
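
This page does not spell out how scaling factors become normalization factors. One common convention, assumed here and not confirmed by this documentation, multiplies each scaling factor by the sample's library size and rescales the result so that the factors have a geometric mean of 1. The counts and scaling factors below are made up for illustration.

```r
# Toy count matrix: 3 taxa (rows) x 3 samples (columns)
counts <- matrix(c(10, 20, 30,
                   5, 10, 15,
                   40, 20, 60), nrow = 3)

# Hypothetical per-sample scaling factors (e.g. as a TMM-like method might return)
scaling <- c(1.0, 0.8, 1.25)

# Assumed convention: normalization factor = scaling factor * library size
lib_size <- colSums(counts)
nf <- scaling * lib_size

# Rescale so that the factors have geometric mean 1 (their product is 1)
nf <- nf / exp(mean(log(nf)))

# Divide each sample (column) by its normalization factor
norm_counts <- sweep(counts, 2, nf, "/")
```

Rescaling to a geometric mean of 1 keeps the normalized counts on roughly the same scale as the raw counts.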

verbose

an optional logical value. If TRUE, information about the steps of the algorithm is printed. Default verbose = TRUE.

Value

A list object containing the matrix of p-values 'pValMat', the matrix of summary statistics for each tag 'statInfo', and a suggested 'name' of the final object considering the parameters passed to the function.

See Also

CreateSeuratObject to create the Seurat object, AddMetaData to add metadata information, NormalizeData to compute the normalization for the counts, FindVariableFeatures to estimate the mean-variance trend, ScaleData to scale and center features in the dataset, and FindMarkers to perform differential abundance analysis.

Examples

set.seed(1)
# Create a very simple phyloseq object
counts <- matrix(rnbinom(n = 60, size = 3, prob = 0.5), nrow = 10, ncol = 6)
metadata <- data.frame("Sample" = c("S1", "S2", "S3", "S4", "S5", "S6"),
                       "group" = as.factor(c("A", "A", "A", "B", "B", "B")))
ps <- phyloseq::phyloseq(phyloseq::otu_table(counts, taxa_are_rows = TRUE),
                         phyloseq::sample_data(metadata))
# No use of scaling factors
ps_NF <- norm_edgeR(object = ps, method = "none")
# The phyloseq object now contains the scaling factors:
scaleFacts <- phyloseq::sample_data(ps_NF)[, "NF.none"]
head(scaleFacts)
# Differential abundance
DA_Seurat(object = ps_NF, contrast = c("group","B","A"), norm = "none")

[Package benchdamic version 1.0.0 Index]