test_de {glmGamPoi} | R Documentation |
Conduct a quasi-likelihood ratio test for a Gamma-Poisson fit.
test_de(
  fit,
  contrast,
  reduced_design = NULL,
  full_design = fit$model_matrix,
  subset_to = NULL,
  pseudobulk_by = NULL,
  pval_adjust_method = "BH",
  sort_by = NULL,
  decreasing = FALSE,
  n_max = Inf,
  verbose = FALSE
)
fit |
object of class |
contrast |
The contrast to test. Can be a single column name (quoted or as a string)
that is removed from the full model matrix of |
reduced_design |
a specification of the reduced design used as a comparison to see
how much better |
full_design |
option to specify an alternative |
subset_to |
a vector with the same length as |
pseudobulk_by |
a vector with the same length as |
pval_adjust_method |
one of the p-value adjustment methods from
p.adjust.methods. Default: "BH" |
sort_by |
the name of the column or an expression used to sort the result. If |
decreasing |
boolean that decides whether the result is sorted in increasing or decreasing
order. Default: FALSE |
n_max |
the maximum number of rows to return. Default: Inf |
verbose |
a boolean that indicates whether information about the individual steps is printed
while fitting the GLM. Default: FALSE |
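The pseudobulk idea behind pseudobulk_by can be illustrated in base R. The following is only a conceptual sketch, not the package's internal implementation: the count matrix, the grouping vector sample_id and the logical vector keep are made-up stand-ins, and summing counts per sample is assumed as the aggregation rule.

```r
# Toy count matrix: 3 genes (rows) x 6 cells (columns); all values are made up
counts <- matrix(c(1, 0, 2,
                   3, 1, 0,
                   0, 2, 2,
                   4, 0, 1,
                   1, 1, 1,
                   2, 3, 0),
                 nrow = 3, ncol = 6,
                 dimnames = list(paste0("gene", 1:3), paste0("cell", 1:6)))

# Hypothetical grouping vector with one entry per cell (column)
sample_id <- c("A", "A", "B", "B", "C", "C")

# rowsum() sums rows by group, so transpose to sum over cells, then transpose back
pseudobulk <- t(rowsum(t(counts), group = sample_id))
pseudobulk

# Hypothetical subsetting step: keep only some cells before aggregating
keep <- c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE)
pseudobulk_subset <- t(rowsum(t(counts[, keep, drop = FALSE]),
                              group = sample_id[keep]))
pseudobulk_subset
```

The result has one column per sample instead of one per cell, which is why the grouping vector needs one entry per column of the data.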
a data.frame with the following columns:
the rownames of the input data
the p-values of the quasi-likelihood ratio test
the adjusted p-values returned from p.adjust()
the F-statistic: F = (Dev_red - Dev_full) / (df_1 * disp_ql-shrunken)
the degrees of freedom of the test: ncol(design) - ncol(reduced_design)
the degrees of freedom of the fit: ncol(data) - ncol(design) + df_0
the log2-fold change. If the alternative model is specified by reduced_design, this column will be NA.
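As a rough illustration of how the statistic and the two degrees of freedom relate to the p-value columns, the sketch below assumes that the p-value is the upper tail of an F distribution, as is usual for a quasi-likelihood F-test. The numbers and the variable names df_test, df_fit and padj are hypothetical and do not come from a real fit.

```r
# Hypothetical F-statistics for three genes; not taken from a real fit
f_statistic <- c(0.5, 4.2, 12.8)
df_test <- 1     # degrees of freedom of the test
df_fit <- 97.3   # degrees of freedom of the fit (can be non-integer because of df_0)

# Upper tail of the F distribution gives the p-value
pval <- pf(f_statistic, df1 = df_test, df2 = df_fit, lower.tail = FALSE)

# Multiple-testing correction, here Benjamini-Hochberg ("BH")
padj <- p.adjust(pval, method = "BH")

data.frame(f_statistic, pval, padj)
```

Larger F-statistics give smaller p-values, and the adjusted p-values are never smaller than the raw ones.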
Lund, S. P., Nettleton, D., McCarthy, D. J., & Smyth, G. K. (2012). Detecting differential expression in RNA-sequence data using quasi-likelihood with shrunken dispersion estimates. Statistical Applications in Genetics and Molecular Biology, 11(5). https://doi.org/10.1515/1544-6115.1826.
Y <- matrix(rnbinom(n = 30 * 100, mu = 4, size = 0.3), nrow = 30, ncol = 100)
annot <- data.frame(sample = sample(LETTERS[1:6], size = 100, replace = TRUE),
                    cont1 = rnorm(100), cont2 = rnorm(100, mean = 30))
annot$condition <- ifelse(annot$sample %in% c("A", "B", "C"), "ctrl", "treated")
head(annot)
se <- SummarizedExperiment::SummarizedExperiment(Y, colData = annot)
fit <- glm_gp(se, design = ~ condition + cont1 + cont2)

# Test with reduced design
res <- test_de(fit, reduced_design = ~ condition + cont1)
head(res)

# Test with contrast argument, the results are identical
res2 <- test_de(fit, contrast = cont2)
head(res2)

# The column names of fit$Beta are valid variables in the contrast argument
colnames(fit$Beta)

# You can also have more complex contrasts:
# the following compares cont1 vs cont2:
test_de(fit, cont1 - cont2, n_max = 4)

# You can also sort the output
test_de(fit, cont1 - cont2, n_max = 4, sort_by = "pval")
test_de(fit, cont1 - cont2, n_max = 4, sort_by = - abs(f_statistic))

# If the data has multiple samples, it is a good
# idea to aggregate the cell counts by samples.
# This is called "pseudobulk".
test_de(fit, contrast = "conditiontreated", n_max = 4,
        pseudobulk_by = sample)

# You can also do the pseudobulk only on a subset of cells:
cell_types <- sample(c("Tcell", "Bcell", "Makrophages"), size = 100, replace = TRUE)
test_de(fit, contrast = "conditiontreated", n_max = 4,
        pseudobulk_by = sample, subset_to = cell_types == "Bcell")

# Be careful: if you included the cell type information in
# the original fit, the design matrix would be degenerate
# after subsetting. To fix this, specify the full_design in 'test_de()'
SummarizedExperiment::colData(se)$ct <- cell_types
fit_with_celltype <- glm_gp(se, design = ~ condition + cont1 + cont2 + ct)
test_de(fit_with_celltype, contrast = cont1, n_max = 4,
        full_design = ~ condition + cont1 + cont2,
        pseudobulk_by = sample, subset_to = ct == "Bcell")
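The sort_by and n_max calls above can be mimicked by hand on a plain data.frame. This sketch assumes that sorting is applied before the table is truncated to n_max rows. The helper sort_and_truncate, the mock table and its gene column are hypothetical and are not part of the package.

```r
# Mock result table with made-up values; only pval and f_statistic are
# column names taken from the examples, gene is a hypothetical label column
res_mock <- data.frame(
  gene = paste0("gene", 1:5),
  pval = c(0.20, 0.01, 0.65, 0.03, 0.90),
  f_statistic = c(1.7, 6.9, 0.2, 4.8, 0.02),
  stringsAsFactors = FALSE
)

# Hypothetical helper: order the rows by a key, then keep at most n_max rows
sort_and_truncate <- function(tab, key, decreasing = FALSE, n_max = Inf) {
  tab <- tab[order(key, decreasing = decreasing), , drop = FALSE]
  head(tab, n = min(nrow(tab), n_max))
}

# Counterpart of sort_by = "pval"
top_by_pval <- sort_and_truncate(res_mock, res_mock$pval, n_max = 2)

# Counterpart of sort_by = - abs(f_statistic)
top_by_f <- sort_and_truncate(res_mock, -abs(res_mock$f_statistic), n_max = 2)

top_by_pval
top_by_f
```

Negating the absolute statistic puts the largest F-statistics first while leaving decreasing at FALSE.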