simData {muscat} | R Documentation |
Simulation of complex scRNA-seq data
simData(
  x,
  n_genes = 500,
  n_cells = 300,
  probs = NULL,
  p_dd = diag(6)[1, ],
  p_type = 0,
  lfc = 2,
  rel_lfc = NULL
)
x |
a SingleCellExperiment. |
n_genes |
number of genes to simulate. |
n_cells |
number of cells to simulate. Either a single numeric or a range to sample from. |
probs |
a list of length 3 containing probabilities of a cell belonging to each cluster, sample, and group, respectively. List elements must be NULL (equal probabilities) or numeric values in [0, 1] that sum to 1. |
p_dd |
numeric vector of length 6. Specifies the probability of a gene being EE, EP, DE, DP, DM, or DB, respectively. |
p_type |
numeric. Probability of an EE/EP gene being a type gene. If a gene is of class "type" in a given cluster, a unique mean will be used for that gene in the respective cluster. |
lfc |
numeric value to use as the mean logFC for genes of category DE, DP, DM, or DB. |
rel_lfc |
numeric vector of relative logFCs for each cluster.
Should be of length |
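
The constraints on p_dd and probs described above can be checked before calling simData. The following is a minimal base-R sketch, not part of muscat: the helper is_valid_probs and the variable names are hypothetical, and the tolerance is an arbitrary choice.

```r
# Category labels, in the order documented for p_dd.
cats <- c("EE", "EP", "DE", "DP", "DM", "DB")

# The default, diag(6)[1, ], puts all probability mass on EE.
p_default <- setNames(diag(6)[1, ], cats)

# A custom setting: 90% EE and 10% DE genes.
p_custom <- setNames(c(0.9, 0, 0.1, 0, 0, 0), cats)

# Hypothetical helper (not a muscat function): TRUE if p is NULL, or a
# numeric vector with entries in [0, 1] that sum to 1 (up to tol).
is_valid_probs <- function(p, tol = 1e-8) {
    is.null(p) ||
        (is.numeric(p) && all(p >= 0 & p <= 1) && abs(sum(p) - 1) < tol)
}

# probs holds cluster, sample, and group probabilities, in that order;
# NULL requests equal probabilities for that level.
probs <- list(NULL, c(0.25, 0.75), c(1, 0))

stopifnot(
    length(p_custom) == 6, is_valid_probs(p_custom),
    length(probs) == 3, all(vapply(probs, is_valid_probs, logical(1)))
)
```

Naming the entries of p_dd is optional here; it only makes the positional meaning of the six probabilities explicit.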
simData simulates multiple clusters and samples across 2 experimental conditions from a real scRNA-seq data set.
a SingleCellExperiment containing multiple clusters and samples across 2 groups.
Helena L Crowell
Crowell, HL, Soneson, C, Germain, P-L, Calini, D, Collin, L, Raposo, C, Malhotra, D & Robinson, MD: On the discovery of population-specific state transitions from multi-sample multi-condition single-cell RNA sequencing data. bioRxiv 713412 (2018). doi: https://doi.org/10.1101/713412
data(sce)
library(SingleCellExperiment)

# prepare the SCE for simulation
sce <- prepSim(sce)

# simulate data
(sim <- simData(sce,
    n_genes = 100, n_cells = 10,
    p_dd = c(0.9, 0, 0.1, 0, 0, 0)))

# simulation metadata
head(gi <- metadata(sim)$gene_info)

# should be ~10% DE
table(gi$category)

# unbalanced sample sizes
sim <- simData(sce,
    n_genes = 10, n_cells = 100,
    probs = list(NULL, c(0.25, 0.75), NULL))
table(sim$sample_id)

# one group only
sim <- simData(sce,
    n_genes = 10, n_cells = 100,
    probs = list(NULL, NULL, c(1, 0)))
levels(sim$group_id)
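
To see why the first example expects roughly 10% DE genes, the expected category counts can be worked out from n_genes and p_dd alone. The sketch below uses base R only. The random draw is purely illustrative and is an assumption: it is not claimed to reproduce how simData assigns categories internally.

```r
n_genes <- 100
p_dd <- c(EE = 0.9, EP = 0, DE = 0.1, DP = 0, DM = 0, DB = 0)

# Expected number of genes per category.
expected <- n_genes * p_dd

# One random assignment of categories, for illustration only
# (assumption: independent draws with probabilities p_dd).
set.seed(1)
draw <- sample(names(p_dd), n_genes, replace = TRUE, prob = p_dd)
observed <- table(factor(draw, levels = names(p_dd)))

# Side-by-side comparison of expected and observed counts.
print(rbind(expected = expected, observed = as.vector(observed)))
```

Because the categories are drawn at random, observed counts fluctuate around the expected ones, which is why the example comment says "~10%" rather than exactly 10%.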