simData {muscat}    R Documentation

simData

Description

Simulation of complex scRNA-seq data

Usage

simData(
  x,
  n_genes = 500,
  n_cells = 300,
  probs = NULL,
  p_dd = diag(6)[1, ],
  p_type = 0,
  lfc = 2,
  rel_lfc = NULL
)

Arguments

x

a SingleCellExperiment.

n_genes

number of genes to simulate.

n_cells

number of cells to simulate. Either a single numeric or a range to sample from.

probs

a list of length 3 containing probabilities of a cell belonging to each cluster, sample, and group, respectively. List elements must be NULL (equal probabilities) or numeric values in [0, 1] that sum to 1.

p_dd

numeric vector of length 6. Specifies the probability of a gene being EE, EP, DE, DP, DM, or DB, respectively.

p_type

numeric. Probability of an EE/EP gene being a type-gene. If a gene is of class "type" in a given cluster, a unique mean will be used for that gene in the respective cluster.

lfc

numeric value to use as mean logFC for genes of type DE, DP, DM, and DB.

rel_lfc

numeric vector of relative logFCs for each cluster. Should be of length nlevels(x$cluster_id) with levels(x$cluster_id) as names. Defaults to a factor of 1 for all clusters.
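
As a rough illustration of the argument formats described above, the following base R sketch builds values for probs, p_dd and rel_lfc and checks them against the documented constraints. The helper check_probs and the cluster labels are hypothetical and not part of muscat, and the sum-to-one check on p_dd is an assumption based on its entries being probabilities.

```r
# Hypothetical helper (not part of muscat):
# checks the constraints documented for 'probs'.
check_probs <- function(probs) {
  stopifnot(is.list(probs), length(probs) == 3)
  for (p in probs) {
    if (is.null(p)) next  # NULL means equal probabilities
    stopifnot(is.numeric(p), all(p >= 0 & p <= 1),
      isTRUE(all.equal(sum(p), 1)))
  }
  invisible(TRUE)
}

# equal cluster and group probabilities, unbalanced samples
probs <- list(NULL, c(0.25, 0.75), NULL)
check_probs(probs)

# p_dd: one probability per category, in the documented order
cats <- c("EE", "EP", "DE", "DP", "DM", "DB")
p_default <- setNames(diag(6)[1, ], cats)  # default: all genes EE
p_dd <- c(0.9, 0, 0.1, 0, 0, 0)            # 90% EE, 10% DE
stopifnot(length(p_dd) == 6, isTRUE(all.equal(sum(p_dd), 1)))

# rel_lfc: one value per cluster, named by cluster level.
# 'cluster_id' stands in for x$cluster_id; the labels are made up.
cluster_id <- factor(c("B cells", "T cells", "Monocytes", "T cells"))
rel_lfc <- setNames(rep(1, nlevels(cluster_id)), levels(cluster_id))
rel_lfc["T cells"] <- 2  # hypothetical: larger relative logFC in one cluster
```

Values built this way could then be passed on as simData(x, probs = probs, p_dd = p_dd, rel_lfc = rel_lfc), provided the names of rel_lfc match levels(x$cluster_id) of the actual input.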

Details

simData simulates multiple clusters and samples across 2 experimental conditions from a real scRNA-seq data set.

Value

a SingleCellExperiment containing multiple clusters and samples across 2 groups.

Author(s)

Helena L Crowell

References

Crowell, HL, Soneson, C, Germain, P-L, Calini, D, Collin, L, Raposo, C, Malhotra, D & Robinson, MD: On the discovery of population-specific state transitions from multi-sample multi-condition single-cell RNA sequencing data. bioRxiv 713412 (2018). doi: https://doi.org/10.1101/713412

Examples

data(sce)
library(SingleCellExperiment)

# prep. SCE for simulation
sce <- prepSim(sce)

# simulate data
(sim <- simData(sce,
  n_genes = 100, n_cells = 10,
  p_dd = c(0.9, 0, 0.1, 0, 0, 0)))

# simulation metadata
head(gi <- metadata(sim)$gene_info)

# should be ~10% DE  
table(gi$category)

# unbalanced sample sizes
sim <- simData(sce,
  n_genes = 10, n_cells = 100,
  probs = list(NULL, c(0.25, 0.75), NULL))
table(sim$sample_id)

# one group only
sim <- simData(sce,
  n_genes = 10, n_cells = 100,
  probs = list(NULL, NULL, c(1, 0)))
levels(sim$group_id)
    

[Package muscat version 1.0.1 Index]