glm_gp_impl {glmGamPoi} | R Documentation |
Internal Function to Fit a Gamma-Poisson GLM
glm_gp_impl(
  Y,
  model_matrix,
  offset = 0,
  size_factors = c("normed_sum", "deconvolution", "poscounts"),
  overdispersion = TRUE,
  overdispersion_shrinkage = TRUE,
  do_cox_reid_adjustment = TRUE,
  subsample = FALSE,
  verbose = FALSE
)
Y |
any matrix-like object (e.g. |
model_matrix |
a numeric matrix that specifies the experimental
design. It can be produced using |
offset |
Constant offset in the model in addition to |
size_factors |
in large-scale experiments, each sample is typically of a different size
(for example, different sequencing depths). A size factor is an internal mechanism of GLMs to
correct for this effect. |
overdispersion |
the simplest count model is the Poisson model. However, the Poisson model
assumes that variance = mean. For many applications this is too rigid, and the Gamma-Poisson model
allows a more flexible mean-variance relation (variance = mean + mean^2 * overdispersion).
Note that |
overdispersion_shrinkage |
the overdispersion can be difficult to estimate with few replicates. To
improve the overdispersion estimates, we can share information across genes and shrink each individual
overdispersion estimate towards a global overdispersion estimate. Empirical studies show, however, that
the overdispersion varies with the mean expression level (lower expression level => higher
dispersion). If |
do_cox_reid_adjustment |
the classical maximum likelihood estimator of the |
subsample |
the estimation of the overdispersion is the slowest step when fitting
a Gamma-Poisson GLM. For datasets with many samples, the estimation can be considerably sped up
without losing much precision by fitting the overdispersion only on a random subset of the samples.
Default: FALSE
verbose |
a boolean that indicates whether information about the individual steps is printed
while fitting the GLM. Default: FALSE
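The mean-variance relation given for the overdispersion argument can be checked with a short base R simulation. This is an illustrative sketch and not part of glmGamPoi: it relies on rnbinom() with size = 1 / overdispersion drawing from the Gamma-Poisson distribution in the parameterization used here, and the names mu, theta, and y are chosen only for the example.

```r
set.seed(1)
mu <- 10        # mean of the counts
theta <- 0.5    # overdispersion (hypothetical value for the example)

# Gamma-Poisson draws: the 'size' parameter of rnbinom() is 1 / overdispersion
y <- rnbinom(n = 1e5, mu = mu, size = 1 / theta)

expected_var <- mu + mu^2 * theta   # variance = mean + mean^2 * overdispersion
observed_var <- var(y)              # empirical variance of the simulated counts
c(expected = expected_var, observed = observed_var)
```

With theta = 0 the relation reduces to the Poisson case, variance = mean.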
a list with four elements:
Beta
the coefficient matrix
overdispersion
the vector with the estimated overdispersions
Mu
a matrix with the corresponding means for each gene and sample
size_factors
a vector with the size factor for each sample
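To give a feel for what the size_factors element contains, the sketch below computes one simple kind of size factor in base R: the column sums of the count matrix, rescaled so that their geometric mean is 1. That this corresponds to the "normed_sum" option is an assumption based on its name, and normed_sum_sketch() is a hypothetical helper, not a glmGamPoi function.

```r
# Hypothetical helper: per-sample size factors from column sums,
# rescaled to a geometric mean of 1 (assumed meaning of "normed_sum").
normed_sum_sketch <- function(Y) {
  sf <- colSums(Y)
  sf / exp(mean(log(sf)))
}

# Toy count matrix with 3 genes (rows) and 3 samples (columns);
# samples 2 and 3 are sequenced 2x and 4x as deeply as sample 1.
Y <- matrix(c(1, 2, 3,
              2, 4, 6,
              4, 8, 12), nrow = 3)

sf <- normed_sum_sketch(Y)
sf   # one value per sample: 0.5, 1, 2 (up to floating-point rounding)
```

Scaling to a geometric mean of 1 keeps the size factors on a relative scale, so only differences between samples are corrected.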
glm_gp() and overdispersion_mle()
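As a rough picture of what estimating an overdispersion involves, the following base R sketch maximizes the Gamma-Poisson log-likelihood for a single gene whose mean is treated as known. It is not the algorithm used by overdispersion_mle(): it applies no Cox-Reid adjustment and no shrinkage, and fit_overdispersion_sketch() is a hypothetical name introduced for the example.

```r
# Hypothetical helper: plain maximum likelihood estimate of the
# overdispersion for one gene, given counts y and a known mean mu.
fit_overdispersion_sketch <- function(y, mu) {
  neg_log_lik <- function(log_theta) {
    -sum(dnbinom(y, mu = mu, size = 1 / exp(log_theta), log = TRUE))
  }
  # Search on the log scale so that the estimate stays positive
  exp(optimize(neg_log_lik, interval = c(-10, 5))$minimum)
}

set.seed(1)
# Simulated counts for one gene with mean 10 and true overdispersion 0.5
y <- rnbinom(n = 5000, mu = 10, size = 1 / 0.5)

theta_hat <- fit_overdispersion_sketch(y, mu = 10)
theta_hat   # should be close to the true value of 0.5
```

With only a handful of samples this estimate becomes noisy, which is the situation the overdispersion_shrinkage argument is meant to address.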