scPCA {scPCA} | R Documentation |
Given target and background data frames or matrices,
scPCA
will perform the sparse contrastive principal component
analysis (scPCA) of the target data for a given number of eigenvectors, a
vector of real-valued contrast parameters, and a vector of sparsity inducing
penalty terms.
If instead you wish to perform contrastive principal component analysis
(cPCA), set the penalties
argument to 0
. So long as the
n_centers
parameter is larger than one, the automated hyperparameter
tuning heuristic described in Boileau et al. (2020-03) is
used. Otherwise, the semi-automated approach of
Abid et al. (2018) is used to select the
appropriate hyperparameter.
scPCA( target, background, center = TRUE, scale = FALSE, n_eigen = 2, cv = NULL, alg = c("iterative", "var_proj", "rand_var_proj"), contrasts = exp(seq(log(0.1), log(1000), length.out = 40)), penalties = seq(0.05, 1, length.out = 20), clust_method = c("kmeans", "pam", "hclust"), n_centers = NULL, max_iter = 10, linkage_method = "complete", n_medoids = 8, parallel = FALSE, clusters = NULL, eigdecomp_tol = 1e-10, eigdecomp_iter = 1000 )
target |
The target (experimental) data set, in a standard format such
as a |
background |
The background data set, in a standard format such as a
|
center |
A |
scale |
A |
n_eigen |
A |
cv |
A |
alg |
A |
contrasts |
A |
penalties |
A |
clust_method |
A |
n_centers |
A |
max_iter |
A |
linkage_method |
A |
n_medoids |
A |
parallel |
A |
clusters |
A |
eigdecomp_tol |
A |
eigdecomp_iter |
A |
A list containing the following components:
rotation
: The matrix of variable loadings if n_centers
is larger than one. Otherwise, a list of rotation matrices is returned,
one for each medoid. The number of medoids is specified by
n_medoids
.
x
: The rotated data, centred and scaled if requested,
multiplied by the rotation matrix if n_centers
is larger than
one. Otherwise, a list of rotated data matrices is returned, one for
each medoid. The number of medoids is specified by n_medoids
.
contrast: The optimal contrastive parameter.
penalty: The optimal L1 penalty term.
center: A logical indicating whether the target dataset was centered.
scale: A logical indicating whether the target dataset was scaled.
Abid A, Zhang MJ, Bagaria VK, Zou J (2018).
“Exploring patterns enriched in a dataset with contrastive principal component analysis.”
Nature communications, 9.
Boileau P, Hejazi NS, Dudoit S (2020-03).
“Exploring High-Dimensional Biological Data with Sparse Contrastive Principal Component Analysis.”
Bioinformatics.
doi: 10.1093/bioinformatics/btaa176, btaa176, https://academic.oup.com/bioinformatics/article-pdf/doi/10.1093/bioinformatics/btaa176/32914142/btaa176.pdf.
Erichson NB, Zeng P, Manohar K, Brunton SL, Kutz JN, Aravkin AY (2018).
“Sparse Principal Component Analysis via Variable Projection.”
ArXiv, abs/1804.00341.
Zou H, Hastie T, Tibshirani R (2006).
“Sparse principal component analysis.”
Journal of computational and graphical statistics, 15, 265–286.
# perform cPCA on the simulated data set scPCA( target = toy_df[, 1:30], background = background_df, contrasts = exp(seq(log(0.1), log(100), length.out = 5)), penalties = 0, n_centers = 4 ) # perform scPCA on the simulated data set scPCA( target = toy_df[, 1:30], background = background_df, contrasts = exp(seq(log(0.1), log(100), length.out = 5)), penalties = seq(0.1, 1, length.out = 3), n_centers = 4 ) # perform cPCA on the simulated data set with known clusters scPCA( target = toy_df[, 1:30], background = background_df, contrasts = exp(seq(log(0.1), log(100), length.out = 5)), penalties = 0, clusters = toy_df[, 31] ) # cPCA as implemented in Abid et al. scPCA( target = toy_df[, 1:30], background = background_df, contrasts = exp(seq(log(0.1), log(100), length.out = 10)), penalties = 0, n_centers = 1 )