PCAtools: Everything Principal Components Analysis



Documentation for package ‘PCAtools’ version 2.2.0

Help Pages

biplot Draw a bi-plot, comparing 2 selected principal components / eigenvectors.
chooseGavishDonoho Choosing PCs with the Gavish-Donoho method
chooseMarchenkoPastur Choosing PCs with the Marchenko-Pastur limit
eigencorplot Correlate principal components with continuous metadata variables and test the significance of these correlations.
findElbowPoint Find the elbow point in the curve of variance explained by each successive PC. This can be used to determine the number of PCs to retain.
getComponents Return the principal component labels for an object of class 'pca'.
getLoadings Return component loadings for principal components from an object of class 'pca'.
getVars Return the explained variation for each principal component for an object of class 'pca'.
pairsplot Draw multiple bi-plots.
parallelPCA Perform Horn's parallel analysis to choose the number of principal components to retain.
pca Principal Component Analysis (PCA) is a powerful technique with wide applicability in data science, bioinformatics, and further afield. It was initially developed to analyse large volumes of data in order to tease out the differences and relationships between the logical entities being analysed. It extracts the fundamental structure of the data without the need to build any model to represent it. This 'summary' of the data is arrived at through a process of reduction that transforms the large number of variables into a smaller number of uncorrelated variables (i.e. the ‘principal components’), while remaining easy to interpret in terms of the original data. PCAtools provides functions for data exploration via PCA, and allows the user to generate publication-ready figures. PCA is performed via BiocSingular. Users can also identify the optimal number of principal components to retain via different metrics, such as the elbow method and Horn's parallel analysis, which is relevant for data reduction in single-cell RNA-seq (scRNA-seq) and high-dimensional mass cytometry data.
plotloadings Plot the component loadings for selected principal components / eigenvectors and label variables driving variation along these.
screeplot Draw a scree plot, showing the distribution of explained variance across all or selected principal components / eigenvectors.
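
To illustrate the elbow-point idea described under findElbowPoint, the following is a minimal, language-agnostic sketch written in Python rather than R. It is not the PCAtools implementation. It assumes one common definition of the elbow: the point on the decreasing variance curve that lies furthest below the straight line joining the first and last points. The function name find_elbow_point and the example variance values are hypothetical and used only for illustration.

```python
import math


def find_elbow_point(variance):
    """Return the 1-based index of the elbow in a decreasing variance curve.

    Assumed definition: the elbow is the point furthest below the chord
    drawn from the first to the last point of the curve.
    """
    n = len(variance)
    if any(a < b for a, b in zip(variance, variance[1:])):
        raise ValueError("variance must be sorted in decreasing order")
    if n < 3:
        # Too few points to define an elbow; retain everything.
        return n

    # Direction of the chord from (0, variance[0]) to (n - 1, variance[-1]).
    dx = n - 1
    dy = variance[-1] - variance[0]
    norm = math.hypot(dx, dy)

    best_index, best_dist = n, 0.0
    for i, v in enumerate(variance):
        # The cross product is negative when the point lies below the chord,
        # so negating it gives a positive perpendicular distance.
        cross = dx * (v - variance[0]) - dy * i
        dist = -cross / norm
        if dist > best_dist:
            best_index, best_dist = i + 1, dist
    return best_index


# Hypothetical percentages of variance explained by eight successive PCs.
explained = [50, 25, 10, 5, 4, 3, 2, 1]
print(find_elbow_point(explained))  # 3
```

When no point lies below the chord (for example, a perfectly linear decline), this sketch returns the total number of components, so that nothing is discarded.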