qSig {signatureSearch}    R Documentation

Helper Function to Construct a qSig Object

Description

Builds a 'qSig' object that stores the query signature, the reference database and the gene expression signature search (GESS) method to be used in a search.

Usage

qSig(query, gess_method, refdb)

Arguments

query

If 'gess_method' is 'CMAP' or 'LINCS', the query should be a list with two character vectors named upset and downset, containing the labels of the up- and down-regulated genes, respectively. The labels should be Entrez gene IDs if the reference database is a pre-built CMAP or LINCS database. If a custom database is used, the labels need to be of the same type as those in the reference database.
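
As a minimal sketch of the list format described above, the following base R code builds an upset/downset query from a named score vector. The gene IDs and scores are made-up placeholders, and the set size of two genes is an arbitrary choice for illustration only.

```r
# Hypothetical scores (e.g. LFCs) named by placeholder Entrez gene IDs
scores <- c("1001" = 2.5, "1002" = -1.8, "1003" = 0.1,
            "1004" = 3.2, "1005" = -2.9, "1006" = -0.2)

# Sort gene labels from most up-regulated to most down-regulated
sorted_ids <- names(sort(scores, decreasing = TRUE))

# Arbitrary set size for this sketch; real queries typically use more genes
n_genes <- 2

# List with two character vectors named upset and downset
query <- list(upset   = head(sorted_ids, n_genes),
              downset = tail(sorted_ids, n_genes))
str(query)
```

A list of this shape is what would be passed as the query argument together with gess_method="CMAP" or gess_method="LINCS".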

If 'gess_method' is 'gCMAP', the query is a matrix with a single column representing gene ranks from a biological state of interest. The corresponding gene labels are stored in the row name slot of the matrix. Instead of ranks one can provide scores (e.g. z-scores). In such a case, the scores will be internally transformed to ranks.
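
The single-column matrix described above can be sketched in base R as follows. The gene IDs and z-scores are invented, and the rank transform shown is only one plausible convention (rank 1 for the highest score); the transformation applied internally by the package may differ.

```r
# Hypothetical z-scores named by placeholder Entrez gene IDs
z <- c("1001" = 2.5, "1002" = -1.8, "1003" = 0.1, "1004" = 3.2)

# Single-column matrix with the gene labels in the row names
query_mat <- matrix(z, ncol = 1, dimnames = list(names(z), "query"))

# Assumed rank convention for illustration: rank 1 = highest score
rank_mat <- matrix(rank(-z), ncol = 1, dimnames = dimnames(query_mat))
rank_mat
```

Either the score matrix or the rank matrix has the shape expected for a query with gess_method="gCMAP".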

If 'gess_method' is 'Fisher', the query is expected to be a list with two character vectors named upset and downset for up- and down-regulated gene labels, respectively (the same format as for the 'CMAP' or 'LINCS' methods). Internally, the up and down gene labels are combined into a single gene set when querying the reference database with Fisher's exact test. This means the query is performed with an unsigned set. The query can also be a matrix with a single numeric column and the gene labels (e.g. Entrez gene IDs) in the row name slot. The values in this matrix can be z-scores or LFCs. In this case, the actual query gene set is obtained according to upper and lower cutoffs set by the user.
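
To illustrate how cutoffs turn a score matrix into an unsigned gene set, here is a rough base R sketch. The variable names upper_cutoff and lower_cutoff, the cutoff values and the use of inclusive comparisons are all assumptions made for this example; they are not taken from the package's implementation.

```r
# Hypothetical z-scores named by placeholder Entrez gene IDs
z <- c("1001" = 2.5, "1002" = -1.8, "1003" = 0.1,
       "1004" = 3.2, "1005" = -2.9)
query_mat <- matrix(z, ncol = 1, dimnames = list(names(z), "query"))

# Hypothetical cutoffs chosen for this sketch
upper_cutoff <- 2
lower_cutoff <- -2

# Genes at or above the upper cutoff, and at or below the lower cutoff
up   <- rownames(query_mat)[query_mat[, 1] >= upper_cutoff]
down <- rownames(query_mat)[query_mat[, 1] <= lower_cutoff]

# Combining both directions gives the unsigned query gene set
gene_set <- union(up, down)
gene_set
```

The same union step applies when the query is supplied as an upset/downset list: the direction of regulation is discarded before the test.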

If 'gess_method' is 'Cor', the query is a matrix with a single numeric column and the gene labels in the row name slot. The numeric column can contain z-scores, LFCs, (normalized) gene expression intensity values or read counts.

gess_method

one of 'CMAP', 'LINCS', 'gCMAP', 'Fisher' or 'Cor'

refdb

character(1), can be one of "cmap", "cmap_expr", "lincs", or "lincs_expr" when using the CMAP/LINCS databases from the affiliated signatureSearchData package. With 'cmap' the database contains signatures of LFC scores obtained from DEG analysis routines; with 'cmap_expr' normalized gene expression values; with 'lincs' z-scores obtained from the DEG analysis methods of the LINCS project; and with 'lincs_expr' normalized expression values.

To use a custom signature database, refdb should be the file path to the HDF5 file generated with the build_custom_db function. Alternatively, a suitable version of the CMAP/LINCS databases can be used. For details on this, please consult the vignette of the signatureSearchData package.

Value

qSig object

See Also

build_custom_db, signatureSearchData

Examples

db_path <- system.file("extdata", "sample_db.h5", 
                       package = "signatureSearch")
## Load sample_db as `SummarizedExperiment` object
library(SummarizedExperiment); library(HDF5Array)
sample_db <- SummarizedExperiment(HDF5Array(db_path, name="assay"))
rownames(sample_db) <- HDF5Array(db_path, name="rownames")
colnames(sample_db) <- HDF5Array(db_path, name="colnames")
## Get the "vorinostat__SKB__trt_cp" signature from the sample database
query_mat <- as.matrix(assay(sample_db[,"vorinostat__SKB__trt_cp"]))
query <- as.numeric(query_mat); names(query) <- rownames(query_mat)
upset <- head(names(query[order(-query)]), 150)
downset <- tail(names(query[order(-query)]), 150)
qsig_lincs <- qSig(query=list(upset=upset, downset=downset), 
                   gess_method="LINCS", refdb=db_path)
qsig_gcmap <- qSig(query=query_mat, gess_method="gCMAP", refdb=db_path)

[Package signatureSearch version 1.0.4 Index]