getPausingIndices {BRGenomics} | R Documentation |
Pausing index (PI) is calculated for each gene (within matched promoters.gr and genebodies.gr) as promoter-proximal (or pause region) signal counts divided by genebody signal counts. If length.normalize = TRUE (recommended), the signal counts within each range in promoters.gr and genebodies.gr are divided by their respective range widths (region lengths) before pausing indices are calculated.
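The arithmetic described above can be illustrated with plain numeric vectors. This is a minimal sketch in base R, not the package's implementation: the counts, widths, and variable names are all made up for illustration.

```r
# Hypothetical signal counts for three genes (illustrative values only)
promoter_counts <- c(50, 10, 0)
genebody_counts <- c(200, 400, 100)

# Hypothetical region widths in bp
promoter_widths <- c(100, 100, 100)
genebody_widths <- c(2000, 4000, 1000)

# Without length normalization: ratio of raw counts
pi_raw <- promoter_counts / genebody_counts

# With length normalization: ratio of signal densities (counts per bp)
pi_norm <- (promoter_counts / promoter_widths) /
    (genebody_counts / genebody_widths)

pi_raw   # 0.250 0.025 0.000
pi_norm  # 5 1 0
```

In this toy example the first gene has four times less signal in its promoter region than in its gene body, but because the promoter region is twenty times shorter, its length-normalized pausing index is 5.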
getPausingIndices(
  dataset.gr,
  promoters.gr,
  genebodies.gr,
  field = "score",
  length.normalize = TRUE,
  remove.empty = FALSE,
  blacklist = NULL,
  melt = FALSE,
  region_names = NULL,
  expand_ranges = FALSE,
  ncores = getOption("mc.cores", 2L)
)
dataset.gr |
A GRanges object in which signal is contained in metadata (typically in the "score" field), or a named list of such GRanges objects. |
promoters.gr |
A GRanges object containing promoter-proximal regions of interest. |
genebodies.gr |
A GRanges object containing genebody regions of interest. |
field |
The metadata field of |
length.normalize |
A logical indicating if signal counts within regions
of interest should be length normalized. The default is |
remove.empty |
A logical indicating if genes without any signal in
|
blacklist |
An optional GRanges object containing regions that should be
excluded from signal counting. If |
melt |
If |
region_names |
If |
expand_ranges |
Logical indicating if ranges in |
ncores |
Multiple cores will only be used if |
A vector parallel to the input genelist, unless remove.empty = TRUE, in which case the vector may be shorter. If dataset.gr is a list, or if length(field) > 1, a dataframe is returned, containing a column for each field. However, if melt = TRUE, dataframes contain one column to indicate regions (either by their indices, or by region_names, if given), another column to indicate signal, and a third column containing the sample name (unless dataset.gr is a single GRanges object).
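To make the difference between the two dataframe layouts concrete, the sketch below builds a small wide table and reshapes it into the long ("melted") layout by hand in base R. The values, sample names, region names, and the column names `region`, `signal`, and `sample` are assumptions chosen for illustration; the columns actually returned by getPausingIndices may be named differently.

```r
# Hypothetical wide result: one row per region, one column per sample
wide <- data.frame(sampleA = c(5.0, 1.0), sampleB = c(2.5, 0.5))
region_names <- c("geneX", "geneY")  # hypothetical region names

# Long ("melted") layout: one row per region-sample pair
long <- data.frame(
    region = rep(region_names, times = ncol(wide)),
    signal = unlist(wide, use.names = FALSE),  # columns stacked in order
    sample = rep(names(wide), each = nrow(wide))
)

long
```

The long layout has one row for every combination of region and sample, which tends to be more convenient for grouped plotting or summarizing than one column per sample.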
Mike DeBerardine
data("PROseq") # load included PROseq data
data("txs_dm6_chr4") # load included transcripts

#--------------------------------------------------#
# Get promoter-proximal and genebody regions
#--------------------------------------------------#

# genebodies from +300 to 300 bp before the poly-A site
gb <- genebodies(txs_dm6_chr4, 300, -300, min.window = 400)

# get the transcripts that are large enough (>1kb in size)
txs <- subset(txs_dm6_chr4, tx_name %in% gb$tx_name)

# for the same transcripts, promoter-proximal region from 0 to +100
pr <- promoters(txs, 0, 100)

#--------------------------------------------------#
# Calculate pausing indices
#--------------------------------------------------#

pidx <- getPausingIndices(PROseq, pr, gb)

length(txs)
length(pidx)
head(pidx)

#--------------------------------------------------#
# Without length normalization
#--------------------------------------------------#

head(
    getPausingIndices(PROseq, pr, gb, length.normalize = FALSE)
)

#--------------------------------------------------#
# Removing empty means the values no longer match the genelist
#--------------------------------------------------#

pidx_signal <- getPausingIndices(PROseq, pr, gb, remove.empty = TRUE)

length(pidx_signal)