bootstrap-signal-by-position {BRGenomics} | R Documentation |
These functions bootstrap mean readcounts at different positions within
regions of interest (metaSubsample) or, in the more general case
(metaSubsampleMatrix), bootstrap the column means of a matrix by subsampling
its rows. Mean signal counts can be calculated at base-pair resolution, or
over larger bins.
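The idea of bootstrapping column means by subsampling rows can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's implementation: `bootstrap_col_means` is a hypothetical helper, rows are sampled without replacement, and the reported center is the mean of the bootstrapped means. BRGenomics may differ on each of these points.

```r
# Hypothetical helper (not part of BRGenomics): bootstrap the column means
# of a matrix by repeatedly subsampling its rows.
bootstrap_col_means <- function(mat, n.iter = 1000, prop.sample = 0.1,
                                lower = 0.125, upper = 0.875) {
  n.rows <- max(1L, round(prop.sample * nrow(mat)))
  # Result is a positions x iterations matrix of subsampled column means
  boot <- matrix(
    vapply(seq_len(n.iter), function(i) {
      idx <- sample.int(nrow(mat), n.rows)  # assumption: without replacement
      colMeans(mat[idx, , drop = FALSE])
    }, numeric(ncol(mat))),
    nrow = ncol(mat)
  )
  data.frame(
    x = seq_len(ncol(mat)),
    mean = rowMeans(boot),  # assumption: center is the mean of the means
    lower = apply(boot, 1, quantile, probs = lower, names = FALSE),
    upper = apply(boot, 1, quantile, probs = upper, names = FALSE)
  )
}

# Toy input: 200 "regions" (rows) by 10 positions (columns)
set.seed(1)
toy.mat <- matrix(rpois(200 * 10, lambda = 5), nrow = 200)
toy.res <- bootstrap_col_means(toy.mat, n.iter = 200)
head(toy.res)
```

Each row of the result describes one position: a central estimate plus the requested lower and upper quantiles of the subsampled means.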
metaSubsample(
  dataset.gr,
  regions.gr,
  binsize = 1,
  first.output.xval = 1,
  sample.name = deparse(substitute(dataset.gr)),
  n.iter = 1000,
  prop.sample = 0.1,
  lower = 0.125,
  upper = 0.875,
  field = "score",
  NF = NULL,
  remove.empty = FALSE,
  blacklist = NULL,
  zero_blacklisted = FALSE,
  expand_ranges = FALSE,
  ncores = getOption("mc.cores", 2L)
)

metaSubsampleMatrix(
  counts.mat,
  binsize = 1,
  first.output.xval = 1,
  sample.name = NULL,
  n.iter = 1000,
  prop.sample = 0.1,
  lower = 0.125,
  upper = 0.875,
  NF = 1,
  remove.empty = FALSE,
  ncores = getOption("mc.cores", 2L)
)
dataset.gr | A GRanges object in which signal is contained in metadata (typically in the |
regions.gr | A GRanges object containing intervals over which to metaplot. All ranges must have the same width. |
binsize | The size of bin (in basepairs, or number of columns for |
first.output.xval | The relative start position of the first bin, e.g. if |
sample.name | Defaults to the name of the input dataset. This is included in the output as a convenience, as it allows row-binding outputs from different samples. If |
n.iter | Number of random subsampling iterations to perform. Default is 1000. |
prop.sample | The proportion of the ranges in |
lower, upper | The lower and upper quantiles of subsampled signal means to return. The defaults, |
field | One or more metadata fields of |
NF | An optional normalization factor by which to multiply the counts. If given, |
remove.empty | A logical indicating whether regions ( |
blacklist | An optional GRanges object containing regions that should be excluded from signal counting. |
zero_blacklisted | When set to |
expand_ranges | Logical indicating if ranges in |
ncores | Number of cores to use for computations. |
counts.mat | A matrix over which to bootstrap column means by subsampling its rows. Typically, a matrix of readcounts with rows for genes and columns for positions within those genes. |
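To make the binsize and first.output.xval arguments more concrete, here is a rough base R sketch of binning the columns of a counts matrix. `bin_columns` is a hypothetical helper, not a BRGenomics function, and it rests on several assumptions: counts are summed within each bin, leftover columns that do not fill a bin are dropped, and each bin is labelled by the position of its first column. The package's own conventions may differ.

```r
# Hypothetical helper (not part of BRGenomics): aggregate adjacent columns
# of a counts matrix into bins of `binsize` columns.
bin_columns <- function(mat, binsize = 1, first.output.xval = 1) {
  n.bins <- ncol(mat) %/% binsize  # assumption: leftover columns are dropped
  keep <- seq_len(n.bins * binsize)
  bin.id <- rep(seq_len(n.bins), each = binsize)
  # rowsum() sums rows by group, so transpose to group the columns instead
  binned <- t(rowsum(t(mat[, keep, drop = FALSE]), group = bin.id))
  # assumption: each bin is labelled by the position of its first column
  x <- first.output.xval + binsize * (seq_len(n.bins) - 1)
  list(counts = binned, x = x)
}

# Toy input: 2 regions (rows) by 10 positions (columns)
toy <- matrix(1:20, nrow = 2)
binned <- bin_columns(toy, binsize = 5, first.output.xval = 0)
binned$x
binned$counts
```

With first.output.xval = 0, the first bin is labelled 0 and each later bin is offset by one binsize, which matches the description of that argument as the relative start position of the first bin.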
A dataframe containing x-values, means, lower quantiles, upper quantiles, and the sample name (as a convenience for row-binding several such dataframes). If a list of GRanges is given as input, or if multiple fields are given, a single combined dataframe is returned containing data for all fields/datasets.
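The row-binding convenience mentioned above can be illustrated with toy dataframes. The values below are made up for illustration, `make_toy_output` is a hypothetical stand-in for a real result, and the column name `sample.name` is an assumption; the other column names follow the example code on this page.

```r
# Hypothetical stand-in for one output dataframe; all values are toy data.
make_toy_output <- function(sample.name) {
  data.frame(
    x = 1:3,
    mean = c(2, 4, 3),
    lower = c(1, 3, 2),
    upper = c(3, 5, 4),
    sample.name = sample.name  # assumed column name
  )
}

# Because each dataframe carries its own sample label, results from
# different samples can be stacked and still told apart afterwards.
combined <- rbind(make_toy_output("A"), make_toy_output("B"))
split(combined$mean, combined$sample.name)
```

A long-format dataframe like `combined` is convenient for plotting several samples together, for example by grouping on the sample column.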
Mike DeBerardine
data("PROseq") # import included PROseq data
data("txs_dm6_chr4") # import included transcripts

# for each transcript, use promoter-proximal region from TSS to +100
pr <- promoters(txs_dm6_chr4, 0, 100)

#--------------------------------------------------#
# Bootstrap average signal in each 5 bp bin across all transcripts,
# and get confidence bands for middle 30% of bootstrapped means
#--------------------------------------------------#

set.seed(11)
df <- metaSubsample(PROseq, pr, binsize = 5,
                    lower = 0.35, upper = 0.65,
                    ncores = 1)
df[1:10, ]

#--------------------------------------------------#
# Plot bootstrapped means with confidence intervals
#--------------------------------------------------#

plot(mean ~ x, df, type = "l", main = "PROseq Signal",
     ylab = "Mean + 30% CI", xlab = "Distance from TSS")

polygon(c(df$x, rev(df$x)), c(df$lower, rev(df$upper)),
        col = adjustcolor("black", 0.1), border = FALSE)

#==================================================#
# Using a matrix as input
#==================================================#

# generate a matrix of counts in each region
countsmat <- getCountsByPositions(PROseq, pr)
dim(countsmat)

#--------------------------------------------------#
# bootstrap average signal in 10 bp bins across all transcripts
#--------------------------------------------------#

set.seed(11)
df <- metaSubsampleMatrix(countsmat, binsize = 10,
                          sample.name = "PROseq",
                          ncores = 1)
df[1:10, ]

#--------------------------------------------------#
# the same, using a normalization factor, and changing the x-values
#--------------------------------------------------#

set.seed(11)
df <- metaSubsampleMatrix(countsmat, binsize = 10,
                          first.output.xval = 0, NF = 0.75,
                          sample.name = "PROseq", ncores = 1)
df[1:10, ]