BSFDataSet {BindingSiteFinder}    R Documentation

BSFDataSet object and constructors

Description

A BSFDataSet stores the input ranges as a GenomicRanges (GRanges) object, together with the iCLIP signal in a list structure and additional meta data as a data.frame.

Usage

BSFDataSet(ranges, meta, signal, forceEqualNames = TRUE)

BSFDataSetFromBigWig(ranges, meta, silent = FALSE)

Arguments

ranges

a GRanges object holding the ranges to process. The strand of every range must be either + or -.

meta

a data.frame with at least two columns. The first column should be a unique numeric id. The second column holds sample type information, such as the condition.

signal

a list with the two entries 'signalPlus' and 'signalMinus'. Each entry holds one SimpleRleList of crosslink counts per replicate (see Details for more information).

forceEqualNames

whether to enforce consistent chromosome names (TRUE/FALSE). If TRUE, every chromosome name present in the GRanges must also be present in the signal list and vice versa. Chromosome names present in only the signal list or only the ranges are removed.

silent

whether to suppress the loading message (TRUE/FALSE)
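The effect described for forceEqualNames can be illustrated with plain set operations on chromosome names. This is only a sketch of the documented behaviour, not the package's implementation; the chromosome names and the variable names are made up for illustration.

```r
# Hypothetical chromosome names; neither vector comes from the package data.
rangeChr  <- c("chr1", "chr2", "chr3")  # chromosomes present in the ranges
signalChr <- c("chr1", "chr2", "chrM")  # chromosomes present in the signal

# Chromosomes found in both inputs are kept.
sharedChr <- intersect(rangeChr, signalChr)

# Chromosomes found in only one of the two inputs would be removed.
droppedChr <- setdiff(union(rangeChr, signalChr), sharedChr)

print(sharedChr)
print(droppedChr)
```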

Details

The ranges are required to have a "+" or "-" strand annotation; "*" is not allowed. They are expected to all have the same width, and a warning is thrown otherwise.
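The two requirements on the ranges can be checked before constructing the object. The following is a minimal sketch, assuming the GenomicRanges package is installed; the coordinates are invented and the checks are an illustration, not the validation code used by the package.

```r
library(GenomicRanges)

# Toy ranges with explicit strands and a common width of 5 nt (made-up values).
rng <- GRanges(
  seqnames = c("chr1", "chr1", "chr2"),
  ranges = IRanges(start = c(10, 50, 20), width = 5),
  strand = c("+", "-", "+")
)

# Every range must sit on "+" or "-"; "*" would fail this check.
strandOk <- all(as.character(strand(rng)) %in% c("+", "-"))

# All ranges are expected to share one width.
widthOk <- length(unique(width(rng))) == 1

stopifnot(strandOk, widthOk)
```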

The meta information is stored as a data.frame with at least two required columns, 'id' and 'condition'. They are used to build the unique identifier of each replicate by joining both values with '_' (e.g. id = 1 and condition = WT results in 1_WT).
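The naming scheme can be reproduced in base R. This is a small sketch with a hypothetical meta table; only the column names 'id' and 'condition' and the '_' separator are taken from the description above.

```r
# Hypothetical meta table with two conditions and two replicates each.
meta <- data.frame(
  id = 1:4,
  condition = c("WT", "WT", "KO", "KO")
)

# Join id and condition with "_" to mirror the documented identifier scheme.
replicateNames <- paste(meta$id, meta$condition, sep = "_")
print(replicateNames)
```

These identifiers are the names expected on the per-replicate entries of the signal list.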

If BSFDataSetFromBigWig is called, the meta data must contain the additional columns 'clPlus' and 'clMinus'. They provide the locations of the iCLIP coverage files to the import function. On object initialization these files are loaded and represented internally in the signal slot of the object.
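A meta table for BSFDataSetFromBigWig could look like the sketch below. The file paths are placeholders that do not point to real files, and the column check is an illustration rather than part of the package; the constructor call is left as a comment because it needs real BigWig files and a GRanges object.

```r
# Placeholder paths; replace them with the real per-strand coverage files.
meta <- data.frame(
  id = 1:2,
  condition = c("WT", "WT"),
  clPlus = c("path/to/rep1_plus.bw", "path/to/rep2_plus.bw"),
  clMinus = c("path/to/rep1_minus.bw", "path/to/rep2_minus.bw"),
  stringsAsFactors = FALSE
)

# Check that all four documented columns are present.
requiredCols <- c("id", "condition", "clPlus", "clMinus")
missingCols <- setdiff(requiredCols, colnames(meta))
stopifnot(length(missingCols) == 0)

# With real files and a GRanges object 'rng', the documented call would be:
# bds <- BSFDataSetFromBigWig(ranges = rng, meta = meta, silent = TRUE)
```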

The iCLIP signal is stored in a special list structure. At the lowest level, crosslink counts per nucleotide are stored as one Rle per chromosome, and the chromosomes are combined into a SimpleRleList. One such list exists for each replicate and must be named by the replicate identifier (e.g. 1_WT). This list therefore always contains exactly as many entries as there are replicates in the dataset. Since the strands are initially handled separately from each other, this list must be given twice, once for each strand. The strand-specific entries must be named 'signalPlus' and 'signalMinus'.
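The nesting described above can be built by hand. This is a minimal sketch, assuming the IRanges package is installed (it attaches S4Vectors, which provides Rle); the counts, chromosome names and the helper makeReplicate are hypothetical and serve only to show the shape of the structure.

```r
library(IRanges)

# Hypothetical helper: one Rle of counts per chromosome, combined into a
# SimpleRleList for a single replicate.
makeReplicate <- function(chr1Counts, chr2Counts) {
  RleList(chr1 = Rle(chr1Counts), chr2 = Rle(chr2Counts), compress = FALSE)
}

# One entry per replicate, named by the replicate identifier.
signalPlus <- list(
  "1_WT" = makeReplicate(c(0L, 2L, 2L, 0L, 1L), c(0L, 0L, 3L)),
  "2_WT" = makeReplicate(c(1L, 0L, 0L, 0L, 4L), c(2L, 0L, 0L))
)
signalMinus <- list(
  "1_WT" = makeReplicate(c(0L, 0L, 1L, 1L, 0L), c(0L, 5L, 0L)),
  "2_WT" = makeReplicate(c(0L, 3L, 0L, 0L, 0L), c(0L, 0L, 1L))
)

# The same per-replicate list is given once for each strand.
signal <- list(signalPlus = signalPlus, signalMinus = signalMinus)
```

Both strand entries carry the same replicate names, which keeps the number of entries equal to the number of replicates on either strand.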

Value

A BSFDataSet object.

Examples


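# load the example BSFDataSet object ('bds') shipped with the package
files <- system.file("extdata", package = "BindingSiteFinder")
load(list.files(files, pattern = "\\.rda$", full.names = TRUE))
# extract the components of the existing object
rng <- getRanges(bds)
sgn <- getSignal(bds)
mta <- getMeta(bds)
# build a new object from ranges, signal and meta data
bdsNew <- BSFDataSet(ranges = rng, signal = sgn, meta = mta)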


[Package BindingSiteFinder version 1.0.0 Index]