motifScanHits {seqPattern} | R Documentation |
Finds positions of sequence motif hits above a specified threshold in a list of
sequences of the same length, ordered by a provided index. The motif is specified
by a position weight matrix (PWM) that contains the estimated probability of base b
at position i and is usually constructed via a call to the PWM
function.
The position of each motif hit is given as a pair of coordinates in a
two-dimensional matrix: the first coordinate is the ordinal number of the sequence
and the second coordinate is the position within that sequence where the motif occurs.
motifScanHits(regionsSeq, motifPWM, minScore = "80%", seqOrder = c(1:length(regionsSeq)))
regionsSeq |
A |
motifPWM |
A numeric matrix representing the Position Weight Matrix (PWM), such as
returned by |
minScore |
The minimum score for counting a motif hit. Can be given as a character
string containing a percentage (e.g. |
seqOrder |
Integer vector specifying the order of the provided input sequences.
Must have the same length as the number of sequences in the
|
This function uses the matchPWM function to find matches to the given motif in
a set of input sequences. Only matches scoring above the specified minScore are
considered hits. Input sequences must all be of the same length and are ordered
according to the index provided in the seqOrder argument, creating an n * m
matrix, where n is the number of sequences and m is the length of the sequences.
Positions of motif hits in the resulting matrix are returned as
two-dimensional coordinates.
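The scan-and-index logic described above can be illustrated with a minimal base R sketch. It is not the seqPattern implementation and does not call matchPWM. The helper name scanHitsSketch, the toy PWM, and the toy sequences are all hypothetical. The sketch assumes that a percentage threshold means a fraction of the maximum score attainable with the PWM, that row k of the ordered set corresponds to regionsSeq[seqOrder[k]], and that the sequences contain only A, C, G and T.

```r
# Hypothetical helper for illustration only; NOT the seqPattern implementation.
# seqs:        character vector of equal-length DNA sequences (A/C/G/T only)
# pwm:         numeric matrix with rows named A, C, G, T; one column per motif position
# minFraction: assumed meaning of a percentage threshold (fraction of the maximum score)
scanHitsSketch <- function(seqs, pwm, minFraction = 0.8,
                           seqOrder = seq_along(seqs)) {
    w <- ncol(pwm)
    stopifnot(length(unique(nchar(seqs))) == 1,   # all sequences have the same length
              length(seqOrder) == length(seqs),   # one index per sequence
              nchar(seqs[1]) >= w)                # motif must fit in a sequence
    ordered <- seqs[seqOrder]                     # row k = seqs[seqOrder[k]]
    threshold <- minFraction * sum(apply(pwm, 2, max))
    hits <- lapply(seq_along(ordered), function(k) {
        # Translate bases to PWM row indices, then score every window of width w
        bases <- match(strsplit(ordered[k], "")[[1]], rownames(pwm))
        starts <- seq_len(length(bases) - w + 1L)
        scores <- vapply(starts, function(s) {
            sum(pwm[cbind(bases[s:(s + w - 1L)], seq_len(w))])
        }, numeric(1))
        pos <- starts[scores >= threshold]
        data.frame(sequence = rep(k, length(pos)), position = pos)
    })
    do.call(rbind, hits)
}

# Toy PWM favouring the motif "TAT" (made-up probabilities)
pwm <- matrix(c(0.1, 0.1,  0.1,  0.7,
                0.8, 0.05, 0.05, 0.1,
                0.1, 0.1,  0.1,  0.7),
              nrow = 4, dimnames = list(c("A", "C", "G", "T"), NULL))
seqs <- c(a = "CCCTATCC", b = "TATCCCCC", c = "CCCCCCCC")
res <- scanHitsSketch(seqs, pwm, minFraction = 0.8, seqOrder = c(2, 1, 3))
print(res)
```

With this toy input, sequence b becomes row 1 of the ordered set and sequence a becomes row 2, so the sequence column refers to positions in the ordered list rather than in the original input.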
The function returns a data.frame with the positions of motif hits in the set
of input sequences. The input sequences, all of the same length, are sorted
according to the index in the seqOrder argument, and the positions of motif hits
in the resulting n * m matrix (where n is the number of sequences and m is the
length of each sequence) are reported. The sequence column of the data.frame
gives the ordinal number of the sequence in the ordered list of sequences, and
the position column gives the start position of the motif hit within
that sequence.
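As a rough illustration of how the returned coordinates relate to the n * m matrix, the following base R sketch turns a data.frame with sequence and position columns into a logical hit matrix. The hit values and the dimensions n and m are made up for the example and do not come from the package data.

```r
# Made-up hits in the documented format: ordinal sequence number and start position
hits <- data.frame(sequence = c(1L, 2L, 2L),
                   position = c(1L, 4L, 6L))
n <- 3L   # assumed number of sequences
m <- 8L   # assumed sequence length

# Mark each (sequence, position) coordinate in an n x m logical matrix
hitMatrix <- matrix(FALSE, nrow = n, ncol = m)
hitMatrix[cbind(hits$sequence, hits$position)] <- TRUE

# Number of hits starting at each position across all sequences
hitsPerPosition <- colSums(hitMatrix)
print(hitsPerPosition)
```

Indexing with a two-column matrix sets one cell per hit, which mirrors the two-dimensional coordinates that the function reports.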
Vanja Haberle
plotMotifDensityMap
getPatternOccurrenceList
library(GenomicRanges)

load(system.file("data", "zebrafishPromoters.RData", package="seqPattern"))
promoterWidth <- elementMetadata(zebrafishPromoters)$interquantileWidth

load(system.file("data", "TBPpwm.RData", package="seqPattern"))

motifOccurrence <- motifScanHits(regionsSeq = zebrafishPromoters,
                                 motifPWM = TBPpwm, minScore = "85%",
                                 seqOrder = order(promoterWidth))
head(motifOccurrence)