XNAMatchPDict {XNAString}    R Documentation

Find a set of patterns in a reference sequence

Description

Finds all occurrences of a given set of patterns (typically short) in a reference sequence (typically long).

Usage

XNAMatchPDict(
  pdict,
  subject,
  max.mismatch = 0,
  min.mismatch = 0,
  with.indels = FALSE,
  fixed = TRUE,
  algorithm = "auto",
  verbose = FALSE
)

## S4 method for signature 'XNAString,character'
XNAMatchPDict(
  pdict,
  subject,
  max.mismatch = 0,
  min.mismatch = 0,
  with.indels = FALSE,
  fixed = TRUE,
  algorithm = "auto",
  verbose = FALSE
)

## S4 method for signature 'XNAString,XString'
XNAMatchPDict(
  pdict,
  subject,
  max.mismatch = 0,
  min.mismatch = 0,
  with.indels = FALSE,
  fixed = TRUE,
  algorithm = "auto",
  verbose = FALSE
)

Arguments

pdict

An XNAString object. Its target slot is used as the pattern dictionary (the pdict object in Biostrings terms).

subject

A character string or XString object containing the reference sequence.

max.mismatch

The maximum number of mismatching letters allowed. If non-zero, an algorithm that supports inexact matching is used.

min.mismatch

The minimum number of mismatching letters allowed. If non-zero, an algorithm that supports inexact matching is used.

with.indels

If TRUE then indels are allowed. In that case, min.mismatch must be 0 and max.mismatch is interpreted as the maximum "edit distance" allowed between the pattern and a match. Note that in order to avoid pollution by redundant matches, only the "best local matches" are returned. Roughly speaking, a "best local match" is a match that is locally both the closest (to the pattern P) and the shortest.

fixed

If TRUE (the default), an IUPAC ambiguity code in the pattern can only match the same code in the subject, and vice versa. If FALSE, an IUPAC ambiguity code in the pattern can match any letter in the subject that is associated with the code, and vice versa.

algorithm

One of the following: "auto", "naive-exact", "naive-inexact", "boyer-moore", "shift-or" or "indels".

verbose

TRUE or FALSE.
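
The effect of max.mismatch and fixed can be sketched with Biostrings alone. This is an illustration, not the XNAString implementation: it assumes that XNAMatchPDict passes these arguments on to the Biostrings matching routines, as the argument descriptions above suggest. The sequences are made up for the example.

```r
library(Biostrings)

# Made-up reference sequence for illustration
subject <- DNAString("ACGTACCTAGGT")

# max.mismatch: exact matching versus allowing one mismatching letter
n_exact   <- countPattern("ACGT", subject, max.mismatch = 0)
n_inexact <- countPattern("ACGT", subject, max.mismatch = 1)

# fixed: literal versus IUPAC-aware handling of the ambiguity code N
amb_subject <- DNAString("ACGTACNT")
n_literal <- countPattern("ACNT", amb_subject, fixed = TRUE)
n_iupac   <- countPattern("ACNT", amb_subject, fixed = FALSE)

print(c(exact = n_exact, inexact = n_inexact,
        literal = n_literal, iupac = n_iupac))
```

Raising max.mismatch can only add matches, and fixed = FALSE lets N in the pattern stand for any base in the subject.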

Value

An MIndex object of length M, where M is the number of patterns in pdict. (The related Biostrings function countPDict returns an integer vector of length M instead.)
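
As a rough illustration of what an MIndex holds, the sketch below calls Biostrings::matchPDict directly on a plain DNAStringSet. It assumes the object returned by XNAMatchPDict can be inspected the same way; the patterns and subject are invented for the example.

```r
library(Biostrings)

# Two made-up patterns of equal width and a short reference sequence
patterns <- DNAStringSet(c("ACGT", "GGAA"))
subject  <- DNAString("ACGTGGAACGT")

# One MIndex element per pattern
hits <- matchPDict(patterns, subject)
length(hits)       # number of patterns, M
startIndex(hits)   # list of match start positions, one element per pattern
endIndex(hits)     # list of match end positions, one element per pattern
hits[[1]]          # ranges matched by the first pattern

# countPDict returns only the number of matches per pattern
counts <- countPDict(patterns, subject)
print(counts)
```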

Examples

 
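# Build an XNAString whose target slot holds the patterns to search for
s2 <- XNAString::XNAString(
  base = "GCGGAGAGAGCACAGATACA",
  sugar = "FODDDDDDDDDDDDDDDDDD",
  target = Biostrings::DNAStringSet(c(
    "GGCGGAGAGAGCACAGATACA", "GGCGGAGAGAGCACAGATACA"
  ))
)

# Search the reference sequence (a character string) for the target patterns
o <- XNAString::XNAMatchPDict(
  s2,
  "GGCGGAGAGAGCACAGATACAGGGGCGGAGAGAGCACAGATACACGGAGAGAGCACAGATACA"
)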
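
The Usage section also lists a method for an XString subject. The sketch below repeats the search with the subject given first as a character string and then as a Biostrings DNAString. It assumes, per the Value section, that the result is an MIndex that the Biostrings accessor startIndex can read; the variable names are illustrative.

```r
# Same XNAString object as in the example above
s2 <- XNAString::XNAString(
  base = "GCGGAGAGAGCACAGATACA",
  sugar = "FODDDDDDDDDDDDDDDDDD",
  target = Biostrings::DNAStringSet(c(
    "GGCGGAGAGAGCACAGATACA", "GGCGGAGAGAGCACAGATACA"
  ))
)
ref <- "GGCGGAGAGAGCACAGATACAGGGGCGGAGAGAGCACAGATACACGGAGAGAGCACAGATACA"

# Subject as a character string (signature 'XNAString,character')
o_chr <- XNAString::XNAMatchPDict(s2, ref)

# Subject as an XString (signature 'XNAString,XString')
o_dna <- XNAString::XNAMatchPDict(s2, Biostrings::DNAString(ref))

# Match start positions per pattern (assumes an MIndex result)
Biostrings::startIndex(o_chr)
Biostrings::startIndex(o_dna)
```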

[Package XNAString version 1.0.2 Index]