DigestDNA {DECIPHER} | R Documentation
Restriction enzymes can be used to cut double-stranded DNA into fragments at specific cut sites. DigestDNA
performs an in-silico restriction digest of the input DNA sequence(s) given one or more restriction sites.
DigestDNA(sites, myDNAStringSet, type = "fragments", strand = "both")
sites | A character vector of DNA recognition sequences and their enzymes' corresponding cut site(s). |
myDNAStringSet | A DNAStringSet object containing the DNA sequence(s) to digest. |
type | Character string indicating the type of results desired. This should be (an abbreviation of) either "fragments" or "positions". |
strand | Character string indicating the strand(s) to cut. This should be (an abbreviation of) one of "both", "top", or "bottom". |
In the context of a restriction digest experiment with a known DNA sequence, it can be useful to predict the expected DNA fragments in-silico. Restriction enzymes make cuts in double-stranded DNA at specific positions near their recognition site. The recognition site may be somewhat ambiguous, as represented by the IUPAC_CODE_MAP. Cuts that occur at different positions on the top and bottom strands result in sticky-ends, whereas those that occur at the same position result in fragments with blunt-ends. Multiple restriction sites can be supplied to simultaneously digest the DNA. In this case, sites for the different restriction enzymes may be overlapping, which could result in multiple close-proximity cuts that would not occur experimentally. Also, note that cut sites will not be matched to non-DNA_BASES in myDNAStringSet.
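To make the idea of an ambiguous recognition site concrete, the following base-R sketch expands IUPAC codes into a regular expression and reports where a site occurs. It is only an illustration and not how DigestDNA is implemented: the names iupac, site_to_regex, and find_sites are hypothetical, and the small lookup table stands in for the IUPAC_CODE_MAP mentioned above.

```r
# Hypothetical lookup table: each IUPAC code mapped to a regex character class.
iupac <- c(A = "A", C = "C", G = "G", T = "T",
           M = "[AC]", R = "[AG]", W = "[AT]", S = "[CG]",
           Y = "[CT]", K = "[GT]", V = "[ACG]", H = "[ACT]",
           D = "[AGT]", B = "[CGT]", N = "[ACGT]")

# Translate a recognition sequence (IUPAC letters only) into a regex.
site_to_regex <- function(site) {
  paste(iupac[strsplit(site, "")[[1]]], collapse = "")
}

# Return the start position of every occurrence of the site in one sequence.
# The lookahead lets overlapping occurrences be reported as well.
find_sites <- function(site, dna) {
  pattern <- paste0("(?=", site_to_regex(site), ")")
  hits <- gregexpr(pattern, dna, perl = TRUE)[[1]]
  as.integer(hits[hits > 0])
}

# "GGNCC" matches "GGTCC" starting at position 3. The literal "GGNCC" at the
# end of the sequence is not matched, because the character classes contain
# only A, C, G, and T, so an ambiguous base in the sequence never matches.
find_sites("GGNCC", "AAGGTCCAAGGNCC")
```

In this sketch, ambiguity is interpreted only in the recognition sequence, which mirrors the note that cut sites are not matched to non-DNA_BASES in the input.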
DigestDNA can return two types of results: cut positions or the resulting DNA fragments corresponding to the top, bottom, or both strands. If type is "positions" then the output is a list with the cut location(s) in each sequence in myDNAStringSet. The cut location is defined as the position after the cut relative to the 5'-end. For example, a cut at 6 would occur between positions 5 and 6, where the respective strand's 5' nucleotide is defined as position 1.
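The position convention can be sketched in base R by splitting a single strand at a set of cut positions. This is a minimal illustration under the definition given above (each cut position is the first nucleotide after the cut); split_at_cuts is a hypothetical helper and not part of DECIPHER.

```r
# Hypothetical helper: split one strand (a character string) at cut positions.
# A cut at position p falls between nucleotides p - 1 and p, counting from 1
# at the 5' end of the strand.
split_at_cuts <- function(dna, cuts) {
  n <- nchar(dna)
  # Keep only cuts that fall inside the sequence, sorted and without repeats.
  cuts <- sort(unique(cuts[cuts > 1 & cuts <= n]))
  starts <- c(1L, cuts)      # every fragment starts at 1 or at a cut position
  ends <- c(cuts - 1L, n)    # and ends just before the next cut or at the 3' end
  substring(dna, starts, ends)
}

# A cut at 6 separates positions 1-5 from positions 6-10.
split_at_cuts("AAGGATCCAA", 6)

# With no cuts, the original sequence is returned unchanged.
split_at_cuts("AAGGATCCAA", integer(0))
```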
If type is "fragments" (the default), then the result is a DNAStringSetList. Each element of the list contains the top and/or bottom strand fragments after digestion of myDNAStringSet, or the original sequence if no cuts were made. Sequences are named by whether they originated from the top or bottom strand, and list elements are named based on the input DNA sequences. The top strand is defined by myDNAStringSet as it is input, whereas the bottom strand is its reverse complement.
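The relationship between the two strands can be shown with a small base-R sketch. The helper reverse_complement is hypothetical and handles only the unambiguous bases A, C, G, and T; with Biostrings loaded, reverseComplement serves the same purpose for DNAStringSet objects.

```r
# Hypothetical helper: reverse complement of one DNA string (A, C, G, T only).
reverse_complement <- function(dna) {
  complemented <- chartr("ACGT", "TGCA", dna)               # swap each base for its partner
  paste(rev(strsplit(complemented, "")[[1]]), collapse = "") # read it in the opposite direction
}

top <- "AAGGATCCAA"
bottom <- reverse_complement(top)
bottom

# Taking the reverse complement twice gives back the top strand.
identical(reverse_complement(bottom), top)
```

Because both strands are written 5' to 3', a palindromic recognition sequence such as GGATCC reads the same on the top and bottom strands.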
Erik Wright eswright@pitt.edu
DesignSignatures, RESTRICTION_ENZYMES
# digest hypothetical DNA sequences with BamHI
data(RESTRICTION_ENZYMES)
site <- RESTRICTION_ENZYMES[c("BamHI")]
dna <- DNAStringSet(c("AAGGATCCAA", "GGGATCAT"))
dna # top strand
reverseComplement(dna) # bottom strand
names(dna) <- c("hyp1", "hyp2")
d <- DigestDNA(site, dna)
d # fragments in a DNAStringSetList
unlist(d) # all fragments as one DNAStringSet

# Restriction digest of Yeast Chr. 1 with EcoRI and EcoRV
data(yeastSEQCHR1)
sites <- RESTRICTION_ENZYMES[c("EcoRI", "EcoRV")]
seqs <- DigestDNA(sites, yeastSEQCHR1)
seqs[[1]]
pos <- DigestDNA(sites, yeastSEQCHR1, type="positions")
str(pos)