map_mod_sites {MSnID} | R Documentation |
Given a modified peptide sequence of the form X.XXXX*XXXX.X and protein sequences from a FASTA file, the method maps the location of each modification onto the protein, producing a site identifier of the form {protein ID}-{aa}{aa position}.
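The mapping described above can be illustrated with a minimal base R sketch. It is not the package's implementation: `map_one_site` is a hypothetical helper, the sequences and accession are made up, and it assumes the modification symbol directly follows the modified residue and that only the first occurrence of the peptide in the protein matters.

```r
# Hypothetical helper, not part of MSnID: map a single modified residue.
map_one_site <- function(peptide_mod, protein_seq, accession, mod_char = "*") {
  # Drop the flanking residues: "K.AS*DFK.L" -> "AS*DFK"
  core <- sub("^[^.]*\\.(.*)\\.[^.]*$", "\\1", peptide_mod)
  # Position of the modification symbol; the modified residue precedes it
  # (assumption of this sketch)
  mod_pos <- as.integer(regexpr(mod_char, core, fixed = TRUE))
  aa_in_pep <- mod_pos - 1L
  # Remove the symbol to obtain the plain peptide sequence
  clean <- gsub(mod_char, "", core, fixed = TRUE)
  # Start of the first occurrence of the peptide within the protein
  pep_loc <- as.integer(regexpr(clean, protein_seq, fixed = TRUE))
  if (pep_loc < 0L || aa_in_pep < 1L) return(NA_character_)
  # Convert the within-peptide position to a within-protein position
  site_loc <- pep_loc + aa_in_pep - 1L
  aa <- substr(protein_seq, site_loc, site_loc)
  paste0(accession, "-", aa, site_loc)
}

# Made-up protein and accession, for illustration only
map_one_site("K.AS*DFK.L", "MKTKASDFKLQ", "P00001")
# "P00001-S6"
```

The sketch returns NA when the peptide or the symbol is not found, which is a choice made here for simplicity rather than documented behaviour of map_mod_sites.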
map_mod_sites(object, fasta, accession_col = "accession", peptide_mod_col = "peptide_mod", mod_char = "*", site_delimiter = "lower")
object |
An instance of class MSnID. |
fasta |
(AAStringSet object) Protein sequences read from a FASTA file. Names must match the protein/accession IDs in the accession column of the MSnID object. |
accession_col |
(string) Name of the column with accession/protein IDs in the MSnID object. Default is "accession". |
peptide_mod_col |
(string) Name of the column with modified peptide sequences in the MSnID object. Default is "peptide_mod". |
mod_char |
(string) Character that annotates the position of the modification. Default is "*". |
site_delimiter |
(string) Either a single character or "lower" (default), meaning the delimiter is the same amino acid symbol, but in lower case. |
MSnID object with extra columns regarding the modification mapping.
Most likely, what you need is SiteID.
PepLoc |
(list of ints) position of the starting amino acid within the protein sequence. It is a list because there may be multiple occurrences of the same sequence matching the peptide's sequence. |
PepLocFirst |
(int) position of the first occurrence of the matching sequence |
ProtLength |
(int) protein length |
ModShift |
(vector of ints) positions of modified amino acids within peptide |
ModAAs |
(vector of characters) single-letter amino acid codes of the modified residues |
SiteLoc |
(list of vectors of ints) positions of the modified amino acids within the protein for each occurrence of the peptide |
Site |
(list of vectors of characters) modified sites encoded as the amino acid symbol followed by its position, for each occurrence of the peptide |
SiteCollapsed |
(list of characters)
same as |
SiteCollapsedFirst |
(character)
first element of the |
SiteID |
(character)
accession ID concatenated with |
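How the location columns relate to one another can be sketched in base R for a peptide that carries two modifications and occurs twice in a protein. This is an illustration under stated assumptions, not the package's code: the sequences are made up, the variable names only loosely mirror the column names, and the 1-based within-peptide positions used here may not match the indexing convention of ModShift.

```r
# Made-up sequences, for illustration only
protein  <- "MASDTKGASDTKQ"
peptide  <- "AS*DT*K"   # flanking residues already removed
mod_char <- "*"

# Positions of the symbols in the annotated peptide: 3 and 6
sym_pos <- as.integer(gregexpr(mod_char, peptide, fixed = TRUE)[[1]])
# Each earlier symbol shifts later residues by one, so subtract the
# symbol's index to get 1-based residue positions in the plain peptide
mod_pos <- sym_pos - seq_along(sym_pos)

# Plain peptide and the modified residues (compare ModAAs)
clean   <- gsub(mod_char, "", peptide, fixed = TRUE)
mod_aas <- substring(clean, mod_pos, mod_pos)

# Start of every occurrence in the protein (compare PepLoc)
pep_loc <- as.integer(gregexpr(clean, protein, fixed = TRUE)[[1]])

# Per occurrence: protein positions of the modified residues (compare
# SiteLoc) and residue-plus-position labels (compare Site)
site_loc <- lapply(pep_loc, function(start) start + mod_pos - 1L)
site     <- lapply(site_loc, function(loc) paste0(mod_aas, loc))

pep_loc      # 2 8
site[[1]]    # "S3" "T5"
site[[2]]    # "S9" "T11"
```

Note that gregexpr reports non-overlapping matches only, so this sketch would miss overlapping occurrences of a peptide.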
Vladislav A Petyuk vladislav.petyuk@pnnl.gov
m <- MSnID(".")
mzids <- system.file("extdata","phospho.mzid.gz",package="MSnID")
m <- read_mzIDs(m, mzids)

# to know the present mod masses
report_mods(m)

# TMT modification
m <- add_mod_symbol(m, mod_mass="229.1629", symbol="#")
# alkylation
m <- add_mod_symbol(m, mod_mass="57.021463735", symbol="^")
# phosphorylation
m <- add_mod_symbol(m, mod_mass="79.966330925", symbol="*")

# show the mapping
head(unique(subset(psms(m), select=c("modification", "peptide_mod"))))

# read fasta for mapping modifications
fst_path <- system.file("extdata","for_phospho.fasta.gz",package="MSnID")
library(Biostrings)
fst <- readAAStringSet(fst_path)
# to ensure names are the same as in accessions(m)
names(fst) <- sub("(^[^ ]*) .*$", "\\1", names(fst))

# mapping phosphosites
m <- map_mod_sites(m, fst, "accession", "peptide_mod", "*", "lower")

head(unique(subset(psms(m), select=c("accession", "peptide_mod", "SiteID"))))

# clean-up cache
unlink(".Rcache", recursive=TRUE)