msaConservationScore {msa} | R Documentation |
This method computes a vector of conservation scores from a multiple alignment or a previously computed consensus matrix.
## S4 method for signature 'matrix'
msaConservationScore(x, substitutionMatrix, gapVsGap=NULL, ...)

## S4 method for signature 'MultipleAlignment'
msaConservationScore(x, ...)
x |
an object of class |
substitutionMatrix |
substitution matrix (see details below). |
gapVsGap |
score to use for aligning gaps versus gaps (see details below). |
... |
when the method is called for a
|
The method takes a MultipleAlignment object or a previously computed consensus matrix and computes the sum of pairwise scores for each position (column) of the alignment. Computing these scores requires a substitution/scoring matrix, which must be provided as a matrix object. This can either be one of the ready-made matrices provided by the Biostrings package (e.g. BLOSUM62) or any other hand-crafted matrix. In the latter case, the following restrictions apply:
The matrix must be square.
For reasonable results, the matrix should be symmetric (note that this is not checked).
Rows and columns must be named and the order of letters/symbols in row names and column names must be identical.
All letters/symbols occurring in the multiple alignment, including gaps ‘-’, must also be found in the row/column names of the substitution matrix. For consistency with the matrices from the Biostrings package, the row/column corresponding to gap penalties may also be labeled ‘*’ instead of ‘-’.
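The restrictions above can be checked before calling the method. The following is a minimal sketch in base R; the helper `checkSubstitutionMatrix` and the toy matrix `toyMat` are hypothetical names introduced for illustration and are not part of the msa package.

```r
## Hypothetical helper (not part of msa): checks a hand-crafted
## substitution matrix against the restrictions listed above.
checkSubstitutionMatrix <- function(mat, alphabet) {
    stopifnot(is.matrix(mat), is.numeric(mat))
    ## the matrix must be square
    if (nrow(mat) != ncol(mat))
        stop("matrix is not square")
    ## rows and columns must be named, in identical order
    if (is.null(rownames(mat)) || is.null(colnames(mat)) ||
        !identical(rownames(mat), colnames(mat)))
        stop("row and column names are missing or differ")
    ## symmetry is only recommended, so warn instead of failing
    if (!isSymmetric(unname(mat)))
        warning("matrix is not symmetric")
    ## the gap row/column may be labeled "*" instead of "-"
    symbols <- sub("*", "-", rownames(mat), fixed=TRUE)
    absent <- setdiff(alphabet, symbols)
    if (length(absent) > 0)
        stop("symbols missing from matrix: ",
             paste(absent, collapse=", "))
    invisible(TRUE)
}

## toy DNA matrix (assumed values): match 4, mismatch -1, gap penalty -8
nuc <- c("A", "C", "G", "T")
toyMat <- matrix(-1, 4, 4, dimnames=list(nuc, nuc))
diag(toyMat) <- 4
toyMat <- cbind(rbind(toyMat, "-"=-8), "-"=-8)

checkSubstitutionMatrix(toyMat, c(nuc, "-"))
```

Symmetry only triggers a warning here because the text above describes it as a recommendation rather than a requirement.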
So, nucleotide substitution matrices created by nucleotideSubstitutionMatrix can be used for multiple alignments of nucleotide sequences, but must be completed with gap penalty rows and columns (see example below).
If the consensus matrix of a multiple alignment of nucleotide sequences contains rows labeled ‘+’ and/or ‘.’, these rows are ignored.
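To see which rows of a consensus matrix actually enter the computation, the ‘+’ and ‘.’ rows can be dropped by hand. This is only a sketch of equivalent manual filtering on a made-up matrix (`toyConMat` is a hypothetical name); it does not show the package's internal code.

```r
## made-up consensus matrix: 4 symbol rows, 2 alignment columns
toyConMat <- matrix(c(3, 0, 0, 1,
                      0, 2, 1, 1),
                    nrow=4,
                    dimnames=list(c("A", "-", "+", "."), NULL))

## keep only the rows that carry letters or gaps
keep <- !(rownames(toyConMat) %in% c("+", "."))
filtered <- toyConMat[keep, , drop=FALSE]
filtered
```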
The parameter gapVsGap can be used to control how pairs of gaps are scored. If gapVsGap=NULL (default), the corresponding diagonal entry of the substitution matrix is used as is. In the BLOSUM matrices, this is usually a positive value, which may not make sense under all circumstances. Therefore, the parameter gapVsGap can be set to an alternative value, e.g. 0 for ignoring gap-gap pairs.
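The sum of pairwise scores and the effect of gapVsGap can be illustrated on a single alignment column. The sketch below assumes a symmetric substitution matrix and works on raw symbol counts; `columnPairScore` and `toyMat` are hypothetical names, and any scaling or normalization the package applies internally may differ from this.

```r
## Hypothetical helper (not part of msa): sum-of-pairs score of a single
## alignment column, given the named symbol counts of that column.
columnPairScore <- function(counts, mat, gapVsGap=NULL) {
    syms <- names(counts)
    S <- mat[syms, syms, drop=FALSE]
    ## optionally override the gap-vs-gap diagonal entry
    if (!is.null(gapVsGap) && "-" %in% syms)
        S["-", "-"] <- gapVsGap
    ## score over all ordered pairs, including each sequence with itself
    ordered <- as.numeric(t(counts) %*% S %*% counts)
    ## remove self-pairs, then halve to count each unordered pair once
    (ordered - sum(counts * diag(S))) / 2
}

## toy DNA matrix (assumed values): match 4, mismatch -1, gap penalty -8
nuc <- c("A", "C", "G", "T")
toyMat <- matrix(-1, 4, 4, dimnames=list(nuc, nuc))
diag(toyMat) <- 4
toyMat <- cbind(rbind(toyMat, "-"=-8), "-"=-8)

## a column of five sequences: two A, one C, two gaps
column <- c(A=2, C=1, "-"=2)
columnPairScore(column, toyMat)               # -54
columnPairScore(column, toyMat, gapVsGap=0)   # -46
```

The two results differ by 8 because the column contains exactly one gap-gap pair, which is scored with the diagonal entry -8 in the first call and with 0 in the second.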
In any case, the method returns a vector of scores with one entry per column of the alignment/consensus matrix. The names of the vector entries are the corresponding positions of the consensus sequence of the alignment. How this consensus sequence is computed can be controlled with additional arguments that are passed on to the msaConsensusSequence method.
The function returns a vector as long as the alignment/consensus matrix has columns. The vector is named with the consensus sequence (see details above).
Ulrich Bodenhofer <msa@bioinf.jku.at>
http://www.bioinf.jku.at/software/msa
U. Bodenhofer, E. Bonatesta, C. Horejs-Kainrath, and S. Hochreiter (2015). msa: an R package for multiple sequence alignment. Bioinformatics 31(24):3997-3999. DOI: 10.1093/bioinformatics/btv494.
msa, MsaAAMultipleAlignment, MsaDNAMultipleAlignment, MsaRNAMultipleAlignment, MsaMetaData, MultipleAlignment, msaConsensusSequence
## read sequences
filepath <- system.file("examples", "HemoglobinAA.fasta", package="msa")
mySeqs <- readAAStringSet(filepath)

## perform multiple alignment
myAlignment <- msa(mySeqs)

## compute conservation scores using the BLOSUM62 matrix
data(BLOSUM62)
msaConservationScore(myAlignment, BLOSUM62)

## compute conservation scores using the BLOSUM62 matrix
## without scoring gap-gap pairs and using a different consensus sequence
msaConservationScore(myAlignment, BLOSUM62, gapVsGap=0, type="upperlower")

## compute a consensus matrix first
conMat <- consensusMatrix(myAlignment)
data(PAM250)
msaConservationScore(conMat, PAM250, gapVsGap=0)

## DNA example
filepath <- system.file("examples", "exampleDNA.fasta", package="msa")
mySeqs <- readDNAStringSet(filepath)

## perform multiple alignment
myAlignment <- msa(mySeqs)

## create substitution matrix with gap penalty -8
mat <- nucleotideSubstitutionMatrix(4, -1)
mat <- cbind(rbind(mat, "-"=-8), "-"=-8)

## compute conservation scores using this matrix
msaConservationScore(myAlignment, mat, gapVsGap=0)