msaConservationScore {msa}R Documentation

Computation of Conservation Scores from Multiple Alignment

Description

This method computes a vector of conservation scores from a multiple alignment or a previously computed consensus matrix.

Usage

## S4 method for signature 'matrix'
msaConservationScore(x, substitutionMatrix, gapVsGap=NULL,
    ...)
## S4 method for signature 'MultipleAlignment'
msaConservationScore(x, ...)

Arguments

x

an object of class MultipleAlignment (which includes objects of classes MsaAAMultipleAlignment, MsaDNAMultipleAlignment, and MsaRNAMultipleAlignment) or a previously computed consensus matrix (see details below).

substitutionMatrix

substitution matrix (see details below).

gapVsGap

score to use for aligning gaps versus gaps (see details below).

...

when the method is called for a MultipleAlignment object, the consensus matrix is computed and, including all further arguments, passed on to the msaConservationScore method with signature matrix. This method passes all further arguments on to the msaConsensusSequence method to customize the way the consensus sequence is computed.

Details

The method takes a MultipleAlignment object or a previously computed consensus matrix and computes the sum of pairwise scores for all positions of the alignment. For computing these scores, it is compulsory to specify a substitution/scoring matrix. This matrix must be provided as a matrix object. This can either be one of the ready-made matrices provided by the Biostrings package (e.g. BLOSUM62) or any other hand-crafted matrix. In the latter case, the following restrictions apply:

So, nucleotide substitution matrices created by nucleotideSubstitutionMatrix can be used for multiple alignments of nucleotide sequences, but must be completed with gap penalty rows and columns (see example below).

If the consensus matrix of a multiple alignment of nucleotide sequences contains rows labeled ‘+’ and/or ‘.’, these rows are ignored.

The parameter gapVsGap can be used to control how pairs of gaps are scored. If gapVsGap=NULL (default), the corresponding diagonal entry of the substitution matrix is used as is. In the BLOSUM matrices, this is usually a positive value, which may not make sense under all circumstances. Therefore, the parameter gapVsGap can be set to an alternative value, e.g. 0 for ignoring gap-gap pairs.

The method, in any case, returns a vector of scores that is as long as the alignment/consensus matrix has columns. The names of the vector entries are the corresponding positions of the consensus sequence of the alignment. How this consensus sequence is computed, can be controlled with additional arguments that are passed on to the msaConsensusSequence method.

Value

The function returns a vector as long as the alignment/consensus matrix has columns. The vector is named with the consensus sequence (see details above).

Author(s)

Ulrich Bodenhofer <msa@bioinf.jku.at>

References

http://www.bioinf.jku.at/software/msa

U. Bodenhofer, E. Bonatesta, C. Horejs-Kainrath, and S. Hochreiter (2015). msa: an R package for multiple sequence alignment. Bioinformatics 31(24):3997-3999. DOI: 10.1093/bioinformatics/btv494.

See Also

msa, MsaAAMultipleAlignment, MsaDNAMultipleAlignment, MsaRNAMultipleAlignment, MsaMetaData, MultipleAlignment, msaConsensusSequence

Examples

## read sequences
filepath <- system.file("examples", "HemoglobinAA.fasta", package="msa")
mySeqs <- readAAStringSet(filepath)

## perform multiple alignment
myAlignment <- msa(mySeqs)

## compute consensus scores using the BLOSUM62 matrix
data(BLOSUM62)
msaConservationScore(myAlignment, BLOSUM62)

## compute consensus scores using the BLOSUM62 matrix
## without scoring gap-gap pairs and using a different consensus sequence
msaConservationScore(myAlignment, BLOSUM62, gapVsGap=0,
                     type="upperlower")

## compute a consensus matrix first
conMat <- consensusMatrix(myAlignment)
data(PAM250)
msaConservationScore(conMat, PAM250, gapVsGap=0)


## DNA example
filepath <- system.file("examples", "exampleDNA.fasta", package="msa")
mySeqs <- readDNAStringSet(filepath)

## perform multiple alignment
myAlignment <- msa(mySeqs)

## create substitution matrix with gap penalty -8
mat <- nucleotideSubstitutionMatrix(4, -1)
mat <- cbind(rbind(mat, "-"=-8), "-"=-8)

## compute consensus scores using this matrix
msaConservationScore(myAlignment, mat, gapVsGap=0)

[Package msa version 1.22.0 Index]