odseq {odseq} | R Documentation |
This function first computes a distance metric between every pair of sequences in the multiple alignment. It then bootstraps an average score of these distances to estimate the distribution of scores, which is used to flag outlier sequences at a given threshold.
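The idea described above can be illustrated with a rough base-R sketch. This is not the package's implementation: `flag_outliers` is a hypothetical helper, and it assumes that each sequence's score is its mean distance to the other sequences, that the distance matrix has a zero diagonal, and that the cutoff is the upper quantile of the bootstrapped mean scores. The internals of odseq may differ.

```r
# Hypothetical helper (not part of odseq): flag rows of a pairwise distance
# matrix whose mean distance lies in the right tail of a bootstrap distribution.
flag_outliers <- function(dist_mat, B = 100, threshold = 0.025) {
  n <- nrow(dist_mat)
  # Score of each sequence: mean distance to the other n - 1 sequences
  # (assumes the diagonal of dist_mat is zero).
  scores <- rowSums(dist_mat) / (n - 1)
  # Bootstrap the average score: resample the scores with replacement B times.
  boot_means <- replicate(B, mean(sample(scores, size = n, replace = TRUE)))
  # Cutoff that leaves `threshold` probability to its right.
  cutoff <- quantile(boot_means, probs = 1 - threshold, names = FALSE)
  scores > cutoff
}

# Toy data: nine mutually close sequences and one distant sequence.
d <- matrix(1, nrow = 10, ncol = 10)
d[10, ] <- 10
d[, 10] <- 10
diag(d) <- 0

set.seed(1)
flags <- flag_outliers(d, B = 200, threshold = 0.025)
which(flags)  # indices of the flagged sequences
```

In this toy matrix only the last row is far from the rest, so it is the one expected to exceed the cutoff. A smaller threshold pushes the cutoff further into the right tail and flags fewer sequences.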
odseq(msa_object, distance_metric = "linear", B = 100, threshold = 0.025)
msa_object |
An object of formal class |
distance_metric |
A string indicating the type of distance metric to be computed. Either |
B |
Integer indicating the number of bootstrap replicates to run. More replicates should make the detection more robust. |
threshold |
Float indicating the probability mass left in the right tail of the bootstrap score distribution when computing outliers. This parameter may need tuning for each specific problem. |
Returns a logical vector, where TRUE indicates an outlier.
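A logical vector of this kind can be used directly for subsetting. The sketch below uses made-up stand-ins (`seqs_toy` and `is_outlier` are hypothetical, not objects from the package) and assumes the vector is in the same order as the input sequences.

```r
# Toy stand-ins for illustration only; in practice `is_outlier` would be
# the vector returned by odseq().
seqs_toy <- c(seqA = "MKTAYIAK", seqB = "MKTAYIAR", seqC = "GGGGPPPP")
is_outlier <- c(FALSE, FALSE, TRUE)

# Names of the sequences flagged as outliers.
outlier_names <- names(seqs_toy)[is_outlier]
# Sequences retained after dropping the flagged ones.
kept <- seqs_toy[!is_outlier]
```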
José Jiménez <jose@jimenezluna.com>
[1] OD-seq: outlier detection in multiple sequence alignments. Peter Jehl, Fabian Sievers and Desmond G. Higgins. BMC Bioinformatics. 2015.
library(msa)
data(seqs)
al <- msa(seqs)
odseq(al, distance_metric = "affine", B = 1000, threshold = 0.025)