dotproduct {MsCoreUtils}    R Documentation

Calculate the normalized dot product

Description

Calculate the normalized dot product (NDP). dotproduct returns a numeric value between 0 and 1, where 0 indicates no similarity between the two MS/MS features and 1 indicates that the MS/MS features are identical.

Usage

dotproduct(x, y, m = 0.5, n = 0)

Arguments

x

matrix with two columns, one containing the m/z values (column "mz") and the other the corresponding intensity values (column "intensity")

y

matrix with two columns, one containing the m/z values (column "mz") and the other the corresponding intensity values (column "intensity")

m

numeric(1), exponent for peak intensity-based weights

n

numeric(1), exponent for m/z-based weights

Details

Each row in x corresponds to the respective row in y, i.e. the peaks (entries "mz") per spectrum have to match.
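As an illustration of this row-wise correspondence, the following sketch builds two small aligned matrices and inspects how closely their m/z values agree. It follows the layout used in the Examples section, where a peak without a counterpart appears with an NA m/z value and an intensity of 0. The names xa, ya and max_diff are made up for this sketch and are not part of MsCoreUtils.

```r
# Two peak matrices, aligned row by row (illustrative values only).
# Row 2 has no counterpart in xa, so its m/z is NA and its intensity is 0.
xa <- cbind(mz = c(100.001, NA, 300.01), intensity = c(2, 0, 1.2))
ya <- cbind(mz = c(100.0, 200.0, 300.002), intensity = c(2, 3, 1))

# Both matrices must have the same number of rows.
stopifnot(nrow(xa) == nrow(ya))

# Largest m/z difference among rows present in both matrices
# (hypothetical check, not performed like this by the package).
max_diff <- max(abs(xa[, "mz"] - ya[, "mz"]), na.rm = TRUE)
max_diff
```

A check of this kind can be useful before setting n != 0, because differences in m/z then influence the result.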

m and n are exponents that weight the peak intensities and the m/z values, respectively. With the default m = 0.5, the square root of each intensity value is used to calculate the weights. With increasing values for m, high intensity values become more important for the similarity calculation, i.e. the differences between intensities are amplified. With increasing values for n, high m/z values are taken more into account.

Especially when working with small molecules, a value n > 0 can be set to put weight on the m/z values: shared fragments with higher m/z are less likely and therefore suggest that the molecules may be more similar. If n != 0, a warning is raised if the corresponding m/z values are not identical, since small differences in m/z values distort the similarity values increasingly as n grows.

If m = 0 or n = 0, intensity values or m/z values, respectively, are not taken into account.
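The effect of the two exponents can be illustrated with a minimal sketch of the weight W = [peak intensity]^m * [m/z]^n. The helper peak_weight is hypothetical and not part of MsCoreUtils, and the intensities are chosen so that the maximum is already 100.

```r
# Hypothetical helper, for illustration only: W = intensity^m * mz^n
peak_weight <- function(mz, intensity, m = 0.5, n = 0) {
    intensity^m * mz^n
}

mz <- c(100, 200, 400)
intensity <- c(1, 25, 100)

# Default (m = 0.5, n = 0): square root of the intensity, m/z ignored
peak_weight(mz, intensity)                # 1 5 10

# m = 1: intensities enter unchanged, so large peaks dominate more
peak_weight(mz, intensity, m = 1)         # 1 25 100

# m = 0, n = 1: intensities are ignored, only m/z contributes
peak_weight(mz, intensity, m = 0, n = 1)  # 100 200 400
```

Going from m = 0.5 to m = 1 raises the ratio between the largest and the smallest weight from 10 to 100, which is the sense in which intensity differences are amplified.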

The normalized dot product is calculated according to:

NDP = (∑ W_{S1,i} * W_{S2,i})^2 / (∑(W_{S1,i}^2) * ∑(W_{S2,i}^2))

with W = [peak intensity]^m * [m/z]^n, where S1 and S2 denote the two spectra and the sums run over the peaks i. Prior to calculating W_{S1} or W_{S2}, all intensity values are divided by the maximum intensity value and multiplied by 100. For further information on the normalized dot product see, for example, Li et al. (2015).
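The calculation described above can be sketched in plain R. This is an illustrative re-implementation under stated assumptions, not the package source: the function name ndp_sketch is made up, the numerator is read as the square of the summed products, and rows that yield NA weights are assumed to be dropped with na.rm = TRUE.

```r
# Hypothetical sketch of the normalized dot product; not the MsCoreUtils code.
ndp_sketch <- function(x, y, m = 0.5, n = 0) {
    weight <- function(mat) {
        # Scale intensities so that the most intense peak equals 100
        inten <- mat[, "intensity"] / max(mat[, "intensity"], na.rm = TRUE) * 100
        # W = [peak intensity]^m * [m/z]^n
        inten^m * mat[, "mz"]^n
    }
    wx <- weight(x)
    wy <- weight(y)
    # Assumption: NA weights are ignored in all three sums
    sum(wx * wy, na.rm = TRUE)^2 /
        (sum(wx^2, na.rm = TRUE) * sum(wy^2, na.rm = TRUE))
}

a <- cbind(mz = c(100, 200), intensity = c(100, 100))
b <- cbind(mz = c(100, 200), intensity = c(100, 0))
d <- cbind(mz = c(100, 200), intensity = c(0, 100))

ndp_sketch(a, a)  # 1: identical spectra
ndp_sketch(a, b)  # 0.5: one of two equally intense peaks is shared
ndp_sketch(b, d)  # 0: no shared peak
```

For a and b with m = 0.5, the weights are (10, 10) and (10, 0), so the numerator is 100^2 = 10000 and the denominator is 200 * 100 = 20000, giving 0.5.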

Value

numeric(1), a similarity coefficient between 0 and 1.

Author(s)

Thomas Naake, thomasnaake@googlemail.com

References

Li et al. (2015): Navigating natural variation in herbivory-induced secondary metabolism in coyote tobacco populations using MS/MS structural analysis. PNAS, E4147–E4155, doi: 10.1073/pnas.1503106112.

Examples

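library(MsCoreUtils)

## Two aligned peak matrices: first column m/z, second column intensity.
## An NA m/z value with intensity 0 marks a peak absent from that spectrum.
x <- matrix(c(100.001, 100.002, NA, 300.01, 300.02, NA,
              2, 1.5, 0, 1.2, 0.9, 0), ncol = 2)
y <- matrix(c(100.0, NA, 200.0, 300.002, 300.025, 300.0255,
              2, 0, 3, 1, 4, 0.4), ncol = 2)
colnames(x) <- colnames(y) <- c("mz", "intensity")

## Default weighting: square root of the intensities, m/z values ignored
dotproduct(x, y, m = 0.5, n = 0)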

[Package MsCoreUtils version 1.0.0 Index]