To validate our retention time (RT) prediction, we compare the measured retention times with the hydrophobicity values predicted by the ssrc
method (Krokhin et al., 2004), as implemented in the
protViz package (Panse and Grossmann, 2019).
The following code snippet performs this comparison on the F255744 data set, which contains the amino acid sequences of the designed flycodes.
library(NestLink)
# load(url("http://fgcz-ms.uzh.ch/~cpanse/p1875/F255744.RData"))
# F255744 <- as.data.frame.mascot(F255744)
# now available through ExperimentHub
library(ExperimentHub)
eh <- ExperimentHub()
load(query(eh, c("NestLink", "F255744.RData"))[[1]])
## see ?NestLink and browseVignettes('NestLink') for documentation
## loading from cache
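# compare ssrc-predicted hydrophobicity with measured retention time;
# the call returns one linear fit per entry in `scores`
.ssrc.mascot(F255744, scores = c(10, 20, 40, 50),
             pch = 16,
             col = rgb(0.1, 0.1, 0.1, alpha = 0.1))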
## [[1]]
##
## Call:
## lm(formula = xx.ssrc ~ xx$RTINSECONDS)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -38.954  -2.248   0.015   2.228  71.167
##
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)
## (Intercept)    -5.580e+00  2.030e-01  -27.48   <2e-16 ***
## xx$RTINSECONDS  8.849e-03  7.434e-05  119.04   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.884 on 12295 degrees of freedom
## Multiple R-squared: 0.5354, Adjusted R-squared: 0.5354
## F-statistic: 1.417e+04 on 1 and 12295 DF, p-value: < 2.2e-16
##
##
## [[2]]
##
## Call:
## lm(formula = xx.ssrc ~ xx$RTINSECONDS)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -37.387  -2.040  -0.042   1.930  46.035
##
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)
## (Intercept)    -6.976e+00  1.621e-01  -43.03   <2e-16 ***
## xx$RTINSECONDS  9.447e-03  6.018e-05  156.99   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.12 on 9835 degrees of freedom
## Multiple R-squared: 0.7148, Adjusted R-squared: 0.7147
## F-statistic: 2.464e+04 on 1 and 9835 DF, p-value: < 2.2e-16
##
##
## [[3]]
##
## Call:
## lm(formula = xx.ssrc ~ xx$RTINSECONDS)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -15.260  -1.963  -0.114   1.735  45.342
##
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)
## (Intercept)    -7.690e+00  1.784e-01  -43.11   <2e-16 ***
## xx$RTINSECONDS  9.781e-03  6.724e-05  145.46   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.506 on 5574 degrees of freedom
## Multiple R-squared: 0.7915, Adjusted R-squared: 0.7915
## F-statistic: 2.116e+04 on 1 and 5574 DF, p-value: < 2.2e-16
##
##
## [[4]]
##
## Call:
## lm(formula = xx.ssrc ~ xx$RTINSECONDS)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
##  -9.570  -2.019  -0.142   1.754  45.200
##
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)
## (Intercept)    -7.827e+00  2.173e-01  -36.02   <2e-16 ***
## xx$RTINSECONDS  9.848e-03  8.271e-05  119.06   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.579 on 3650 degrees of freedom
## Multiple R-squared: 0.7952, Adjusted R-squared: 0.7952
## F-statistic: 1.418e+04 on 1 and 3650 DF, p-value: < 2.2e-16
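The four fits above appear to correspond, in order, to the entries of `scores`: the degrees of freedom shrink from 12295 to 3650 while the multiple R-squared rises from 0.5354 to 0.7952. The sketch below illustrates that pattern on simulated data using base R only. It is not the implementation of `.ssrc.mascot`. The data frame `toy`, its `score` and `ssrc` columns, the helper `fit_by_cutoff`, and the `score >= cutoff` filter rule are all assumptions made for illustration; only the `RTINSECONDS` name is taken from the output above.

```r
# Synthetic stand-in for the Mascot result table; all values are simulated.
set.seed(1)
n <- 500
toy <- data.frame(
  RTINSECONDS = runif(n, 600, 4800),  # retention time in seconds
  score       = runif(n, 0, 80)       # hypothetical identification score
)

# Simulated hydrophobicity: linear in RT, with less noise at higher scores.
toy$ssrc <- -6 + 0.009 * toy$RTINSECONDS + rnorm(n, sd = 8 - toy$score / 12)

# Fit one linear model per cutoff, keeping rows with score >= cutoff
# (assumed filter rule, not taken from the package source).
fit_by_cutoff <- function(df, cutoffs) {
  fits <- lapply(cutoffs, function(cutoff) {
    lm(ssrc ~ RTINSECONDS, data = df[df$score >= cutoff, ])
  })
  names(fits) <- paste0("score>=", cutoffs)
  fits
}

fits <- fit_by_cutoff(toy, c(10, 20, 40, 50))

# Summarise each fit by sample size, slope, and R-squared.
fit_table <- data.frame(
  n         = sapply(fits, function(f) length(residuals(f))),
  slope     = sapply(fits, function(f) unname(coef(f)[2])),
  r_squared = sapply(fits, function(f) summary(f)$r.squared)
)
print(fit_table)
```

Collecting the fits in a named list keeps each model available for further inspection, for example with `summary(fits[[1]])`, while `fit_table` gives the side-by-side view that the printed summaries above only provide one block at a time.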
Here is the output of the sessionInfo() command.
## R Under development (unstable) (2024-10-21 r87258)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.21-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] scales_1.3.0 ggplot2_3.5.1
## [3] NestLink_1.23.0 ShortRead_1.65.0
## [5] GenomicAlignments_1.43.0 SummarizedExperiment_1.37.0
## [7] Biobase_2.67.0 MatrixGenerics_1.19.0
## [9] matrixStats_1.4.1 Rsamtools_2.23.0
## [11] GenomicRanges_1.59.0 BiocParallel_1.41.0
## [13] protViz_0.7.9 gplots_3.2.0
## [15] Biostrings_2.75.0 GenomeInfoDb_1.43.0
## [17] XVector_0.47.0 IRanges_2.41.0
## [19] S4Vectors_0.45.0 ExperimentHub_2.15.0
## [21] AnnotationHub_3.15.0 BiocFileCache_2.15.0
## [23] dbplyr_2.5.0 BiocGenerics_0.53.1
## [25] generics_0.1.3 BiocStyle_2.35.0
##
## loaded via a namespace (and not attached):
## [1] DBI_1.2.3 bitops_1.0-9 deldir_2.0-4
## [4] rlang_1.1.4 magrittr_2.0.3 compiler_4.5.0
## [7] RSQLite_2.3.7 mgcv_1.9-1 png_0.1-8
## [10] vctrs_0.6.5 pwalign_1.3.0 pkgconfig_2.0.3
## [13] crayon_1.5.3 fastmap_1.2.0 magick_2.8.5
## [16] labeling_0.4.3 caTools_1.18.3 utf8_1.2.4
## [19] rmarkdown_2.29 UCSC.utils_1.3.0 tinytex_0.54
## [22] purrr_1.0.2 bit_4.5.0 xfun_0.49
## [25] zlibbioc_1.53.0 cachem_1.1.0 jsonlite_1.8.9
## [28] blob_1.2.4 highr_0.11 DelayedArray_0.33.1
## [31] jpeg_0.1-10 parallel_4.5.0 R6_2.5.1
## [34] bslib_0.8.0 RColorBrewer_1.1-3 jquerylib_0.1.4
## [37] Rcpp_1.0.13-1 bookdown_0.41 knitr_1.48
## [40] splines_4.5.0 Matrix_1.7-1 tidyselect_1.2.1
## [43] abind_1.4-8 yaml_2.3.10 codetools_0.2-20
## [46] hwriter_1.3.2.1 curl_5.2.3 lattice_0.22-6
## [49] tibble_3.2.1 withr_3.0.2 KEGGREST_1.47.0
## [52] evaluate_1.0.1 pillar_1.9.0 BiocManager_1.30.25
## [55] filelock_1.0.3 KernSmooth_2.23-24 BiocVersion_3.21.1
## [58] munsell_0.5.1 gtools_3.9.5 glue_1.8.0
## [61] tools_4.5.0 interp_1.1-6 grid_4.5.0
## [64] latticeExtra_0.6-30 AnnotationDbi_1.69.0 colorspace_2.1-1
## [67] nlme_3.1-166 GenomeInfoDbData_1.2.13 cli_3.6.3
## [70] rappdirs_0.3.3 fansi_1.0.6 S4Arrays_1.7.1
## [73] dplyr_1.1.4 gtable_0.3.6 sass_0.4.9
## [76] digest_0.6.37 SparseArray_1.7.0 farver_2.1.2
## [79] memoise_2.0.1 htmltools_0.5.8.1 lifecycle_1.0.4
## [82] httr_1.4.7 mime_0.12 bit64_4.5.2