SummarizedBenchmark
package is built on two primary classes, the BenchDesign class for organizing the design of a benchmark experiment, and the SummarizedExperiment class for organizing the benchmark results. This vignette first introduces the structure and features of these classes using the example benchmark of p-value correction methods described in the SummarizedBenchark: Introduction vignette. Then, some issues of reproduciblity are discussed, and finally, several features and non-standard uses of the classes are described. SummarizedBenchmark package version: 2.6.0”
SummarizedBenchmark 2.6.0
This vignette assumes that the user is familiar with the basics of the SummarizedBenchmark
framework for benchmarking computational methods described in the SummarizedBenchmark: Introduction vignette (the “introduction”). New users should start from the package description and example presented in the introduction before moving on to more advanced topics, including the class descriptions in this vignette. Here, we describe the anatomy of the BenchDesign and SummarizedBenchmark classes in a little more detail than was presented in the introduction.
The classes are described in more detail using the same example benchmark of methods for multiple comparisons adjustment described in the introduction. Briefly, three methods, the Bonferroni correction, Benjamini-Hochberg step-up procedure, and Storey’s FDR q-value were benchmarked on a collection of 50 simulated two-sample t-tests. The simulated t-test results which serve as inputs to the methods are included with the package in the tdat
data set.
data(tdat)
b <- BenchDesign(data = tdat) %>%
addMethod(label = "bonf", func = p.adjust,
params = rlang::quos(p = pval, method = "bonferroni")) %>%
addMethod(label = "BH",
func = p.adjust,
params = rlang::quos(p = pval, method = "BH")) %>%
addMethod(label = "qv",
func = qvalue::qvalue,
params = rlang::quos(p = pval),
post = function(x) { x$qvalues })
sb <- buildBench(b, truthCols = "H")
The code above is identical to what was used in the introduction to construct the BenchDesign, b
, and the SummarizedBenchmark results, sb
. If any of this is unclear, consider revisting the SummarizedBenchmark: Introduction vignette.
BenchDesign objects are composed of a data set and methods. Formally, the methods are stored in the BenchDesign as a BDMethodList and the data as a BDData object. As would be expected, the BDMethodList is a list (List
) of BDMethod objects, each containing the definition of a method to be compared in the benchmark experiment. The general structure is illustrated in the figure below.
Continuing with the BenchDesign constructed above, we can access the list of methods and the data set stored in the object using the BDMethodList()
and BDData()
functions.
## BenchDesign Method List (BDMethodList) ---------------------
## list of 3 methods
## method: bonf; func: p.adjust
## method: BH; func: p.adjust
## method: qv; func: qvalue::qvalue
## BenchDesign Data (BDData) ----------------------------------
## type: data
## data:
## H test_statistic effect_size pval SE
## 1 1 -3.2083437 -1.17151466 4.872282e-03 0.3651463
## 2 0 0.1692236 0.07321912 8.675080e-01 0.4326768
## 3 1 -5.7077940 -1.81715381 2.061521e-05 0.3183636
## 4 1 -1.9805856 -1.09107836 6.313031e-02 0.5508867
## 5 1 -1.0014358 -0.37726058 3.298895e-01 0.3767197
## 6 1 -0.9190433 -0.47583669 3.702252e-01 0.5177522
## 7 1 -1.3184486 -0.72750221 2.038846e-01 0.5517866
## 8 1 -3.8492346 -1.47787871 1.175116e-03 0.3839409
## 9 1 -3.4836213 -1.35474550 2.651401e-03 0.3888900
## 10 0 0.6238195 0.25470364 5.405719e-01 0.4082970
## ... with 40 more rows.
The list of methods inherits from the SimpleList class, and supports several useful accessor and setter features.
## BenchDesign Method (BDMethod) ------------------------------
## method: p.adjust
## function (p, method = p.adjust.methods, n = length(p)...
## parameters:
## p : pval
## method : "bonferroni"
## post:
## none
## meta:
## none
This interface allows for adding new methods by creating a new named entry in the BDMethodList of the BenchDesign object.
## BenchDesign ------------------------------------------------
## benchmark data:
## type: data
## names: H, test_statistic, effect_size, pval, SE
## benchmark methods:
## method: bonf; func: p.adjust
## method: BH; func: p.adjust
## method: qv; func: qvalue::qvalue
## method: bonf2; func: p.adjust
Methods can also be removed by setting the list value to NULL. An equivalent convenience function, dropMethod()
also exists for removing methods using syntax similar to addMethod()
.
## BenchDesign ------------------------------------------------
## benchmark data:
## type: data
## names: H, test_statistic, effect_size, pval, SE
## benchmark methods:
## method: BH; func: p.adjust
## method: qv; func: qvalue::qvalue
## method: bonf2; func: p.adjust
Each BDMethod object in a BDMethodList encapsulates a method to be benchmarked. The contents of the object are all of the arguments passed through the addMethod()
convenience function used in the example above. New BDMethod objects can be constructed directly using the BDMethod()
constructor, which takes the same input parameters as addMethod()
.
## BenchDesign Method (BDMethod) ------------------------------
## method: p.adjust
## function (p, method = p.adjust.methods, n = length(p)...
## parameters:
## p : pval
## method : "bonferroni"
## post:
## none
## meta:
## none
Directly modifying the BDMethodList object provides an alternative approach (aside form using addMethod()
) to adding methods to a BenchDesign object.
## BenchDesign ------------------------------------------------
## benchmark data:
## type: data
## names: H, test_statistic, effect_size, pval, SE
## benchmark methods:
## method: BH; func: p.adjust
## method: qv; func: qvalue::qvalue
## method: bonf2; func: p.adjust
## method: bonf; func: p.adjust
The BDData object is a simple object which only contains two slots, @type
and @data
. The @type
of a BDData object can be either data
or hash
. If the @type
is data
, the @data
slot is a standard list or data.frame object of raw data. However, if the @type
is hash
, then the @data
slot is just a MD5 hash of some raw data object computed using digest::digest()
. As described next in Section 3, SummarizedBenchmark objects also contain the BenchDesign object used to generate the benchmark results. Often, the original raw data can be large, and saving the full data set as part of the SummarizedBenchmark object can be undesirable and unnecessary. While the raw data set is needed during the benchmark experiment, by default, the BDData is converted to a MD5 hash beforing stored the BenchDesign as part of a SummarizedBenchmark object. Using this approach, it is still possible to confirm whether a certain data set was used for a benchmarking experiment without having to store copies of the raw data with every SummarizedBenchmark.
Returning to the BenchDesign example from above, the BDData object of a BenchDesign can also be similarly extracted and modified. As with BDMethod objects, the data set may be removed by setting the value to NULL.
## BenchDesign ------------------------------------------------
## benchmark data:
## NULL
## benchmark methods:
## method: BH; func: p.adjust
## method: qv; func: qvalue::qvalue
## method: bonf2; func: p.adjust
## method: bonf; func: p.adjust
A new data set can be constructed using the BDData()
function and used to replace the data set in a BenchDesign object.
## BenchDesign ------------------------------------------------
## benchmark data:
## type: data
## names: H, test_statistic, effect_size, pval, SE
## benchmark methods:
## method: BH; func: p.adjust
## method: qv; func: qvalue::qvalue
## method: bonf2; func: p.adjust
## method: bonf; func: p.adjust
The helper function hashBDData()
can be called on either BDData or BenchDesign objects to convert the data to a MD5 hash. By default, this function is called on the BenchDesign object during buildBench()
before storing the object in the resulting SummarizedBenchmark.
## BenchDesign ------------------------------------------------
## benchmark data:
## type: md5hash
## MD5 hash: 09867f2696d3178a8f3bc038917042e1
## benchmark methods:
## method: BH; func: p.adjust
## method: qv; func: qvalue::qvalue
## method: bonf2; func: p.adjust
## method: bonf; func: p.adjust
We drop the "bonf2"
method from the object to return to the same set of methods used to generate the SummarizedBenchmark object.
As described above, the SummarizedBenchmark class builds on the existing SummarizedExperiment class and Bioconductor infrastructure. By doing so, saved results are tightly linked to metadata. Thus, it is possible, for example, to easily subset the results without losing references to the respective metadata. For example, the code below extracts the data for only the first two methods.
## DataFrame with 2 rows and 6 columns
## func.pkg func.pkg.vers func.pkg.manual param.p param.method
## <character> <character> <logical> <character> <character>
## bonf stats 4.0.0 FALSE pval "bonferroni"
## BH stats 4.0.0 FALSE pval "BH"
## session.idx
## <numeric>
## bonf 1
## BH 1
Building on the SummarizedExperiment class, in addition to the slot for performanceMetrics
described above, the SummarizedBenchmark class also includes a slot which stores a copy of the BenchDesign object used to generate the results with buildBench()
. The BenchDesign can be accessed by simply passing the object to BenchDesign()
.
## BenchDesign ------------------------------------------------
## benchmark data:
## type: md5hash
## MD5 hash: 09867f2696d3178a8f3bc038917042e1
## benchmark methods:
## method: bonf; func: p.adjust
## method: BH; func: p.adjust
## method: qv; func: qvalue::qvalue
Notice, however, unlike the original BenchDesign above, the data is stored as a MD5 hash. To prevent bloat when the data set is large, by default, only a MD5 hash computed using digest::digest()
is stored with the benchmarking results. This behavior can be changed by setting keepData = TRUE
when calling buildBench()
, in which case, the complete data set will be stored as part of the BenchDesign and SummarizedBenchmark objects. A simple helper function, compareBDData()
, can be used to verify that while the data type may be different between the data originally used to construct the SummarizedBenchmark and the data stored with the SummarizedBenchmark, the referenced data sets are the same (i.e. have matching MD5 hash values).
## $data
## [1] TRUE
##
## $type
## [1] FALSE
Finally, information about the R session where the methods were executed is stored in the metadata of the SummarizedBenchmark as an entry in a list called sessions
. The package supports rerunning benchmark experiments, and for this purpose, each new session is stored as a new entry in the sessions
list. Since this experiment has only been executed once, the list only has a single entry in the sessions
list. In addition to storing the session info and list of methods evaluated during the session, the list also keeps track of all parameters in the buildBench()
call, i.e. truthCols = "H"
.
## $methods
## [1] "bonf" "BH" "qv"
##
## $results
## $results$bonf
## $results$bonf$default
## [1] "success"
##
##
## $results$BH
## $results$BH$default
## [1] "success"
##
##
## $results$qv
## $results$qv$default
## [1] "success"
##
##
##
## $parameters
## $parameters$truthCols
## [1] "H"
##
## $parameters$ftCols
## NULL
##
## $parameters$sortIDs
## [1] FALSE
##
## $parameters$keepData
## [1] FALSE
##
## $parameters$catchErrors
## [1] TRUE
##
## $parameters$parallel
## [1] FALSE
##
## $parameters$BPPARAM
## class: MulticoreParam
## bpisup: FALSE; bpnworkers: 4; bptasks: 0; bpjobname: BPJOB
## bplog: FALSE; bpthreshold: INFO; bpstopOnError: TRUE
## bpRNGseed: ; bptimeout: 2592000; bpprogressbar: FALSE
## bpexportglobals: TRUE
## bplogdir: NA
## bpresultdir: NA
## cluster type: FORK
##
##
## $sessionInfo
## ─ Session info ───────────────────────────────────────────────────────────────
## setting value
## version R version 4.0.0 (2020-04-24)
## os Ubuntu 18.04.4 LTS
## system x86_64, linux-gnu
## ui X11
## language (EN)
## collate C
## ctype en_US.UTF-8
## tz America/New_York
## date 2020-04-27
##
## ─ Packages ───────────────────────────────────────────────────────────────────
## package * version date lib source
## annotate 1.66.0 2020-04-27 [2] Bioconductor
## AnnotationDbi 1.50.0 2020-04-27 [2] Bioconductor
## AnnotationHub 2.20.0 2020-04-27 [2] Bioconductor
## ashr 2.2-47 2020-02-20 [2] CRAN (R 4.0.0)
## assertthat 0.2.1 2019-03-21 [2] CRAN (R 4.0.0)
## backports 1.1.6 2020-04-05 [2] CRAN (R 4.0.0)
## beeswarm 0.2.3 2016-04-25 [2] CRAN (R 4.0.0)
## Biobase * 2.48.0 2020-04-27 [2] Bioconductor
## BiocFileCache 1.12.0 2020-04-27 [2] Bioconductor
## BiocGenerics * 0.34.0 2020-04-27 [2] Bioconductor
## BiocManager 1.30.10 2019-11-16 [2] CRAN (R 4.0.0)
## BiocNeighbors 1.6.0 2020-04-27 [2] Bioconductor
## BiocParallel * 1.22.0 2020-04-27 [2] Bioconductor
## BiocSingular 1.4.0 2020-04-27 [2] Bioconductor
## BiocStyle * 2.16.0 2020-04-27 [2] Bioconductor
## BiocVersion 3.11.1 2020-04-27 [2] Bioconductor
## bit 1.1-15.2 2020-02-10 [2] CRAN (R 4.0.0)
## bit64 0.9-7 2017-05-08 [2] CRAN (R 4.0.0)
## bitops 1.0-6 2013-08-17 [2] CRAN (R 4.0.0)
## blob 1.2.1 2020-01-20 [2] CRAN (R 4.0.0)
## bookdown 0.18 2020-03-05 [2] CRAN (R 4.0.0)
## checkmate 2.0.0 2020-02-06 [2] CRAN (R 4.0.0)
## cli 2.0.2 2020-02-28 [2] CRAN (R 4.0.0)
## codetools 0.2-16 2018-12-24 [2] CRAN (R 4.0.0)
## colorspace 1.4-1 2019-03-18 [2] CRAN (R 4.0.0)
## crayon * 1.3.4 2017-09-16 [2] CRAN (R 4.0.0)
## curl 4.3 2019-12-02 [2] CRAN (R 4.0.0)
## DBI 1.1.0 2019-12-15 [2] CRAN (R 4.0.0)
## dbplyr 1.4.3 2020-04-19 [2] CRAN (R 4.0.0)
## DelayedArray * 0.14.0 2020-04-27 [2] Bioconductor
## DelayedMatrixStats 1.10.0 2020-04-27 [2] Bioconductor
## DESeq2 * 1.28.0 2020-04-27 [2] Bioconductor
## digest * 0.6.25 2020-02-23 [2] CRAN (R 4.0.0)
## dplyr * 0.8.5 2020-03-07 [2] CRAN (R 4.0.0)
## edgeR * 3.30.0 2020-04-27 [2] Bioconductor
## ellipsis 0.3.0 2019-09-20 [2] CRAN (R 4.0.0)
## evaluate 0.14 2019-05-28 [2] CRAN (R 4.0.0)
## ExperimentHub 1.14.0 2020-04-27 [2] Bioconductor
## fansi 0.4.1 2020-01-08 [2] CRAN (R 4.0.0)
## farver 2.0.3 2020-01-16 [2] CRAN (R 4.0.0)
## fastmap 1.0.1 2019-10-08 [2] CRAN (R 4.0.0)
## fitdistrplus 1.0-14 2019-01-23 [2] CRAN (R 4.0.0)
## genefilter 1.70.0 2020-04-27 [2] Bioconductor
## geneplotter 1.66.0 2020-04-27 [2] Bioconductor
## GenomeInfoDb * 1.24.0 2020-04-27 [2] Bioconductor
## GenomeInfoDbData 1.2.3 2020-04-24 [2] Bioconductor
## GenomicRanges * 1.40.0 2020-04-27 [2] Bioconductor
## ggbeeswarm 0.6.0 2017-08-07 [2] CRAN (R 4.0.0)
## ggplot2 * 3.3.0 2020-03-05 [2] CRAN (R 4.0.0)
## glue 1.4.0 2020-04-03 [2] CRAN (R 4.0.0)
## gridExtra 2.3 2017-09-09 [2] CRAN (R 4.0.0)
## gtable 0.3.0 2019-03-25 [2] CRAN (R 4.0.0)
## htmltools 0.4.0 2019-10-04 [2] CRAN (R 4.0.0)
## httpuv 1.5.2 2019-09-11 [2] CRAN (R 4.0.0)
## httr 1.4.1 2019-08-05 [2] CRAN (R 4.0.0)
## interactiveDisplayBase 1.26.0 2020-04-27 [2] Bioconductor
## invgamma 1.1 2017-05-07 [2] CRAN (R 4.0.0)
## IRanges * 2.22.0 2020-04-27 [2] Bioconductor
## irlba 2.3.3 2019-02-05 [2] CRAN (R 4.0.0)
## knitr 1.28 2020-02-06 [2] CRAN (R 4.0.0)
## labeling 0.3 2014-08-23 [2] CRAN (R 4.0.0)
## later 1.0.0 2019-10-04 [2] CRAN (R 4.0.0)
## lattice 0.20-41 2020-04-02 [2] CRAN (R 4.0.0)
## lifecycle 0.2.0 2020-03-06 [2] CRAN (R 4.0.0)
## limma * 3.44.0 2020-04-27 [2] Bioconductor
## locfit 1.5-9.4 2020-03-25 [2] CRAN (R 4.0.0)
## lsei 1.2-0 2017-10-23 [2] CRAN (R 4.0.0)
## magick 2.3 2020-01-24 [2] CRAN (R 4.0.0)
## magrittr * 1.5 2014-11-22 [2] CRAN (R 4.0.0)
## MASS 7.3-51.6 2020-04-26 [2] CRAN (R 4.0.0)
## Matrix 1.2-18 2019-11-27 [2] CRAN (R 4.0.0)
## matrixStats * 0.56.0 2020-03-13 [2] CRAN (R 4.0.0)
## mclust * 5.4.6 2020-04-11 [2] CRAN (R 4.0.0)
## memoise 1.1.0 2017-04-21 [2] CRAN (R 4.0.0)
## mgcv 1.8-31 2019-11-09 [2] CRAN (R 4.0.0)
## mime 0.9 2020-02-04 [2] CRAN (R 4.0.0)
## mixsqp 0.3-17 2020-01-29 [2] CRAN (R 4.0.0)
## munsell 0.5.0 2018-06-12 [2] CRAN (R 4.0.0)
## nlme 3.1-147 2020-04-13 [2] CRAN (R 4.0.0)
## npsurv 0.4-0 2017-10-14 [2] CRAN (R 4.0.0)
## pillar 1.4.3 2019-12-20 [2] CRAN (R 4.0.0)
## pkgconfig 2.0.3 2019-09-22 [2] CRAN (R 4.0.0)
## plyr 1.8.6 2020-03-03 [2] CRAN (R 4.0.0)
## promises 1.1.0 2019-10-04 [2] CRAN (R 4.0.0)
## purrr 0.3.4 2020-04-17 [2] CRAN (R 4.0.0)
## qvalue 2.20.0 2020-04-27 [2] Bioconductor
## R6 2.4.1 2019-11-12 [2] CRAN (R 4.0.0)
## rappdirs 0.3.1 2016-03-28 [2] CRAN (R 4.0.0)
## RColorBrewer 1.1-2 2014-12-07 [2] CRAN (R 4.0.0)
## Rcpp 1.0.4.6 2020-04-09 [2] CRAN (R 4.0.0)
## RCurl 1.98-1.2 2020-04-18 [2] CRAN (R 4.0.0)
## reshape2 1.4.4 2020-04-09 [2] CRAN (R 4.0.0)
## rlang * 0.4.5 2020-03-01 [2] CRAN (R 4.0.0)
## rmarkdown 2.1 2020-01-20 [2] CRAN (R 4.0.0)
## RSQLite 2.2.0 2020-01-07 [2] CRAN (R 4.0.0)
## rsvd 1.0.3 2020-02-17 [2] CRAN (R 4.0.0)
## S4Vectors * 0.26.0 2020-04-27 [2] Bioconductor
## scales 1.1.0 2019-11-18 [2] CRAN (R 4.0.0)
## scater * 1.16.0 2020-04-27 [2] Bioconductor
## scRNAseq * 2.1.7 2020-04-27 [2] Bioconductor
## sessioninfo * 1.1.1 2018-11-05 [2] CRAN (R 4.0.0)
## shiny 1.4.0.2 2020-03-13 [2] CRAN (R 4.0.0)
## SingleCellExperiment * 1.10.0 2020-04-27 [2] Bioconductor
## splatter * 1.12.0 2020-04-27 [2] Bioconductor
## SQUAREM 2020.2 2020-02-21 [2] CRAN (R 4.0.0)
## stringi 1.4.6 2020-02-17 [2] CRAN (R 4.0.0)
## stringr * 1.4.0 2019-02-10 [2] CRAN (R 4.0.0)
## SummarizedBenchmark * 2.6.0 2020-04-27 [1] Bioconductor
## SummarizedExperiment * 1.18.0 2020-04-27 [2] Bioconductor
## survival 3.1-12 2020-04-10 [2] CRAN (R 4.0.0)
## tibble * 3.0.1 2020-04-20 [2] CRAN (R 4.0.0)
## tidyr * 1.0.2 2020-01-24 [2] CRAN (R 4.0.0)
## tidyselect 1.0.0 2020-01-27 [2] CRAN (R 4.0.0)
## truncnorm 1.0-8 2018-02-27 [2] CRAN (R 4.0.0)
## tximport * 1.16.0 2020-04-27 [2] Bioconductor
## UpSetR * 1.4.0 2019-05-22 [2] CRAN (R 4.0.0)
## utf8 1.1.4 2018-05-24 [2] CRAN (R 4.0.0)
## vctrs 0.2.4 2020-03-10 [2] CRAN (R 4.0.0)
## vipor 0.4.5 2017-03-22 [2] CRAN (R 4.0.0)
## viridis 0.5.1 2018-03-29 [2] CRAN (R 4.0.0)
## viridisLite 0.3.0 2018-02-01 [2] CRAN (R 4.0.0)
## withr 2.2.0 2020-04-20 [2] CRAN (R 4.0.0)
## xfun 0.13 2020-04-13 [2] CRAN (R 4.0.0)
## XML 3.99-0.3 2020-01-20 [2] CRAN (R 4.0.0)
## xtable 1.8-4 2019-04-21 [2] CRAN (R 4.0.0)
## XVector 0.28.0 2020-04-27 [2] Bioconductor
## yaml 2.2.1 2020-02-01 [2] CRAN (R 4.0.0)
## zlibbioc 1.34.0 2020-04-27 [2] Bioconductor
##
## [1] /tmp/RtmpPeiDJq/Rinst1ce8597754fa
## [2] /home/biocbuild/bbs-3.11-bioc/R/library
While the BenchDesign and SummarizedBenchmark classes are designed to enable easier reproduction and replication of benchmark experiments, they are not meant to completely replace properly written and documented analyses. If any functions or packages are needed for the methods defined in the BenchDesign, they must be available and sourced or loaded when running (or re-running) the analyses. As an example, suppose we have the following locally defined method, mymethod()
, which depends on another locally defined, but unavailable, function, mysubmethod()
.
mymethod <- function(x) {
mysubmethod(x)
}
mybd <- BenchDesign(data = list(vals = 1:5))
mybd <- addMethod(mybd, "method1", mymethod,
params = rlang::quos(x = vals))
The top level mymethod()
is captured in the corresponding BDMethod and BenchDesign. No errors are thrown since including mymethod()
in a BenchDesign object does not require evaluating the function.
## function(x) {
## mysubmethod(x)
## }
However, when buildBench()
is called, the method naturally fails because the necessary mysubmethod()
is not available in the current session.
## !! error caught in buildBench !!
## !! error in main function of method: 'method1'
## !! original message:
## !! could not find function "mysubmethod"
## <simpleError in buildBench(mybd):
## No method returned valid values for any assays!
## >
Therefore, if any locally defined functions are necessary for a method, they should be available along with the BenchDesign object or defined at the top level. Additionally, for reproducibility and clarity, we generally recommend keeping functions as “self-contained” as possible, and never relying on variables defined externally, e.g. in the global enviroment. Scoping with functions can be confusing, and it may not always be clear where constants are being evaluated. As an example, consider the following simple multiplying function.
m <- 5
mymult <- function(x) {
m * x
}
m <- 2
mybd <- BenchDesign(data = list(vals = 1:5, m = 10))
mybd <- addMethod(mybd, "methodr", mymult,
params = rlang::quos(x = vals))
While experienced R programmers may know which value of m
(5, 2 or 10) will be used when buildBench()
is called, for many users, this is less obvious.
## methodr
## [1,] 2
## [2,] 4
## [3,] 6
## [4,] 8
## [5,] 10
The answer is the value assigned most recently in the environment where mymult()
was defined, R_GlobalEnv
, the global environment (m = 2
). Note, however, that if the BenchDesign is saved and reloaded in a new session, whatever value of m
is defined in the global environment of the new session will be used. If m
is not defined in the new session, an error will be thrown. In this case, m
should either be explicitly defined within the function or passed as a second input variable of mymult()
and defined with params =
in the BDMethod.
While it may be possible to aggressively capture all code and environment variables defined with each method in a BenchDesign, it is not within the scope of this package, and furthermore, may not be the best appraoch to constructing a reproducible benchmark. While redundant, for emphasis, we reiterate that the SummarizedBenchmark
framework is a solution for structuring benchmark experiments which complements, but does not replace, well documented and organized code.
Several less prominent, but hopefully useful features are also implemented for working with objects of the BenchDesign, BDMethod, and BDMethodList classes.
As seen in the example above, metadata for methods are stored in the colData()
of SummarizedBenchmark objects. Several default metadata columns are populated in the colData()
of the SummarizedBenchmark object generated by a call to buildBench()
. However, sometimes it may be useful to include additional metadata columns beyond just the defaults. While this can be accomplished manually by modifying the colData()
of the SummarizedBenchmark object post hoc, method metadata can also be specified at the addMethod()
step using the meta =
optional parameter. The meta =
parameter accepts a named list of metadata information. Each list entry will be added to the colData()
as a new column, and should be the same type across methods to prevent possible errors. To avoid collisions between metadata columns specified with meta =
and the default set of columns, metadata specified using meta =
will be added to colData()
with meta.
prefixed to the column name.
As an example, we construct a BenchDesign object again using the multiple hypothesis testing example. The BenchDesign is created with two methods, Bonferroni correction and the Benjamini-Hochberg step-up procedure. Each method is specified with the optional meta =
parameter noting the type of control provided, either familywise error rate (FWER) or false discovery rate (FDR). We can verify that the manually defined metadata column (meta.type
) is available in the colData()
of the newly generated SummarizedBenchmark.
b_withmeta <- BenchDesign(data = tdat) %>%
addMethod(label = "bonf", func = p.adjust,
meta = list(type = "FWER control"),
params = rlang::quos(p = pval, method = "bonferroni")) %>%
addMethod(label = "BH",
func = p.adjust,
meta = list(type = "FDR control"),
params = rlang::quos(p = pval, method = "BH"))
sb_withmeta <- buildBench(b_withmeta)
colData(sb_withmeta)
## DataFrame with 2 rows and 7 columns
## func.pkg func.pkg.vers func.pkg.manual param.p param.method
## <character> <character> <logical> <character> <character>
## bonf stats 4.0.0 FALSE pval "bonferroni"
## BH stats 4.0.0 FALSE pval "BH"
## meta.type session.idx
## <character> <numeric>
## bonf FWER control 1
## BH FDR control 1
While all methods in this example had the meta =
option specified, this is not necessary. It is completely acceptable to specify the meta =
parameter for only a subset of methods. Missing entries will be recorded as NA
s in the colData()
.
Arguably, two of the most important pieces of metadata stored in the colData()
of the SummarizedBenchmark returned by buildBench()
are the relevant package name and version (pkg_name
, pkg_vers
). Determining the package name and version requires the primary “workhorse” function of the method be directly specified as func =
in the addMethod()
call. In some cases, this may not be possible, e.g. if the “workhorse” function is a wrapper. However, there still might exist an important function for which we would like to track the package name and version. The meta
parameter can help.
The meta =
parameter will handle the following named list entries as special values: pkg_name
, pkg_vers
, pkg_func
. First, if values are specified for pkg_name
and pkg_vers
in meta =
, these will overwrite the values determined from func =
. To trace the source of pkg_name
and pkg_vers
information, the func.pkg.manual
column of the colData
will be set to TRUE
(the default value is FALSE
). Alternatively, a function can be passed to meta =
as pkg_func
. This function will be used to determine both pkg_name
and pkg_vers
, and will take precendence over manually specified pkg_name
and pkg_vers
values. If pkg_func
is specified, it will be included in the colData()
as a new column with the same name, and the vers_src
column will be set to "meta_func"
. **Note: the function should be wrapped in rlang::quo
to be properly parsed.
We illustrate the behavior when using either pkg_func
or pkg_name
and pkg_vers
with the meta
optional parameter using the same multiple testing example from above. First, notice that Storey’s q-value method was previously included in the BenchDesign with a post =
argument to extract q-values from the result of calling qvalue::qvalue
on the data.
## qv ---------------------------------------------------------
## BenchDesign Method (BDMethod) ------------------------------
## method: qvalue::qvalue
## function (p, fdr.level = NULL, pfdr = FALSE, lfdr.out...
## parameters:
## p : pval
## post:
## default : function (x) { ...
## meta:
## none
We can avoid specifying post =
by using the following wrapper function in place of the direct call to qvalue::qvalue
.
qv_wrapper <- function(...) {
qvalue::qvalue(...)$qvalues
}
b_qv <- BenchDesign(data = tdat) %>%
addMethod(label = "qv",
func = qvalue::qvalue,
params = rlang::quos(p = pval),
post = function(x) { x$qvalues }) %>%
addMethod(label = "qv_wrapper",
func = qv_wrapper,
params = rlang::quos(p = pval))
However, we see that while the results for the two methods are the same, the colData()
of the resulting SummarizedBenchmark object is missing the valuable package name and version information for the wrapper-based approach.
## qv qv_wrapper
## [1,] 0.0079813228 0.0079813228
## [2,] 0.3290660527 0.3290660527
## [3,] 0.0003753518 0.0003753518
## [4,] 0.0442094809 0.0442094809
## [5,] 0.1716134119 0.1716134119
## [6,] 0.1757029652 0.1757029652
## DataFrame with 2 rows and 5 columns
## func.pkg func.pkg.vers func.pkg.manual param.p session.idx
## <character> <character> <logical> <character> <numeric>
## qv qvalue 2.20.0 FALSE pval 1
## qv_wrapper NA NA FALSE pval 1
This can be fixed by specifying either the pkg_func
or pkg_name
and pkg_vers
metadata values.
b_qv <- b_qv %>%
addMethod(label = "qv_pkgfunc",
func = qv_wrapper,
meta = list(pkg_func = qvalue::qvalue),
params = rlang::quos(p = pval)) %>%
addMethod(label = "qv_pkgname",
func = qv_wrapper,
meta = list(pkg_name = "qvalue", pkg_vers = as.character(packageVersion("qvalue"))),
params = rlang::quos(p = pval))
sb_qv <- buildBench(b_qv)
colData(sb_qv)
## DataFrame with 4 rows and 5 columns
## func.pkg func.pkg.vers func.pkg.manual param.p session.idx
## <character> <character> <logical> <character> <numeric>
## qv qvalue 2.20.0 FALSE pval 1
## qv_wrapper NA NA FALSE pval 1
## qv_pkgfunc qvalue 2.20.0 TRUE pval 1
## qv_pkgname qvalue 2.20.0 TRUE pval 1
The func.pkg
and func.pkg.vers
columns of the colData()
are now populated with the correct information, and furthermore, the func.pkg.manual
column is set to TRUE
for the two latter methods.
Modifying the defintion of a method after it has been added to a BenchDesign is supported by the modifyMethod()
function. The BenchDesign object created in the multiple testing example above, b
, includes a method called qv
.
## qv ---------------------------------------------------------
## BenchDesign Method (BDMethod) ------------------------------
## method: qvalue::qvalue
## function (p, fdr.level = NULL, pfdr = FALSE, lfdr.out...
## parameters:
## p : pval
## post:
## default : function (x) { ...
## meta:
## none
Suppose we wish to specify an additional pi0 =
parameter for the qvalue::qvalue
method as well as add a new metadata tag. This can be accomplished by passing both new parameters to the params =
parameter of modifyMethod()
as a rlang::quos(..)
list. By default, all entries in the rlang::quos(..)
list are assumed to be parameters for the main function of the method. However, func
, post
, or meta
values can also be modified by included in their new values in the list using the special keywords, bd.func =
, bd.post =
, or bd.meta =
. In the example below, we both add a new parameter , pi0 =
and modify the meta =
specification.
new_params <- rlang::quos(pi0 = 0.9, bd.meta = list(type = "FDR control"))
b_modqv <- modifyMethod(b, label = "qv", params = new_params)
printMethod(b_modqv, "qv")
## qv ---------------------------------------------------------
## BenchDesign Method (BDMethod) ------------------------------
## method: qvalue::qvalue
## function (p, fdr.level = NULL, pfdr = FALSE, lfdr.out...
## parameters:
## p : pval
## pi0 : 0.9
## post:
## default : function (x) { ...
## meta:
## type : "FDR control"
We can see that the original method parameters, namely p = pval
, is still defined for method, while the new method parameter pi0 = 0.9
has been added along with the new metadata specification. However, sometimes it may be desirable to completely overwrite all function parameters for a method, e.g. if we decide to completely redefine the parameters of a method. This may occur if some parameters were optional and originally specified, but no longer necessary. All function parameters can be overwritten by specifying .overwrite = TRUE
.
b_modqv <- modifyMethod(b, label = "qv", params = new_params, .overwrite = TRUE)
printMethod(b_modqv, "qv")
## qv ---------------------------------------------------------
## BenchDesign Method (BDMethod) ------------------------------
## method: qvalue::qvalue
## function (p, fdr.level = NULL, pfdr = FALSE, lfdr.out...
## parameters:
## pi0 : 0.9
## post:
## default : function (x) { ...
## meta:
## type : "FDR control"
Now the originally specified p = pval
is no longer included in the method definition. Obviously, this is not reasonable in this setting as the p
parameter is a necessary parameter to run the method. Note that specifications of bd.meta =
will always overwrite the existing metadata values of the method, even when .overwrite = FALSE
.
In addition to comparing multiple methods, a benchmark study may also involve comparing a single method across several parameter settings. The expandMethod()
function provides the capability to take a method already defined in the BenchDesign, and expand it to multiple methods with differing parameter values in the BenchDesign object. In the following example, expandMethod()
is used to duplicate the qv
method with only the pi0
parameter modified. By default, the expanded method, in this case qv
, is dropped. To prevent this from happening, .replace = FALSE
can be specified.
Expanding the qv
method with expandMethod()
is accomplished by specifying any new parameters of the variant methods to params =
as a rlang::quos(..)
list. Here, two new variants of the qv
method, qv-p90
and qv-p10
, are created with new values of the pi0
parameter.
b_expand <- expandMethod(b, label = "qv",
params = list(qv_p90 = rlang::quos(pi0 = 0.90),
qv_p10 = rlang::quos(pi0 = 0.10)),
.replace = FALSE)
b_expand
## BenchDesign ------------------------------------------------
## benchmark data:
## type: md5hash
## MD5 hash: 09867f2696d3178a8f3bc038917042e1
## benchmark methods:
## method: BH; func: p.adjust
## method: qv; func: qvalue::qvalue
## method: bonf; func: p.adjust
## method: qv_p90; func: qvalue::qvalue
## method: qv_p10; func: qvalue::qvalue
## qv_p90 -----------------------------------------------------
## BenchDesign Method (BDMethod) ------------------------------
## method: qvalue::qvalue
## function (p, fdr.level = NULL, pfdr = FALSE, lfdr.out...
## parameters:
## p : pval
## pi0 : 0.9
## post:
## default : function (x) { ...
## meta:
## none
## qv_p10 -----------------------------------------------------
## BenchDesign Method (BDMethod) ------------------------------
## method: qvalue::qvalue
## function (p, fdr.level = NULL, pfdr = FALSE, lfdr.out...
## parameters:
## p : pval
## pi0 : 0.1
## post:
## default : function (x) { ...
## meta:
## none
When only a single parameter is modifed as in the example above, an alternative shorthand is provided. The parameter can be specified with onlyone =
and the corresponding values specified with the new method names to params =
.
b_expand <- expandMethod(b, label = "qv", onlyone = "pi0",
params = rlang::quos(qv_p90 = 0.90,
qv_p10 = 0.10),
.replace = FALSE)
b_expand
## BenchDesign ------------------------------------------------
## benchmark data:
## type: md5hash
## MD5 hash: 09867f2696d3178a8f3bc038917042e1
## benchmark methods:
## method: BH; func: p.adjust
## method: qv; func: qvalue::qvalue
## method: bonf; func: p.adjust
## method: qv_p90; func: qvalue::qvalue
## method: qv_p10; func: qvalue::qvalue
The resulting modified BenchDesign object is the same.
Aside from the standard SummarizedBenchmark
Throughout our documentation, SummarizedBenchmark objects are constructed exclusively using our recommended workflow by running buildBench()
with a BenchDesign object. This recommended approach enables users to perform benchmark comparisons while automatically keeping track of parameters and software metadata. However, it is also possible to skip this recommended approach and construct a SummarizedBenchmark object directly from standard S3
data objects.
Using data from the iCOBRApackage (Soneson and Robinson 2016), this part of the vignette demonstrates how to build SummarizedBenchmark objects from S3
objects. The dataset contains differential expression results of three different methods (limma, edgeR and DESeq2) applied to a simulated RNA-seq dataset.
The process of building a SummarizedBenchmark object is similar to the process of constructing a SummarizedExperiment object. To build a SummarizedBenchmark object, three main objects are required: (1) a list where each element corresponds to a data.frame, (2) a DataFrame with annotations of the methods, and (3) when available, a DataFrame of ground truths.
In the SummarizedBenchmark object, each output of the methods is considered a different assay
. For example, using the differential expression dataset example, we can define two assays, q-values and estimated log fold changes. For each assay
, we arrange the output of the different methods as a matrix where each column corresponds to a method and each row corresponds to each feature (in this case, genes). We will need a list in which each of it’s element corresponds to an assay.
assays <- list(qvalue = cobradata_example@padj,
logFC = cobradata_example@score)
assays[["qvalue"]]$DESeq2 <- p.adjust(cobradata_example@pval$DESeq2, method = "BH")
head(assays[["qvalue"]], 3)
## edgeR voom DESeq2
## ENSG00000000457 0.2833295 0.1375146 0.04760152
## ENSG00000000460 0.2700376 0.2826398 0.08717737
## ENSG00000000938 0.9232794 0.9399669 0.99856748
## edgeR voom DESeq2
## ENSG00000000457 0.22706080 0.23980596 0.274811698
## ENSG00000000460 0.31050478 0.27677537 0.359226203
## ENSG00000000938 0.04985025 0.03344113 0.001526355
Since these are simulated data, the ground truths for both assays are known and available. We can format these as a DataFrame where each column corresponds to an assay and each row corresponds to a feature.
library(S4Vectors)
groundTruth <- DataFrame(cobradata_example@truth[, c("status", "logFC")])
colnames(groundTruth) <- names(assays)
groundTruth <- groundTruth[rownames(assays[[1]]), ]
head(groundTruth)
## DataFrame with 6 rows and 2 columns
## qvalue logFC
## <integer> <numeric>
## ENSG00000000457 0 0.0000000
## ENSG00000000460 0 0.0000000
## ENSG00000000938 0 0.0000000
## ENSG00000000971 0 0.0000000
## ENSG00000001460 1 3.9394220
## ENSG00000001461 1 0.0602244
Finally, the method names are also reformatted as a DataFrame of simple annotation data of the methods.
## DataFrame with 3 rows and 1 column
## method
## <character>
## 1 edgeR
## 2 voom
## 3 DESeq2
A SummarizedBenchmark is built using the following call to the SummarizedBenchmark()
constructor.
This vignette described the structure of the BenchDesign and SummarizedBenchmark classes which underly the SummarizedBenchmark
package, as well as several complex and non-standard features of both classes. Examples of using these classes and the SummarizedBenchmark
framework can be found in the SummarizedBenchmark: Full Case Study vignette as well as in the other Case Study vignettes. More advanced features are also described in various Feature vignettes.
Soneson, Charlotte, and Mark D Robinson. 2016. “iCOBRA: Open, Reproducible, Standardized and Live Method Benchmarking.” Nature Methods 13 (4):283–83. https://doi.org/10.1038/nmeth.3805.