Last modified: 2021-09-21 09:33:13
Compiled: Tue Sep 21 11:29:21 2021

1 Utilisation and prospects of bioimage datasets

In recent years, the need for data analysis using machine learning has grown in bioimaging research. Machine learning is an inductive, data-driven approach: models for tasks such as image segmentation and classification are built from the image data itself. The publication and sharing of bioimage datasets [1], as well as knowledge creation through providing metadata for bioimages [2,3], are therefore important issues. At present, there is no commonly used format for sharing bioimage datasets, and the data are scattered among various repositories. As a result, each image repository manages its data (the images themselves and metadata such as image format, instruments/microscopes and biosamples) in its own format.

When such images are analysed and quantified, several image pre-processing steps are usually needed, depending on the analysis environment. For supervised learning, the work starts with finding a repository whose bioimage dataset contains both the original images and their corresponding supervised labels. Once the repository is found, the image data must be downloaded, loaded into the analysis environment and converted into a format suitable for the analytical package. These steps are time-consuming and must be completed before the main analysis can begin. In addition, most image repositories do not publish data in a format that is easy to read and process in R (.Rdata, etc.), so the data are not easy for R users to work with.

To support supervised learning on bioimage data, BioImageDbs provides R list objects that hold the original images and their corresponding supervised labels, converted into 4D or 5D arrays. Once retrieved from ExperimentHub, the data can be used for deep learning with Keras/Tensorflow [4] and with other machine learning methods, without the need for pre-processing.

Many image analysis packages are available for R; however, image analysis itself lacks standardisation. The use of common, open datasets is an essential step towards standardising and comparing analytical methods. Providing image arrays through ExperimentHub is therefore also intended for applications such as (1) comparing models on data shared in common among R users and (2) applying predictions to new datasets through transfer learning and fine-tuning based on these arrays.

2 Fetch Bioimage Datasets from ExperimentHub

The BioImageDbs package provides the metadata for all bioimage datasets that have been preprocessed into array format and saved in ExperimentHub.

First we load/update the ExperimentHub resource.

library(ExperimentHub)
eh <- ExperimentHub()

Next we list all BioImageDbs entries from ExperimentHub.

query(eh, "BioImage")
## ExperimentHub with 23 records
## # snapshotDate(): 2021-05-18
## # $dataprovider: CELL TRACKING CHALLENGE (http://celltrackingchallenge.net/2...
## # $species: Homo sapiens, Mus musculus, Drosophila melanogaster
## # $rdataclass: List, magick-image
## # additional mcols(): taxonomyid, genome, description,
## #   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
## #   rdatapath, sourceurl, sourcetype 
## # retrieve records with, e.g., 'object[["EH6095"]]' 
## 
##            title                                                            
##   EH6095 | EM_id0001_Brain_CA1_hippocampus_region_5dTensor.rds              
##   EH6096 | EM_id0001_Brain_CA1_hippocampus_region_5dTensor_train_dataset.gif
##   EH6097 | EM_id0002_Drosophila_brain_region_5dTensor.rds                   
##   EH6098 | EM_id0002_Drosophila_brain_region_5dTensor_train_dataset.gif     
##   EH6099 | LM_id0001_DIC_C2DH_HeLa_4dTensor.rds                             
##   ...      ...                                                              
##   EH6113 | LM_id0003_Fluo_N2DH_GOWT1_5dTensor.rds                           
##   EH6114 | EM_id0003_J558L_4dTensor.rds                                     
##   EH6115 | EM_id0003_J558L_4dTensor_train_dataset.gif                       
##   EH6116 | EM_id0004_PrHudata_4dTensor.rds                                  
##   EH6117 | EM_id0004_PrHudata_4dTensor_train_dataset.gif

We can inspect the metadata of these ExperimentHub records, which are hosted in the Bioconductor S3 bucket, with mcols().

mcols(query(eh, "BioImage"))
## DataFrame with 23 rows and 15 columns
##                         title           dataprovider                species
##                   <character>            <character>            <character>
## EH6095 EM_id0001_Brain_CA1_.. https://www.epfl.ch/..           Mus musculus
## EH6096 EM_id0001_Brain_CA1_.. https://www.epfl.ch/..           Mus musculus
## EH6097 EM_id0002_Drosophila.. the ISBI 2012 Challe.. Drosophila melanogas..
## EH6098 EM_id0002_Drosophila.. the ISBI 2012 Challe.. Drosophila melanogas..
## EH6099 LM_id0001_DIC_C2DH_H.. CELL TRACKING CHALLE..           Homo sapiens
## ...                       ...                    ...                    ...
## EH6113 LM_id0003_Fluo_N2DH_.. CELL TRACKING CHALLE..           Mus musculus
## EH6114 EM_id0003_J558L_4dTe.. Pattern Recognition ..           Mus musculus
## EH6115 EM_id0003_J558L_4dTe.. Pattern Recognition ..           Mus musculus
## EH6116 EM_id0004_PrHudata_4.. Pattern Recognition ..           Homo sapiens
## EH6117 EM_id0004_PrHudata_4.. Pattern Recognition ..           Homo sapiens
##        taxonomyid      genome            description coordinate_1_based
##         <integer> <character>            <character>          <integer>
## EH6095      10090          NA 5D arrays with the b..                  1
## EH6096      10090          NA A animation file (.g..                  1
## EH6097       7227          NA 5D arrays with the b..                  1
## EH6098       7227          NA A animation file (.g..                  1
## EH6099       9606          NA 4D arrays with the m..                  1
## ...           ...         ...                    ...                ...
## EH6113      10090          NA 5D arrays with the m..                  1
## EH6114      10090          NA The mouse B myeloma ..                  1
## EH6115      10090          NA A animation file (.g..                  1
## EH6116       9606          NA The primary human T ..                  1
## EH6117       9606          NA A animation file (.g..                  1
##                    maintainer rdatadateadded preparerclass
##                   <character>    <character>   <character>
## EH6095 Satoshi Kume <satosh..     2021-05-18   BioImageDbs
## EH6096 Satoshi Kume <satosh..     2021-05-18   BioImageDbs
## EH6097 Satoshi Kume <satosh..     2021-05-18   BioImageDbs
## EH6098 Satoshi Kume <satosh..     2021-05-18   BioImageDbs
## EH6099 Satoshi Kume <satosh..     2021-05-18   BioImageDbs
## ...                       ...            ...           ...
## EH6113 Satoshi Kume <satosh..     2021-05-18   BioImageDbs
## EH6114 Satoshi Kume <satosh..     2021-05-18   BioImageDbs
## EH6115 Satoshi Kume <satosh..     2021-05-18   BioImageDbs
## EH6116 Satoshi Kume <satosh..     2021-05-18   BioImageDbs
## EH6117 Satoshi Kume <satosh..     2021-05-18   BioImageDbs
##                                          tags   rdataclass
##                                        <list>  <character>
## EH6095     3D images,bioimage,CellCulture,...         List
## EH6096     animation,bioimage,CellCulture,... magick-image
## EH6097      3D image,bioimage,CellCulture,...         List
## EH6098     animation,bioimage,CellCulture,... magick-image
## EH6099 bioimage,cell tracking,CellCulture,...         List
## ...                                       ...          ...
## EH6113 bioimage,cell tracking,CellCulture,...         List
## EH6114     2D images,bioimage,CellCulture,...         List
## EH6115     2D images,bioimage,CellCulture,... magick-image
## EH6116     2D images,bioimage,CellCulture,...         List
## EH6117     2D images,bioimage,CellCulture,... magick-image
##                     rdatapath              sourceurl  sourcetype
##                   <character>            <character> <character>
## EH6095 BioImageDbs/v01/EM_i.. https://github.com/k..         PNG
## EH6096 BioImageDbs/v01/EM_i.. https://github.com/k..         PNG
## EH6097 BioImageDbs/v01/EM_i.. https://github.com/k..         PNG
## EH6098 BioImageDbs/v01/EM_i.. https://github.com/k..         PNG
## EH6099 BioImageDbs/v01/LM_i.. https://github.com/k..         PNG
## ...                       ...                    ...         ...
## EH6113 BioImageDbs/v01/LM_i.. https://github.com/k..         PNG
## EH6114 BioImageDbs/v01/EM_i.. https://github.com/k..         PNG
## EH6115 BioImageDbs/v01/EM_i.. https://github.com/k..         PNG
## EH6116 BioImageDbs/v01/EM_i.. https://github.com/k..         PNG
## EH6117 BioImageDbs/v01/EM_i.. https://github.com/k..         PNG
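
The metadata table can be filtered like an ordinary data frame. The following is a minimal sketch that runs offline: it uses a plain data.frame as a stand-in for the mcols() result, with values copied from the first five records listed above. The object names (meta, arrays_only, rank_tag) are hypothetical and not part of BioImageDbs.

```r
# Stand-in for the mcols() table; values are copied from the first five
# records shown above. `meta` is a hypothetical name used for illustration.
meta <- data.frame(
  title = c(
    "EM_id0001_Brain_CA1_hippocampus_region_5dTensor.rds",
    "EM_id0001_Brain_CA1_hippocampus_region_5dTensor_train_dataset.gif",
    "EM_id0002_Drosophila_brain_region_5dTensor.rds",
    "EM_id0002_Drosophila_brain_region_5dTensor_train_dataset.gif",
    "LM_id0001_DIC_C2DH_HeLa_4dTensor.rds"
  ),
  species = c("Mus musculus", "Mus musculus", "Drosophila melanogaster",
              "Drosophila melanogaster", "Homo sapiens"),
  rdataclass = c("List", "magick-image", "List", "magick-image", "List"),
  row.names = c("EH6095", "EH6096", "EH6097", "EH6098", "EH6099"),
  stringsAsFactors = FALSE
)

# Keep only the records that hold array data (R lists), not the gif previews.
arrays_only <- meta[meta$rdataclass == "List", ]
rownames(arrays_only)

# Tabulate the array records by tensor rank, parsed from the title.
rank_tag <- sub(".*_([45]dTensor).*", "\\1", arrays_only$title)
table(rank_tag)
```

Filtering on rdataclass separates the array records from the gif previews without relying on file-name conventions, while the title still indicates whether a record is a 4D or 5D array.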

We can narrow the query to the BioImageDbs records of a single dataset, here LM_id0001, as follows.

qr <- query(eh, c("BioImageDbs", "LM_id0001"))
qr
## ExperimentHub with 5 records
## # snapshotDate(): 2021-05-18
## # $dataprovider: CELL TRACKING CHALLENGE (http://celltrackingchallenge.net/2...
## # $species: Homo sapiens
## # $rdataclass: List, magick-image
## # additional mcols(): taxonomyid, genome, description,
## #   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
## #   rdatapath, sourceurl, sourcetype 
## # retrieve records with, e.g., 'object[["EH6099"]]' 
## 
##            title                                                    
##   EH6099 | LM_id0001_DIC_C2DH_HeLa_4dTensor.rds                     
##   EH6100 | LM_id0001_DIC_C2DH_HeLa_4dTensor_train_dataset.gif       
##   EH6101 | LM_id0001_DIC_C2DH_HeLa_4dTensor_Binary.rds              
##   EH6102 | LM_id0001_DIC_C2DH_HeLa_4dTensor_Binary_train_dataset.gif
##   EH6103 | LM_id0001_DIC_C2DH_HeLa_5dTensor.rds
#Import data
#BioImageDbs_image_Dat <- qr[[1]]

3 5D Arrays from the ExperimentHub

The array dimensions follow the channels_last ordering, which is the default in R/Keras. A 5D array has the shape (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels). The batch size equals the number of 3D image sets. The number of channels is 1 for grey images and 3 for RGB images.
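
To make the dimension ordering concrete, here is a minimal base-R sketch that builds a 5D array from toy data. The volumes, their sizes and the variable names are hypothetical stand-ins, not data from ExperimentHub.

```r
# Toy stand-in data: two 3D image sets (volumes), each 4 x 5 x 3 voxels, grey.
# All names and sizes are hypothetical, chosen only for illustration.
volumes <- list(
  array(runif(4 * 5 * 3), dim = c(4, 5, 3)),
  array(runif(4 * 5 * 3), dim = c(4, 5, 3))
)

# Allocate (batch, spatial_dim1, spatial_dim2, spatial_dim3, channels).
x5d <- array(0, dim = c(length(volumes), dim(volumes[[1]]), 1))

# Copy each volume into its batch slot; the single grey channel has index 1.
for (i in seq_along(volumes)) {
  x5d[i, , , , 1] <- volumes[[i]]
}

dim(x5d)  # 2 4 5 3 1
```

The first dimension indexes the 3D image sets, so x5d[2, , , , 1] returns the second volume unchanged.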

4 4D Arrays from the ExperimentHub

The array dimensions follow the channels_last ordering, which is the default in R/Keras. A 4D array has the shape (batch, height, width, channels). The batch size equals the number of 2D images.
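
As an illustration of this layout, the sketch below stacks a few toy grey images into a 4D array and then recovers one of them as a matrix. The images, sizes and variable names are hypothetical and only stand in for real BioImageDbs data.

```r
# Toy stand-in data: three 6 x 8 grey images (hypothetical sizes and names).
images <- lapply(1:3, function(i) matrix(i, nrow = 6, ncol = 8))

# Allocate (batch, height, width, channels) with one channel for grey images.
x4d <- array(0, dim = c(length(images), 6, 8, 1))
for (i in seq_along(images)) {
  x4d[i, , , 1] <- images[[i]]
}

dim(x4d)   # 3 6 8 1

# Recover the second image as a plain height x width matrix.
img2 <- x4d[2, , , 1]
dim(img2)  # 6 8
```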

5 Visualization of gif images from the ExperimentHub

As a test, we also provide gif files of some of the arrays for visualization. The files can be displayed with the magick::image_read function.

qr <- query(eh, c("BioImageDbs", ".gif"))
qr
## ExperimentHub with 10 records
## # snapshotDate(): 2021-05-18
## # $dataprovider: CELL TRACKING CHALLENGE (http://celltrackingchallenge.net/2...
## # $species: Homo sapiens, Mus musculus, Drosophila melanogaster
## # $rdataclass: magick-image
## # additional mcols(): taxonomyid, genome, description,
## #   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
## #   rdatapath, sourceurl, sourcetype 
## # retrieve records with, e.g., 'object[["EH6096"]]' 
## 
##            title                                                            
##   EH6096 | EM_id0001_Brain_CA1_hippocampus_region_5dTensor_train_dataset.gif
##   EH6098 | EM_id0002_Drosophila_brain_region_5dTensor_train_dataset.gif     
##   EH6100 | LM_id0001_DIC_C2DH_HeLa_4dTensor_train_dataset.gif               
##   EH6102 | LM_id0001_DIC_C2DH_HeLa_4dTensor_Binary_train_dataset.gif        
##   EH6105 | LM_id0002_PhC_C2DH_U373_4dTensor_train_dataset.gif               
##   EH6107 | LM_id0002_PhC_C2DH_U373_4dTensor_Binary_train_dataset.gif        
##   EH6110 | LM_id0003_Fluo_N2DH_GOWT1_4dTensor_train_dataset.gif             
##   EH6112 | LM_id0003_Fluo_N2DH_GOWT1_4dTensor_Binary_train_dataset.gif      
##   EH6115 | EM_id0003_J558L_4dTensor_train_dataset.gif                       
##   EH6117 | EM_id0004_PrHudata_4dTensor_train_dataset.gif
#EM_id0001_Brain_CA1_hippocampus_region_5dTensor_train_data
qr[1]
## ExperimentHub with 1 record
## # snapshotDate(): 2021-05-18
## # names(): EH6096
## # package(): BioImageDbs
## # $dataprovider: https://www.epfl.ch/labs/cvlab/data/data-em/
## # $species: Mus musculus
## # $rdataclass: magick-image
## # $rdatadateadded: 2021-05-18
## # $title: EM_id0001_Brain_CA1_hippocampus_region_5dTensor_train_dataset.gif
## # $description: A animation file (.gif) of the train dataset of EM_id0001_Br...
## # $taxonomyid: 10090
## # $genome: NA
## # $sourcetype: PNG
## # $sourceurl: https://github.com/kumeS/BioImageDbs
## # $sourcesize: NA
## # $tags: c("animation", "bioimage", "CellCulture", "electron
## #   microscopy", "microscope", "scanning electron microscopy",
## #   "segmentation", "Tissue") 
## # retrieve record with 'object[["EH6096"]]'
##Display the gif image
#magick::image_read(qr[[1]])

Figure 1: EM_id0001_Brain_CA1_hippocampus_region_5dTensor_train_dataset.gif

Figure 2: EM_id0002_Drosophila_brain_region_5dTensor_train_dataset.gif

Figure 3: EM_id0003_J558L_4dTensor_train_dataset.gif

Figure 4: EM_id0004_PrHudata_4dTensor_train_dataset.gif

Figure 5: EM_id0005_Mouse_Kidney_2D_All_Mito_1024_4dTensor_dataset.gif

Figure 6: EM_id0005_Mouse_Kidney_2D_All_Nuc_1024_4dtensor.Rds

Figure 7: EM_id0006_Rat_Liver_2D_All_Mito_1024_4dTensor_dataset.gif

Figure 8: EM_id0006_Rat_Liver_2D_All_Nuc_1024_4dTensor_dataset.gif

Figure 9: EM_id0007_Mouse_Kidney_MultiScale_All_Low_Glomerulus_1024_4dTensor_dataset.gif

Figure 10: EM_id0007_Mouse_Kidney_MultiScale_All_Middle_Podocyte_1024_4dTensor_dataset.gif

Figure 11: EM_id0008_Human_NB4_2D_All_Cel_512_4dTensor_dataset.gif

Figure 12: EM_id0008_Human_NB4_2D_All_Nuc_1024_4dTensor_dataset.gif

Figure 13: EM_id0009_MurineBMMC_All_512_4dTensor_dataset.gif

Figure 14: EM_id0010_HumanBlast_All_512_4dTensor_dataset.gif

Figure 15: EM_id0011_HumanJurkat_All_512_4dTensor_dataset.gif

Figure 16: LM_id0001_DIC_C2DH_HeLa_4dTensor_train_dataset.gif

Figure 17: LM_id0002_PhC_C2DH_U373_4dTensor_train_dataset.gif

Figure 18: LM_id0003_Fluo_N2DH_GOWT1_4dTensor_train_dataset.gif

6 A simple execution command using Keras/Tensorflow

We select a data array and a label array from the data list and assign them to variables. As an example, these variables are then passed as the x and y arguments of the Keras fit function (the method for <keras.engine.training.Model> objects). The Keras model must be prepared before execution.

## Not Run ##
# qr <- query(eh, c("BioImageDbs"))
# BioImageData <- qr[[1]]
# data <- BioImageData$Train$Train_Original
# labels <- BioImageData$Train$Train_GroundTruth
# dim(data); dim(labels)
# model %>% fit( x = data, y = labels )
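
Before fitting, it can help to check that the image and label arrays agree and to hold out part of the batch for validation. The following is a minimal base-R sketch under stated assumptions: it uses a toy list in place of a real BioImageDbs object (the element names follow the commented example above; sizes and values are hypothetical), and the 20% hold-out fraction is an arbitrary illustrative choice.

```r
# Toy stand-in for one BioImageDbs list object; element names follow the
# commented example above, while sizes and values are hypothetical.
BioImageData <- list(Train = list(
  Train_Original    = array(runif(10 * 8 * 8), dim = c(10, 8, 8, 1)),
  Train_GroundTruth = array(rbinom(10 * 8 * 8, 1, 0.5), dim = c(10, 8, 8, 1))
))

data   <- BioImageData$Train$Train_Original
labels <- BioImageData$Train$Train_GroundTruth

# Images and labels must agree in batch size and spatial dimensions.
stopifnot(identical(dim(data)[1:3], dim(labels)[1:3]))

# Hold out about 20% of the batch for validation (an arbitrary choice).
# drop = FALSE keeps all four dimensions even for a single image.
set.seed(1)
n       <- dim(data)[1]
val_idx <- sample(n, size = max(1, round(0.2 * n)))
x_train <- data[-val_idx, , , , drop = FALSE]
y_train <- labels[-val_idx, , , , drop = FALSE]
x_val   <- data[val_idx, , , , drop = FALSE]
y_val   <- labels[val_idx, , , , drop = FALSE]

dim(x_train); dim(x_val)
```

With a prepared Keras model, these arrays could then be supplied to fit as x = x_train and y = y_train, with validation_data = list(x_val, y_val).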

7 About the imaging dataset and its metadata in BioImageDbs

The datasets in BioImageDbs were built from the following published open data:

  1. For cellular ultra-microstructures, electron microscopy-based imaging data of the mouse B myeloma cell line J558L (e.g. EM_id0003_J558L_4dTensor.Rda) [5], primary human T cells isolated from peripheral blood mononuclear cells (e.g. EM_id0004_PrHudata_4dTensor.Rda) [5], human NB-4 cells (e.g. EM_id0008_Human_NB4_2D_All_Cel_512_4dTensor.Rds) [3], murine bone marrow-derived mast cells (e.g. EM_id0009_MurineBMMC_All_512_4dTensor.Rds) [5], human blasts (e.g. EM_id0010_HumanBlast_All_512_4dTensor.Rds) [5] and the human T-cell line Jurkat (e.g. EM_id0011_HumanJurkat_All_512_4dTensor.Rds) [5] were used.

  2. For bio-tissue ultra-microstructures, electron microscopy-based imaging data of the mouse brain (e.g. EM_id0001_Brain_CA1_hippocampus_region_5dTensor.Rda) [6,7], Drosophila brain (e.g. EM_id0002_Drosophila_brain_region_5dTensor.Rda) [8,9], mouse kidney (e.g. EM_id0005_Mouse_Kidney_2D_All_Nuc_1024_4dtensor.Rds) [10] and rat liver (e.g. EM_id0006_Rat_Liver_2D_All_Mito_1024_4dtensor.Rds) [10] were used.

  3. For cell tracking, light microscopy-based imaging data of human HeLa cells on flat glass (e.g. LM_id0001_DIC_C2DH_HeLa_4dTensor.Rda) [11,12], human glioblastoma-astrocytoma U373 cells on a polyacrylamide substrate (e.g. LM_id0002_PhC_C2DH_U373_4dTensor.Rda) [11,12] and GFP-GOWT1 mouse stem cells (e.g. LM_id0003_Fluo_N2DH_GOWT1_4dTensor.Rda) [13] were used.

The supervised labels are provided as array data with binary or multiple values. Detailed information is given in the metadata file of BioImageDbs. Some of the cell tracking data were obtained from the Cell Tracking Challenge.
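
Because label arrays may be binary or multi-valued, it is worth inspecting the distinct values before training. The sketch below shows one way to do this in base R on a toy label array. The array, its values and the convention that 0 marks background are assumptions made for illustration; the actual encoding of each dataset should be checked against its metadata.

```r
# Toy label array in (batch, height, width, channels) order. Values are
# hypothetical: 0 is assumed to be background, 1 and 2 two object classes.
labels <- array(c(0, 1, 2, 0, 0, 2, 1, 1), dim = c(2, 2, 2, 1))

# List the distinct label values to see whether the mask is binary.
label_values <- sort(unique(as.vector(labels)))
label_values
is_binary <- length(label_values) <= 2

# Collapse a multi-valued mask to binary (object vs. assumed background 0),
# keeping the original array shape.
binary_labels <- array(as.numeric(labels > 0), dim = dim(labels))
table(binary_labels)
```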

Session information

## R version 4.1.1 (2021-08-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.2 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.13-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.13-bioc/R/lib/libRlapack.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] parallel  stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
## [1] ExperimentHub_2.0.0 AnnotationHub_3.0.1 BiocFileCache_2.0.0
## [4] dbplyr_2.1.1        BiocGenerics_0.38.0 BiocStyle_2.20.2   
## 
## loaded via a namespace (and not attached):
##  [1] Biobase_2.52.0                httr_1.4.2                   
##  [3] sass_0.4.0                    bit64_4.0.5                  
##  [5] jsonlite_1.7.2                bslib_0.3.0                  
##  [7] shiny_1.6.0                   assertthat_0.2.1             
##  [9] interactiveDisplayBase_1.30.0 highr_0.9                    
## [11] BiocManager_1.30.16           stats4_4.1.1                 
## [13] blob_1.2.2                    GenomeInfoDbData_1.2.6       
## [15] tiff_0.1-8                    yaml_2.2.1                   
## [17] BiocVersion_3.13.1            pillar_1.6.2                 
## [19] RSQLite_2.2.8                 lattice_0.20-44              
## [21] glue_1.4.2                    digest_0.6.27                
## [23] promises_1.2.0.1              XVector_0.32.0               
## [25] htmltools_0.5.2               httpuv_1.6.3                 
## [27] pkgconfig_2.0.3               magick_2.7.3                 
## [29] bookdown_0.24                 zlibbioc_1.38.0              
## [31] purrr_0.3.4                   xtable_1.8-4                 
## [33] fftwtools_0.9-11              jpeg_0.1-9                   
## [35] later_1.3.0                   tibble_3.1.4                 
## [37] KEGGREST_1.32.0               EBImage_4.34.0               
## [39] generics_0.1.0                IRanges_2.26.0               
## [41] ellipsis_0.3.2                cachem_1.0.6                 
## [43] withr_2.4.2                   magrittr_2.0.1               
## [45] crayon_1.4.1                  mime_0.11                    
## [47] memoise_2.0.0                 evaluate_0.14                
## [49] fansi_0.5.0                   tools_4.1.1                  
## [51] lifecycle_1.0.0               stringr_1.4.0                
## [53] S4Vectors_0.30.0              locfit_1.5-9.4               
## [55] AnnotationDbi_1.54.1          Biostrings_2.60.2            
## [57] compiler_4.1.1                jquerylib_0.1.4              
## [59] GenomeInfoDb_1.28.4           rlang_0.4.11                 
## [61] grid_4.1.1                    RCurl_1.98-1.5               
## [63] htmlwidgets_1.5.4             rappdirs_0.3.3               
## [65] bitops_1.0-7                  rmarkdown_2.11               
## [67] abind_1.4-5                   DBI_1.1.1                    
## [69] curl_4.3.2                    R6_2.5.1                     
## [71] knitr_1.34                    dplyr_1.0.7                  
## [73] fastmap_1.1.0                 bit_4.0.4                    
## [75] utf8_1.2.2                    filelock_1.0.2               
## [77] stringi_1.7.4                 Rcpp_1.0.7                   
## [79] vctrs_0.3.8                   png_0.1-7                    
## [81] tidyselect_1.1.1              xfun_0.26

References

1 Williams, E., Moore, J., Li, S., Rustici, G., Tarkowska, A., Chessel, A., Leo, S., Antal, B., Ferguson, R., Sarkans, U., et al. (2017) Image data resource: A bioimage data integration and publication platform. Nature Methods, Springer Nature 14, 775–781.

2 Kobayashi, N., Kume, S., Lenz, K. and Masuya, H. (2018) RIKEN metadatabase: A database platform for health care and life sciences as a microcosm of linked open data cloud. International Journal on Semantic Web and Information Systems (IJSWIS) 14, 140–164.

3 Kume, S., Masuya, H., Maeda, M., Suga, M., Kataoka, Y. and Kobayashi, N. (2017) Development of semantic web-based imaging database for biological morphome. In Semantic technology (Wang, Z., Turhan, A.-Y., Wang, K., and Zhang, X., eds.), pp 277–285, Springer International Publishing, Cham.

4 Chollet, F., Allaire, J. and others. (2017) R interface to Keras, https://github.com/rstudio/keras; GitHub.

5 Morath, V., Keuper, M., Rodriguez-Franco, M., Deswal, S., Fiala, G., Blumenthal, B., Kaschek, D., Timmer, J., Neuhaus, G., Ehl, S., et al. (2013) Semi-automatic determination of cell surface areas used in systems biology. Frontiers in Bioscience-Elite 5, 533–545.

6 Lucchi, A., Smith, K., Achanta, R., Knott, G. and Fua, P. (2012) Supervoxel-based segmentation of mitochondria in EM image stacks with learned shape features. IEEE Transactions on Medical Imaging 31, 474–486.

7 Lucchi, A., Li, Y. and Fua, P. (2013) Learning for structured prediction using approximate subgradient descent with working sets. In 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp 1987–1994.

8 Cardona, A., Saalfeld, S., Preibisch, S., Schmid, B., Cheng, A., Pulokas, J., Tomancak, P. and Hartenstein, V. (2010) An integrated micro- and macroarchitectural analysis of the Drosophila brain by computer-assisted serial section electron microscopy. PLOS Biology, Public Library of Science 8, 1–17.

9 Arganda-Carreras, I., Turaga, S. C., Berger, D. R., Cireşan, D., Giusti, A., Gambardella, L. M., Schmidhuber, J., Laptev, D., Dwivedi, S., Buhmann, J. M., et al. (2015) Crowdsourcing the creation of image segmentation algorithms for connectomics. Frontiers in Neuroanatomy 9, 142.

10 Kume, S., Masuya, H., Kataoka, Y. and Kobayashi, N. (2016) Development of an ontology for an integrated image analysis platform to enable global sharing of microscopy imaging data. In Proceedings of the 15th International Semantic Web Conference (ISWC2016).

11 Maška, M., Ulman, V., Svoboda, D., Matula, P., Matula, P., Ederra, C., Urbiola, A., España, T., Venkatesan, S., Balak, D. M., et al. (2014) A benchmark for comparison of cell tracking algorithms. Bioinformatics 30, 1609–1617.

12 Ulman, V., Maška, M., Magnusson, K. E. G., Ronneberger, O., Haubold, C., Harder, N., Matula, P., Matula, P., Svoboda, D., Radojevic, M., et al. (2017) An objective comparison of cell-tracking algorithms. Nature Methods 14, 1141–1152.

13 Bártová, E., Šustáčková, G., Stixová, L., Kozubek, S., Legartová, S. and Foltánková, V. (2011) Recruitment of Oct4 protein to UV-damaged chromatin in embryonic stem cells. PLOS ONE, Public Library of Science 6, 1–13.