1 Introduction

Image-based spatial transcriptomics platforms, such as Xenium, typically profile a pre-selected panel of genes. Such data can achieve resolution at the level of individual molecules, preserving both single-cell and subcellular detail. These methods also often capture cell boundaries through segmentation.

The TENxXeniumData package provides a curated collection of Xenium spatial transcriptomics datasets released by 10X Genomics. The datasets are formatted as Bioconductor objects of class SpatialExperiment (SPE) or SpatialFeatureExperiment (SFE). Like SFEData, TENxXeniumData is an ExperimentHub package for spatial data, with a specific emphasis on Xenium.

A notable distinction lies in the constructed data objects, which focus on Xenium data. In addition to the gene expression profile, centroid, and boundary of each cell, we aim to capture the detected molecules/transcripts, which are crucial for gaining insight into subcellular details of specific markers, as well as the imaging data. We have also chosen to employ SpatialExperiment as an alternative scheme for data representation. In this scheme, cellular segmentations are integrated into the per-cell metadata of the constructed object.

2 Installation

To install the TENxXeniumData package from Bioconductor:

if (!require("BiocManager", quietly = TRUE))
  install.packages("BiocManager")

BiocManager::install("TENxXeniumData")

3 Available datasets

The TENxXeniumData package provides an R/Bioconductor resource for Xenium spatially-resolved data by 10X Genomics. The package currently includes two datasets, mouse brain and human pancreas, each available as a SpatialExperiment and as a SpatialFeatureExperiment object.

A list of currently available datasets can be obtained using the ExperimentHub interface:

library(SpatialExperiment)
library(SpatialFeatureExperiment)
library(TENxXeniumData)
library(BumpyMatrix)
library(SummarizedExperiment)
library(ExperimentHub)  # for ExperimentHub() and query()

eh <- ExperimentHub()
(q <- query(eh, "TENxXenium"))
## ExperimentHub with 4 records
## # snapshotDate(): 2024-04-29
## # $dataprovider: NA
## # $species: Mus musculus, Homo sapiens
## # $rdataclass: SpatialFeatureExperiment, SpatialExperiment
## # additional mcols(): taxonomyid, genome, description,
## #   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
## #   rdatapath, sourceurl, sourcetype 
## # retrieve records with, e.g., 'object[["EH8547"]]' 
## 
##            title             
##   EH8547 | spe_mouse_brain   
##   EH8548 | sfe_mouse_brain   
##   EH8549 | spe_human_pancreas
##   EH8550 | sfe_human_pancreas
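
The listing above can also be inspected and used programmatically. The following is a minimal sketch, assuming the four records shown above are still registered under the same accession IDs in your ExperimentHub snapshot. The `$` accessors and `[[` retrieval belong to the standard hub interface; the variable names `meta` and `spe_by_id` are ours and purely illustrative.

```r
library(ExperimentHub)

eh <- ExperimentHub()
q <- query(eh, "TENxXenium")

# Summarise the matching records without downloading any data
meta <- data.frame(
  id      = names(q),
  title   = q$title,
  species = q$species,
  class   = q$rdataclass
)
meta

# Retrieve one record by accession ID (downloaded once, then cached locally).
# "EH8547" is the ID listed for spe_mouse_brain in the output above; it is
# assumed to be unchanged in your snapshot.
spe_by_id <- q[["EH8547"]]
```

Retrieving by ID is equivalent to calling the named accessor function used in the next section, and can be convenient when looping over several records.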

4 Loading the data

The following examples illustrate the process of loading the provided datasets into your R session, representing them as objects of the SpatialExperiment or SpatialFeatureExperiment classes.

Loading SpatialExperiment object:

# load object
spe <- spe_mouse_brain()

# check object
spe
## class: SpatialExperiment 
## dim: 541 36554 
## metadata(0):
## assays(2): counts molecules
## rownames(541): 2010300C02Rik Acsbg1 ... Zfp536 Zfpm2
## rowData names(3): means vars cv2
## colnames(36554): 1 2 ... 36601 36602
## colData names(10): transcript_counts control_probe_counts ... nCounts
##   nGenes
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## spatialCoords names(2) : x_centroid y_centroid
## imgData names(4): sample_id image_id data scaleFactor

# here, cell and nucleus segmentations are stored in per-cell metadata
# (the cellSeg and nucSeg columns)
colData(spe)
## DataFrame with 36554 rows and 10 columns
##       transcript_counts control_probe_counts control_codeword_counts cell_area
##               <numeric>            <numeric>               <numeric> <numeric>
## 1                   384                    0                       0   305.211
## 2                   146                    0                       0   176.606
## 3                    81                    0                       0   263.938
## 4                   314                    0                       0   427.810
## 5                   639                    0                       0   424.604
## ...                 ...                  ...                     ...       ...
## 36598               352                    0                       0   466.961
## 36599               412                    1                       0   576.194
## 36600               161                    0                       0   398.323
## 36601               387                    0                       0   510.762
## 36602               449                    0                       0   565.898
##       nucleus_area                cellSeg                 nucSeg   sample_id
##          <numeric>          <sfc_POLYGON>          <sfc_POLYGON> <character>
## 1         70.71469 list(c(1901.875, 190.. list(c(1903.78747558..    sample01
## 2          6.41219 list(c(1895.5, 1890... list(c(1897.625, 189..    sample01
## 3         32.78344 list(c(2362.36254882.. list(c(2361.9375, 23..    sample01
## 4         68.18594 list(c(1902.51245117.. list(c(1903.15002441..    sample01
## 5        102.95625 list(c(1914.19995117.. list(c(1913.13745117..    sample01
## ...            ...                    ...                    ...         ...
## 36598     67.37313 list(c(3336.88745117.. list(c(3337.94995117..    sample01
## 36599      6.86375 list(c(3371.10009765.. list(c(3369.61254882..    sample01
## 36600     13.23078 list(c(3325.41259765.. list(c(3330.9375, 33..    sample01
## 36601     21.31375 list(c(3321.58740234.. list(c(3322.64990234..    sample01
## 36602     43.44031 list(c(3336.88745117.. list(c(3323.07495117..    sample01
##         nCounts    nGenes
##       <numeric> <integer>
## 1           385        97
## 2           146        64
## 3            81        48
## 4           315        95
## 5           640        98
## ...         ...       ...
## 36598       352        91
## 36599       413        85
## 36600       161        57
## 36601       387        95
## 36602       449        93
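
Since the `cellSeg` and `nucSeg` columns shown above are `sfc_POLYGON` vectors, they can be pulled out of `colData` and passed to functions from the sf package. The sketch below is illustrative rather than a recommended workflow; the variable names are ours, and we make no assumption about the coordinate units in which the polygons are stored.

```r
library(sf)
library(SpatialExperiment)
library(TENxXeniumData)

spe <- spe_mouse_brain()

# Segmentation polygons live in colData as sf geometry columns (one per cell)
cell_seg <- colData(spe)$cellSeg
nuc_seg  <- colData(spe)$nucSeg
class(cell_seg)

# Polygon areas for the first five cells, in squared coordinate units.
# Whether these match the `cell_area` column depends on the coordinate
# units used when the object was built, which we do not assume here.
st_area(cell_seg[1:5])

# Draw the outlines of the first 200 cells, with nuclei overlaid in red
plot(cell_seg[1:200], border = "grey40")
plot(nuc_seg[1:200], border = "red", add = TRUE)

# Per-molecule records are kept in the "molecules" assay, which has the same
# gene-by-cell layout as the "counts" assay
mol <- assay(spe, "molecules")
dim(mol)
```

Keeping the polygons in `colData` means that subsetting the object by cell (for example `spe[, 1:200]`) carries the matching segmentations along with the counts.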

Loading SpatialFeatureExperiment object:

# load object
sfe <- sfe_mouse_brain()

# check object
sfe
## class: SpatialFeatureExperiment 
## dim: 541 36554 
## metadata(0):
## assays(2): counts molecules
## rownames(541): 2010300C02Rik Acsbg1 ... Zfp536 Zfpm2
## rowData names(3): means vars cv2
## colnames(36554): 1 2 ... 36601 36602
## colData names(8): transcript_counts control_probe_counts ... nCounts
##   nGenes
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## spatialCoords names(2) : x_centroid y_centroid
## imgData names(4): sample_id image_id data scaleFactor
## 
## unit:
## Geometries:
## colGeometries: centroids (POINT) 
## annotGeometries: cellSeg (POINT), nucSeg (POINT) 
## 
## Graphs:
## sample01:
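
In the SpatialFeatureExperiment object, the printed summary above suggests that the segmentations are held as geometries rather than as `colData` columns, which would explain why `colData` has 8 columns here instead of 10. A minimal sketch of how these geometries might be accessed, using the geometry names printed above (`centroids`, `cellSeg`); the variable names are ours:

```r
library(SpatialFeatureExperiment)
library(TENxXeniumData)

sfe <- sfe_mouse_brain()

# Names of the geometries attached to the object
colGeometryNames(sfe)
annotGeometryNames(sfe)

# Each geometry is returned as an sf data frame
centroids <- colGeometry(sfe, "centroids")
cell_seg  <- annotGeometry(sfe, "cellSeg")
head(centroids)
head(cell_seg)
```

Because both accessors return ordinary sf data frames, the results can be passed directly to sf functions or plotted with any tool that understands sf objects.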

5 Session information

sessionInfo()
## R version 4.4.0 (2024-04-24)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 22.04.4 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.19-bioc/R/lib/libRblas.so 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/New_York
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] BumpyMatrix_1.12.0             TENxXeniumData_1.0.0          
##  [3] ExperimentHub_2.12.0           AnnotationHub_3.12.0          
##  [5] BiocFileCache_2.12.0           dbplyr_2.5.0                  
##  [7] SpatialFeatureExperiment_1.6.1 SpatialExperiment_1.14.0      
##  [9] SingleCellExperiment_1.26.0    SummarizedExperiment_1.34.0   
## [11] Biobase_2.64.0                 GenomicRanges_1.56.0          
## [13] GenomeInfoDb_1.40.0            IRanges_2.38.0                
## [15] S4Vectors_0.42.0               BiocGenerics_0.50.0           
## [17] MatrixGenerics_1.16.0          matrixStats_1.3.0             
## [19] BiocStyle_2.32.0              
## 
## loaded via a namespace (and not attached):
##   [1] DBI_1.2.2                 bitops_1.0-7             
##   [3] deldir_2.0-4              s2_1.1.6                 
##   [5] rlang_1.1.3               magrittr_2.0.3           
##   [7] RSQLite_2.3.6             e1071_1.7-14             
##   [9] compiler_4.4.0            DelayedMatrixStats_1.26.0
##  [11] png_0.1-8                 sfheaders_0.4.4          
##  [13] fftwtools_0.9-11          vctrs_0.6.5              
##  [15] pkgconfig_2.0.3           wk_0.9.1                 
##  [17] crayon_1.5.2              fastmap_1.2.0            
##  [19] magick_2.8.3              XVector_0.44.0           
##  [21] scuttle_1.14.0            utf8_1.2.4               
##  [23] rmarkdown_2.26            UCSC.utils_1.0.0         
##  [25] purrr_1.0.2               bit_4.0.5                
##  [27] xfun_0.44                 zlibbioc_1.50.0          
##  [29] cachem_1.0.8              beachmat_2.20.0          
##  [31] jsonlite_1.8.8            blob_1.2.4               
##  [33] rhdf5filters_1.16.0       DelayedArray_0.30.1      
##  [35] Rhdf5lib_1.26.0           BiocParallel_1.38.0      
##  [37] jpeg_0.1-10               tiff_0.1-12              
##  [39] terra_1.7-71              parallel_4.4.0           
##  [41] R6_2.5.1                  bslib_0.7.0              
##  [43] limma_3.60.0              boot_1.3-30              
##  [45] jquerylib_0.1.4           Rcpp_1.0.12              
##  [47] bookdown_0.39             knitr_1.46               
##  [49] R.utils_2.12.3            tidyselect_1.2.1         
##  [51] Matrix_1.7-0              abind_1.4-5              
##  [53] yaml_2.3.8                EBImage_4.46.0           
##  [55] codetools_0.2-20          curl_5.2.1               
##  [57] tibble_3.2.1              lattice_0.22-6           
##  [59] withr_3.0.0               KEGGREST_1.44.0          
##  [61] evaluate_0.23             sf_1.0-16                
##  [63] units_0.8-5               spData_2.3.0             
##  [65] proxy_0.4-27              Biostrings_2.72.0        
##  [67] filelock_1.0.3            pillar_1.9.0             
##  [69] BiocManager_1.30.23       KernSmooth_2.23-22       
##  [71] generics_0.1.3            sp_2.1-4                 
##  [73] RCurl_1.98-1.14           BiocVersion_3.19.1       
##  [75] sparseMatrixStats_1.16.0  class_7.3-22             
##  [77] glue_1.7.0                tools_4.4.0              
##  [79] BiocNeighbors_1.22.0      data.table_1.15.4        
##  [81] locfit_1.5-9.9            rhdf5_2.48.0             
##  [83] grid_4.4.0                spdep_1.3-3              
##  [85] AnnotationDbi_1.66.0      DropletUtils_1.24.0      
##  [87] edgeR_4.2.0               GenomeInfoDbData_1.2.12  
##  [89] HDF5Array_1.32.0          cli_3.6.2                
##  [91] rappdirs_0.3.3            fansi_1.0.6              
##  [93] S4Arrays_1.4.0            dplyr_1.1.4              
##  [95] R.methodsS3_1.8.2         zeallot_0.1.0            
##  [97] sass_0.4.9                digest_0.6.35            
##  [99] classInt_0.4-10           SparseArray_1.4.4        
## [101] dqrng_0.4.0               rjson_0.2.21             
## [103] htmlwidgets_1.6.4         memoise_2.0.1            
## [105] htmltools_0.5.8.1         R.oo_1.26.0              
## [107] lifecycle_1.0.4           httr_1.4.7               
## [109] mime_0.12                 statmod_1.5.0            
## [111] bit64_4.0.5