sub_data {spatialHeatmap}          R Documentation

Subset Target Data for Spatial Enrichment

Description

This function subsets the target spatial features (e.g. cells, tissues, organs) and factors (e.g. experimental treatments, time points) for the subsequent spatial enrichment.

Usage

sub_data(
  data,
  feature,
  features = NULL,
  factor,
  factors = NULL,
  com.by = "feature",
  target = NULL
)

Arguments

data

A SummarizedExperiment object. The colData slot must contain at least two columns, one of "features" and one of "factors". The rowData slot can optionally contain a column of gene descriptions, which must be named metadata.

feature

The column name of "features" in the colData slot.

features

A vector of at least two selected features for spatial enrichment, taken from the feature column. The default is NULL, in which case the first two features are selected. If 'all', all features are selected.

factor

The column name of "factors" in the colData slot.

factors

A vector of at least two selected factors for spatial enrichment, taken from the factor column. The default is NULL, in which case the first two factors are selected. If 'all', all factors are selected.

com.by

One of feature, factor, or feature.factor. If feature, pairwise comparisons will be performed between the selected features, and the factors will be treated as replicates. If factor, pairwise comparisons will be performed between the selected factors, and the features will be treated as replicates. If feature.factor, the selected features and factors will be concatenated by __, and pairwise comparisons will be performed between the "feature__factor" entities. The default is feature. The corresponding column will be moved to the first position in the colData slot and recognized in the spatial enrichment process.

target

A single-component vector of the target for spatial enrichment. If com.by='feature', the target must be one of the entries in features. If com.by='factor', the target must be one of the entries in factors. If com.by='feature.factor', the target must be one of the concatenated features and factors. For example, with features=c('brain', 'kidney') and factors=c('control', 'drug'), the target could be one of c('brain__control', 'brain__drug', 'kidney__control', 'kidney__drug'). The default is NULL, in which case the first entry in features is selected, since the default com.by is feature. A target column will be included in the colData slot and will be recognized in spatial enrichment.
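
The relationship between com.by and target can be illustrated without the package. The base R sketch below only mimics the naming scheme described in the arguments above. The annotation table cdat and the helper pairs_of are hypothetical and are not part of spatialHeatmap, and the function itself may build or order the comparisons differently.

```r
# Hypothetical sample annotations standing in for the colData slot.
cdat <- expand.grid(
  organism_part = c("brain", "kidney"),
  treatment = c("control", "drug"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: all unordered pairs of the unique entries in x.
pairs_of <- function(x) t(combn(unique(x), 2))

# com.by = "feature": features are compared, factors act as replicates.
pairs.feature <- pairs_of(cdat$organism_part)

# com.by = "factor": factors are compared, features act as replicates.
pairs.factor <- pairs_of(cdat$treatment)

# com.by = "feature.factor": the concatenated "feature__factor" entities
# are compared.
entity <- paste(cdat$organism_part, cdat$treatment, sep = "__")
pairs.entity <- pairs_of(entity)

# Candidate targets for com.by = "feature.factor".
targets.entity <- sort(unique(entity))
print(targets.entity)
```

In this sketch there is one feature pair, one factor pair, and six pairs of concatenated entities. Any single element of targets.entity would be a valid target when com.by='feature.factor'.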

Value

A subsetted SummarizedExperiment object.
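
As a rough picture of what "subsetted" means here, the following base R sketch keeps only the samples whose feature and factor are both among the selected values. It assumes that this is how the selection is combined. A plain matrix and data frame (assay.mat and cdat, both hypothetical) stand in for the assay and colData slots, so this is not the package's implementation.

```r
# Hypothetical stand-ins for the colData slot and the assay matrix.
cdat <- data.frame(
  organism_part = c("brain", "brain", "heart", "kidney", "kidney", "liver"),
  age = c("day10", "day12", "day10", "day10", "day14", "day12"),
  stringsAsFactors = FALSE
)
assay.mat <- matrix(
  seq_len(2 * nrow(cdat)), nrow = 2,
  dimnames = list(c("gene1", "gene2"), paste0("sample", seq_len(nrow(cdat))))
)

features <- c("brain", "kidney")
factors <- c("day10", "day12")

# Keep samples whose feature AND factor are both among the selected values.
keep <- cdat$organism_part %in% features & cdat$age %in% factors
cdat.sub <- cdat[keep, ]
assay.sub <- assay.mat[, keep, drop = FALSE]

print(colnames(assay.sub))
```

Rows (genes) are left untouched in this sketch; only the samples (columns) and their annotations are reduced.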

Author(s)

Jianhai Zhang jzhan067@ucr.edu; zhang.jianhai@hotmail.com
Dr. Thomas Girke thomas.girke@ucr.edu

References

Cardoso-Moreira, Margarida, Jean Halbert, Delphine Valloton, Britta Velten, Chunyan Chen, Yi Shao, Angélica Liechti, et al. 2019. “Gene Expression Across Mammalian Organ Development.” Nature 571 (7766): 505–9.
Keays, Maria. 2019. ExpressionAtlas: Download Datasets from EMBL-EBI Expression Atlas.
Morgan, Martin, Valerie Obenchain, Jim Hester, and Hervé Pagès. 2018. SummarizedExperiment: SummarizedExperiment Container. R package version 1.10.1.

Examples

## In the following examples, the toy data come from an RNA-seq analysis of the development of
## 7 chicken organs at 9 time points (Cardoso-Moreira et al. 2019). For convenience, they are
## included in this package. The complete raw count data are downloaded using the R package
## ExpressionAtlas (Keays 2019) with the accession number "E-MTAB-6769".

## Set up toy data.

# Access toy data. 
cnt.chk <- system.file('extdata/shinyApp/example/count_chicken.txt', package='spatialHeatmap')
count.chk <- read.table(cnt.chk, header=TRUE, row.names=1, sep='\t')
count.chk[1:3, 1:5]

# A targets file describing samples and conditions is required for toy data. It should be made
# based on the experiment design, which is accessible through the accession number 
# "E-MTAB-6769" in the R package ExpressionAtlas. An example targets file is included in this
# package and accessed below. 
# Access the example targets file. 
tar.chk <- system.file('extdata/shinyApp/example/target_chicken.txt', package='spatialHeatmap')
target.chk <- read.table(tar.chk, header=TRUE, row.names=1, sep='\t')
# Every column in the toy data corresponds to a row in the targets file.
target.chk[1:5, ]
# Store toy data in "SummarizedExperiment".
library(SummarizedExperiment)
# Attach spatialHeatmap so that filter_data() and sub_data() are available below.
library(spatialHeatmap)
se.chk <- SummarizedExperiment(assay=count.chk, colData=target.chk)
# The "rowData" slot can store a data frame of gene metadata, but this is not required.
# Only the column named "metadata" will be recognized.
# Pseudo row metadata.
metadata <- paste0('meta', seq_len(nrow(count.chk)))
metadata[1:3]
rowData(se.chk) <- DataFrame(metadata=metadata)

## By convention, raw sequencing count data should be normalized and filtered to
## reduce noise. Since normalization will be performed in spatial enrichment, only filtering
## is required before subsetting the data.

# Filter out genes with low counts and low variance. Genes with counts over 5 in
# at least 10% of samples (pOA) and a coefficient of variation (CV) between 3.5 and 100
# are retained.
se.fil.chk <- filter_data(data=se.chk, sam.factor='organism_part', con.factor='age',
  pOA=c(0.1, 5), CV=c(3.5, 100), dir=NULL)
# Subset the data.
data.sub <- sub_data(data=se.fil.chk, feature='organism_part',
  features=c('brain', 'heart', 'kidney'), factor='age', factors=c('day10', 'day12'),
  com.by='feature', target='brain')
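
The filtering criterion used in the example above (pOA=c(0.1, 5), CV=c(3.5, 100)) can be pictured with a small base R sketch. It assumes that CV is computed per gene as standard deviation divided by mean and that the bounds are inclusive; how filter_data implements this internally is not shown here. The count matrix is made up for illustration.

```r
# Made-up count matrix: rows are genes, columns are 20 samples.
counts <- rbind(
  geneA = c(rep(0, 17), 6, 7, 1000),  # expressed in few samples, highly variable
  geneB = rep(c(9, 10, 11, 10), 5),   # expressed everywhere, nearly constant
  geneC = rep(c(0, 1, 2, 1), 5)       # never above the count threshold
)

p <- 0.1         # minimum proportion of samples (first value of pOA)
A <- 5           # count threshold (second value of pOA)
cv.min <- 3.5    # lower CV bound
cv.max <- 100    # upper CV bound

# Proportion of samples in which each gene has counts over A.
prop.over <- rowMeans(counts > A)

# Assumed definition of the coefficient of variation: sd / mean per gene.
cv <- apply(counts, 1, sd) / rowMeans(counts)

# A gene is retained only if it passes both criteria.
keep <- prop.over >= p & cv >= cv.min & cv <= cv.max
print(keep)
```

Only geneA passes both criteria in this sketch: geneB is expressed in all samples but varies too little, and geneC never exceeds the count threshold.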

[Package spatialHeatmap version 2.0.0 Index]