GenomicOZone {GenomicOZone}    R Documentation

Delineate outstanding genomic zones

Description

Delineate outstanding genomic zones along chromosomes, such that genes within an outstanding zone have consistent activity patterns that differ across samples.

Usage

  GenomicOZone(GOZ.ds)

Arguments

GOZ.ds

an object created by function GOZDataSet.

Details

This is the main function of the package. It integrates genome annotation, preprocessing of the gene activity matrix, chromosome clustering, and differential zone analysis.

Genome annotation can either be supplied by the user or retrieved with the R package biomaRt (Smedley et al. 2015), which accesses Ensembl annotation databases (Zerbino et al. 2017).

The function calls the weighted univariate clustering method (Wang and Song 2011) implemented in the Ckmeans.1d.dp package. If ks is specified, a fixed number of zones is delineated on each chromosome. If ks is NULL, the optimal number of clusters on each chromosome is determined by the Bayesian information criterion.
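
The two clustering modes described above can be sketched by calling Ckmeans.1d.dp directly. This is a minimal illustration, not the package's internal code: the gene positions and weights are made up, and how GenomicOZone derives its weights is not shown here.

```r
# Illustrative only: hypothetical gene positions and weights on one chromosome.
library(Ckmeans.1d.dp)

pos <- c(1, 2, 3, 4, 50, 51, 52, 53)   # gene coordinates (made up)
w   <- c(1, 2, 1, 2, 1, 2, 1, 2)       # per-gene weights (made up)

# Fixed number of zones, analogous to supplying ks for a chromosome
fit.fixed <- Ckmeans.1d.dp(pos, k = 2, y = w)
fit.fixed$cluster                      # zone label of each gene

# Range of candidate k; the number of clusters is selected by BIC,
# analogous to leaving ks as NULL
fit.bic <- Ckmeans.1d.dp(pos, k = c(1, 4), y = w)
length(fit.bic$size)                   # number of clusters selected
```

Passing a single value for k fixes the number of clusters, while passing a range lets the method choose within that range.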

The function also conducts differential zone analysis by using one-way ANOVA (Chambers et al. 1992) based on gene ranks. Given p-value cutoff alpha and effect size threshold min.effect.size, outstanding genomic zones will be selected.
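
The idea of a rank-based one-way ANOVA for a single zone can be sketched in base R as follows. This is an assumption-laden illustration rather than the package's implementation: the activity values are simulated, ranking each gene across samples is one plausible choice, eta-squared stands in for the effect size, and the cutoff values are placeholders.

```r
# Illustrative only: simulated activity of 5 genes in one zone across 6 samples.
set.seed(1)
condition <- factor(rep(c("Cancer", "Normal"), each = 3))
expr <- matrix(c(rnorm(15, mean = 10), rnorm(15, mean = 5)), nrow = 5,
               dimnames = list(paste0("Gene", 1:5), paste0("Sample", 1:6)))

# Assumption: rank each gene's activity across the samples
ranks <- t(apply(expr, 1, rank))

# Long format: one row per (gene, sample) pair, labeled by sample condition
df <- data.frame(rank = as.vector(ranks),
                 condition = condition[as.vector(col(ranks))])

# One-way ANOVA of ranks against condition
tab <- summary(aov(rank ~ condition, data = df))[[1]]
p.value <- tab[["Pr(>F)"]][1]

# Assumption: eta-squared (between-group SS / total SS) as the effect size
eta.sq <- tab[["Sum Sq"]][1] / sum(tab[["Sum Sq"]])

# Placeholder thresholds playing the roles of alpha and min.effect.size
is.outstanding <- p.value <= 0.05 && eta.sq >= 0.8
```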

Advanced differential zone analysis such as generalized linear modeling can be performed on the zone activity matrix using third-party software. The zone activity matrix can be generated by auxiliary functions.
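
As a rough sketch of such a follow-up analysis, a generalized linear model can be fitted per zone with base R. The zone activity matrix below is simulated and its layout (zones in rows, samples in columns) is an assumption; the Gamma family with a log link is only one possible modeling choice.

```r
# Illustrative only: simulated zone activity matrix (3 zones x 8 samples).
set.seed(2)
condition <- factor(rep(c("Cancer", "Normal"), each = 4))
zone.mat <- matrix(rgamma(24, shape = 5, rate = 1), nrow = 3,
                   dimnames = list(paste0("Zone", 1:3), paste0("Sample", 1:8)))

# Fit one GLM per zone and extract the p-value of the condition effect
p.values <- apply(zone.mat, 1, function(activity) {
  fit <- glm(activity ~ condition, family = Gamma(link = "log"))
  coef(summary(fit))["conditionNormal", "Pr(>|t|)"]
})

# Adjust for testing multiple zones (Benjamini-Hochberg)
p.adj <- p.adjust(p.values, method = "BH")
```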

Value

the input object, with intermediate and final results attached. The results can be accessed with the functions documented in extract_outputs and visualized with the functions documented in generate_plots.

References

Chambers JM, Hastie TJ, others (1992). “Statistical models in S.” In volume 251, chapter 5. Wadsworth & Brooks/Cole Advanced Books & Software Pacific Grove, CA.

Smedley D, Haider S, Durinck S, Pandini L, Provero P, Allen J, Arnaiz O, Awedh MH, Baldock R, Barbiera G, others (2015). “The BioMart community portal: an innovative alternative to large, centralized data repositories.” Nucleic acids research, 43(W1), W589–W598.

Wang H, Song M (2011). “Ckmeans.1d.dp: optimal k-means clustering in one dimension by dynamic programming.” The R journal, 3(2), 29.

Zerbino DR, Achuthan P, Akanni W, Amode MR, Barrell D, Bhai J, Billis K, Cummins C, Gall A, Girón CG, others (2017). “Ensembl 2018.” Nucleic acids research, 46(D1), D754–D761.

See Also

See GOZDataSet for how to create the input object. See extract_outputs and generate_plots for how to access the results and generate visualizations.

Examples

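  # Attach the packages; GRanges(), Rle() and IRanges() are available
  # once GenomicRanges is attached
  library(GenomicOZone)
  library(GenomicRanges)

  # Create an example of GOZ.ds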
  data <- matrix(c(1,5,2,6,5,1,6,2), ncol = 2, byrow = TRUE)
  rownames(data) <- paste("Gene", 1:4, sep='')
  colnames(data) <- paste("Sample", c(1:2), sep='')

  colData <- data.frame(Sample_name = paste("Sample", c(1:2), sep=''),
                        Condition = c("Cancer", "Normal"))

  design <- ~ Condition

  rowData.GRanges <- GRanges(seqnames = Rle(rep("chr1", 4)),
                             ranges = IRanges(start = c(1,2,3,4), end = c(5,6,7,8)))
  names(rowData.GRanges) <- paste("Gene", 1:4, sep='')

  ks <- c(2)
  names(ks) <- "chr1"

  GOZ.ds <- GOZDataSet(data, colData, design,
                       rowData.GRanges = rowData.GRanges,
                       ks = ks)
  ####

  # Run the zoning process
  GOZ.ds <- GenomicOZone(GOZ.ds)
  ####

[Package GenomicOZone version 1.0.0 Index]