dittoScatterPlot {dittoSeq}    R Documentation

Show RNAseq data overlaid on a scatter plot

Description

Show RNAseq data overlaid on a scatter plot

Usage

dittoScatterPlot(
  object,
  x.var,
  y.var,
  color.var = NULL,
  shape.by = NULL,
  split.by = NULL,
  extra.vars = NULL,
  cells.use = NULL,
  show.others = FALSE,
  size = 1,
  opacity = 1,
  color.panel = dittoColors(),
  colors = seq_along(color.panel),
  split.nrow = NULL,
  split.ncol = NULL,
  assay.x = .default_assay(object),
  slot.x = .default_slot(object),
  adjustment.x = NULL,
  assay.y = .default_assay(object),
  slot.y = .default_slot(object),
  adjustment.y = NULL,
  assay.color = .default_assay(object),
  slot.color = .default_slot(object),
  adjustment.color = NULL,
  assay.extra = .default_assay(object),
  slot.extra = .default_slot(object),
  adjustment.extra = NULL,
  do.hover = FALSE,
  hover.data = NULL,
  hover.assay = .default_assay(object),
  hover.slot = .default_slot(object),
  hover.adjustment = NULL,
  shape.panel = c(16, 15, 17, 23, 25, 8),
  rename.color.groups = NULL,
  rename.shape.groups = NULL,
  min.color = "#F0E442",
  max.color = "#0072B2",
  min = NULL,
  max = NULL,
  xlab = x.var,
  ylab = y.var,
  main = "make",
  sub = NULL,
  theme = theme_bw(),
  legend.show = TRUE,
  legend.color.title = color.var,
  legend.color.size = 5,
  legend.color.breaks = waiver(),
  legend.color.breaks.labels = waiver(),
  legend.shape.title = shape.by,
  legend.shape.size = 5,
  data.out = FALSE
)

Arguments

object

A Seurat or SingleCellExperiment object

x.var, y.var

Single string giving a gene or metadata that will be used for the x- and y-axis of the scatterplot. Note: must be continuous.

Alternatively, can be a directly supplied numeric vector of length equal to the total number of cells/samples in object.

color.var

Single string giving a gene or metadata that will set the color of cells/samples in the plot.

Alternatively, can be a directly supplied numeric or string vector, or a factor, of length equal to the total number of cells/samples in object.

shape.by

Single string naming a discrete metadata that will set the shape of cells/samples in the plot.

Alternatively, can be a directly supplied string vector or a factor of length equal to the total number of cells/samples in object.

split.by

1 or 2 strings naming discrete metadata to use for splitting the cells/samples into multiple plots with ggplot faceting.

When 2 metadatas are named, c(row,col), the first is used as rows and the second is used for columns of the resulting grid.

When 1 metadata is named, the shape of the facet grid can be controlled with split.nrow and split.ncol.

extra.vars

String vector providing names of any extra metadata to be stashed in the dataframe supplied to ggplot(data).

Useful for making custom alterations after dittoSeq plot generation.

cells.use

String vector of cells'/samples' names which should be included.

Alternatively, a logical vector, the same length as the number of cells in the object, which sets which cells to include (for the typically easier logical method, provide the USE part of object@cell.names[USE] or colnames(object)[USE]).
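
The two forms of cells.use describe the same subset. The base R sketch below illustrates this; the names all_cells and groups are hypothetical stand-ins for colnames(object) and a metadata column, not part of dittoSeq.

```r
# Hypothetical cell/sample names, standing in for colnames(object)
all_cells <- paste0("sample", 1:6)
# Hypothetical discrete metadata, one value per cell/sample
groups <- c("A", "B", "A", "C", "B", "A")

# Logical method: one TRUE/FALSE per cell/sample, in object order
use_logical <- groups == "A"

# Name method: the names of the cells/samples to keep
use_names <- all_cells[use_logical]

# Either vector could be passed as cells.use; both pick the same cells
same_subset <- identical(all_cells[all_cells %in% use_names],
                         all_cells[use_logical])
print(use_names)
print(same_subset)
```

The logical form is usually shorter to write because it can be built directly from a metadata comparison.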

show.others

Logical. Whether other cells (those not selected by cells.use) should be shown in the background in light gray. FALSE by default for this function.

size

Number which sets the size of data points. Default = 1.

opacity

Number between 0 and 1. Great for when you have MANY overlapping points, this sets how solid the points should be: 1 = not see-through at all, 0 = invisible. Default = 1. (Equivalent to alpha in typical ggplot terms.)

color.panel

String vector which sets the colors to draw from. dittoColors() by default, see dittoColors for contents.

colors

Integer vector giving the indexes, in order, of the colors from color.panel to actually use.
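
As a rough illustration of how color.panel and colors combine, the sketch below assumes the colors actually used are obtained by plain indexing, color.panel[colors]. The panel here is a hypothetical five-color vector, not the full output of dittoColors().

```r
# Hypothetical panel of hex colors (an assumption; not the dittoColors() output)
color.panel <- c("#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2")

# Default: use every color of the panel, in its given order
colors_default <- seq_along(color.panel)

# Custom: use only the 3rd, 1st, and 5th colors, in that order
colors_custom <- c(3, 1, 5)

# Assumed selection step: index the panel by 'colors'
used_default <- color.panel[colors_default]
used_custom <- color.panel[colors_custom]
print(used_custom)
```

Under that assumption, reordering colors changes which group receives which color without touching the panel itself.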

split.nrow, split.ncol

Integers which set the dimensions of faceting/splitting when a single metadata is given to split.by.

assay.x, assay.y, assay.color, assay.extra, slot.x, slot.y, slot.color, slot.extra, adjustment.x, adjustment.y, adjustment.color, adjustment.extra

assay, slot, and adjustment set which data to use when the axes, coloring, or extra.vars are based on expression/counts data. See gene for additional information.

do.hover

Logical which controls whether the plot will be converted to a plotly object so that data about individual points is displayed when you hover your cursor over them. The hover.data argument determines which data is shown.

hover.data

String vector of gene and metadata names, example: c("meta1","gene1","meta2","gene2") which determines what data to show on hover when do.hover is set to TRUE.

hover.assay, hover.slot, hover.adjustment

Similar to the x, y, color, and extra versions, when showing expression data upon hover, these set what data will be shown.

shape.panel

Vector of integers corresponding to ggplot shapes which sets what shapes to use. When discrete groupings are supplied by shape.by, this sets the panel of shapes. When nothing is supplied to shape.by, only the first value is used. Default is a set of 6, c(16,15,17,23,25,8), the first being a simple, solid, circle.

Note: Unfortunately, shapes can be hard to see when points are on top of each other & they are more slowly processed by the brain. For these reasons, even as a color blind person myself writing this code, I recommend use of colors for variables with many discrete values.

rename.color.groups, rename.shape.groups

String vector containing new names for the identities of the color or shape overlay groups.

min.color

Color for the lowest values of color.var / min. Default = yellow.

max.color

Color for the highest values of color.var / max. Default = blue.

min, max

Numbers which set the values associated with the minimum and maximum colors.

xlab, ylab

Strings which set the labels for the axes. To remove, set to NULL.

main

String, sets the plot title. A default title is automatically generated, based on color.var and shape.by, when either is provided. To remove, set to NULL.

sub

String, sets the plot subtitle.

theme

A ggplot theme which will be applied before dittoSeq adjustments. Default = theme_bw(). See https://ggplot2.tidyverse.org/reference/ggtheme.html for other options and ideas.

legend.show

Logical. Whether any legend should be displayed. Default = TRUE.

legend.color.title, legend.shape.title

Strings which set the title for the color or shape legends.

legend.color.size, legend.shape.size

Numbers representing the size at which shapes should be plotted in the color and shape legends (for discrete variable plotting). Default = 5. Note: enlarging the icons in the colors legend is incredibly helpful for making colors more distinguishable by color blind individuals.

legend.color.breaks

Numeric vector which sets the discrete values to show in the color-scale legend for continuous data.

legend.color.breaks.labels

String vector, with the same length as legend.color.breaks, which renames what's displayed next to the tick marks of the color-scale.

data.out

Logical. When set to TRUE, changes the output, from the plot alone, to a list containing the plot ("p"), a data.frame containing the underlying data for target cells ("Target_data"), and a data.frame containing the underlying data for non-target cells ("Others_data").

Note: do.hover plotly conversion is turned off in this setting, but hover.data is still calculated.

Details

This function creates a dataframe with X, Y, color, shape, and faceting data determined by x.var, y.var, color.var, shape.by, and split.by. Any extra gene or metadata requested with extra.vars is added as well. For expression/counts data, the assay, slot, and adjustment inputs (.x, .y, .color, and .extra) can be used to change which data is used, and whether it should be adjusted in some way.

Next, if a set of cells or samples to use is indicated with the cells.use input, then the dataframe is split into Target_data and Others_data based on subsetting by the target cells/samples.

Finally, a scatter plot is created using these dataframes. Non-target cells are colored in gray if show.others=TRUE, and target cell data is displayed on top, colored and shaped based on the color.var- and shape.by-associated data. If split.by was used, the plot will be split into a matrix of panels based on the associated groupings.
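
The target/non-target split described above can be sketched in base R. This is only an illustration under stated assumptions, not dittoSeq's internal code: all_data, its columns, and the sample names are hypothetical.

```r
# Hypothetical per-cell/sample data frame; rows are named by cell/sample
set.seed(1)
all_data <- data.frame(
    X = runif(6),
    Y = runif(6),
    color = c("A", "B", "A", "C", "B", "A"),
    row.names = paste0("sample", 1:6))

# Cells/samples to target, given by name (a logical vector would work too)
cells.use <- c("sample1", "sample3", "sample6")

# Split the rows into targets and everything else
is_target <- rownames(all_data) %in% cells.use
Target_data <- all_data[is_target, ]
Others_data <- all_data[!is_target, ]

print(rownames(Target_data))
print(rownames(Others_data))
```

In the real function, Others_data is only drawn (in gray, underneath the targets) when show.others = TRUE.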

Value

a ggplot scatterplot where colored dots and/or shapes represent individual cells/samples. X and Y axes can be gene expression, numeric metadata, or manually supplied values.

Alternatively, if data.out=TRUE, a list containing three slots is output: the plot (named 'p'), a data.frame containing the underlying data for target cells (named 'Target_data'), and a data.frame containing the underlying data for non-target cells (named 'Others_data').

Alternatively, if do.hover is set to TRUE, the plot is converted from ggplot to plotly, and cell/sample information, determined by the hover.data input, is retrieved, added to the dataframe, and displayed upon hovering the cursor over the plot.

Many characteristics of the plot can be adjusted using discrete inputs.

Author(s)

Daniel Bunis

See Also

getGenes and getMetas to see what the x.var, y.var, color.var, shape.by, and hover.data options are.

dittoDimPlot for making very similar data representations, but where dimensionality reduction (PCA, t-SNE, UMAP, etc.) dimensions are the scatterplot axes.

Examples

# dittoSeq handles bulk and single-cell data quite similarly.
# The SingleCellExperiment object structure is used for both,
# but all functions can be used similarly directly on Seurat
# objects as well.

example(importDittoBulk, echo = FALSE)
myRNA

# Mock up some nCount_RNA and nFeature_RNA metadata
#  == the default way to extract
myRNA$nCount_RNA <- runif(60,200,1000)
myRNA$nFeature_RNA <- myRNA$nCount_RNA*runif(60,0.95,1.05)
# and also percent.mito metadata
myRNA$percent.mito <- sample(c(runif(50,0,0.05),runif(10,0.05,0.2)))

dittoScatterPlot(
    myRNA, x.var = "nCount_RNA", y.var = "nFeature_RNA")

# Shapes or colors can be overlaid representing discrete metadata
#   or (only colors) continuous metadata / expression data by providing
#   metadata or gene names to 'color.var' and 'shape.by'
dittoScatterPlot(
    myRNA, x.var = "nCount_RNA", y.var = "nFeature_RNA",
    color.var = "percent.mito")
dittoScatterPlot(
    myRNA, x.var = "gene1", y.var = "gene2",
    color.var = "groups",
    shape.by = "SNP",
    size = 3)
dittoScatterPlot(
    myRNA, x.var = "gene1", y.var = "gene2",
    color.var = "gene3")

# Data can be "split" or faceted by a discrete variable as well.
dittoScatterPlot(
    myRNA, x.var = "gene1",
    y.var = "gene2",
    split.by = "timepoint") # single split.by element
dittoScatterPlot(
    myRNA, x.var = "gene1",
    y.var = "gene2",
    split.by = c("groups","SNP")) # row and col split.by elements
# OR with 'extra.vars' plus manually faceting for added control
dittoScatterPlot(
    myRNA, x.var = "gene1", y.var = "gene2",
    extra.vars = c("SNP")) +
    facet_wrap("SNP", ncol = 1, strip.position = "left")

# Note: scatterplots like this can be very useful for dataset QC, especially
#   with percentage of reads coming from genes as the color overlay.

[Package dittoSeq version 1.0.2 Index]