dittoPlotVarsAcrossGroups {dittoSeq}    R Documentation

Generates a dittoPlot where data points are per-group summaries of genes/metadata instead of individual values per cell/sample.

Description

Generates a dittoPlot where data points are per-group summaries of genes/metadata instead of individual values per cell/sample.

Usage

dittoPlotVarsAcrossGroups(
  object,
  vars,
  group.by,
  color.by = group.by,
  summary.fxn = mean,
  cells.use = NULL,
  plots = c("vlnplot", "jitter"),
  assay = .default_assay(object),
  slot = .default_slot(object),
  adjustment = "z-score",
  do.hover = FALSE,
  main = NULL,
  sub = NULL,
  ylab = "make",
  y.breaks = NULL,
  min = NULL,
  max = NULL,
  xlab = group.by,
  x.labels = NULL,
  x.labels.rotate = NA,
  x.reorder = NULL,
  color.panel = dittoColors(),
  colors = c(seq_along(color.panel)),
  theme = theme_classic(),
  jitter.size = 1,
  jitter.width = 0.2,
  jitter.color = "black",
  boxplot.width = 0.2,
  boxplot.color = "black",
  boxplot.show.outliers = NA,
  boxplot.fill = TRUE,
  vlnplot.lineweight = 1,
  vlnplot.width = 1,
  vlnplot.scaling = "area",
  ridgeplot.lineweight = 1,
  ridgeplot.scale = 1.25,
  add.line = NULL,
  line.linetype = "dashed",
  line.color = "black",
  legend.show = TRUE,
  legend.title = NULL,
  data.out = FALSE
)

Arguments

object

A Seurat or SingleCellExperiment object

vars

String vector (example: c("gene1","gene2","gene3")) which selects which variables, typically genes, to extract from the object, summarize across groups, and add to the plot

group.by

String representing the name of a metadata to use for separating the cells/samples into discrete groups.

color.by

String representing the name of a metadata to use for setting fills. Great for highlighting subgroups when wanted, but it defaults to group.by so this input can be skipped otherwise. Affects boxplot, vlnplot, and ridgeplot fills.

summary.fxn

A function which sets how variables' data will be summarized across the groups. Default is mean, which will take the average value, but any function can be used as long as it takes in a numeric vector and returns a single numeric value. Alternative examples: median, max, function (x) sum(x!=0)/length(x).
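
As a rough illustration of what qualifies as a summary.fxn, the base R sketch below applies a few candidates to a made-up numeric vector. The vector x and the helper name pct_nonzero are assumptions for illustration and are not part of dittoSeq. Each candidate reduces the vector to a single numeric value, which is the only requirement stated above.

```r
# Hypothetical expression values for one gene within one group
x <- c(0, 0, 3, 5, 0, 8)

# Built-in functions that reduce a numeric vector to a single value
mean(x)
median(x)
max(x)

# A custom summary (illustrative name): fraction of non-zero values
pct_nonzero <- function(x) sum(x != 0) / length(x)
pct_nonzero(x)
```

Any of these could then be passed as summary.fxn, for example summary.fxn = pct_nonzero.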

cells.use

String vector of cells'/samples' names which should be included. Alternatively, a Logical vector, the same length as the number of cells in the object, which sets which cells to include. For the typically easier logical method, provide USE in colnames(object)[USE] OR object@cell.names[USE].
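
The two forms of cells.use can be sketched in base R as follows. The vectors cell_names and timepoint are hypothetical stand-ins for an object's cell/sample names and one of its metadata columns; they are not pulled from a real object here.

```r
# Hypothetical stand-ins for an object's cell/sample names and a metadata
cell_names <- paste0("sample", 1:6)
timepoint <- c("d0", "d0", "d3", "d3", "d6", "d6")

# Logical method: one TRUE/FALSE per cell/sample, same length as cell_names
use_logical <- timepoint != "d6"

# Name method: the equivalent string vector of cell/sample names
use_names <- cell_names[use_logical]
use_names
```

Either use_logical or use_names would describe the same subset when given to cells.use.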

plots

String vector which sets the types of plots to include: possibilities = "jitter", "boxplot", "vlnplot", "ridgeplot". Order matters: c("vlnplot", "boxplot", "jitter") will put a violin plot in the back, boxplot in the middle, and then individual dots in the front. See details section for more info.

assay, slot

Single strings or integers that set which data to use when plotting expression data. See gene for more information about how defaults for these are filled in when not provided.

adjustment

When plotting gene expression (or antibody, or other forms of counts data), should that data be used directly or should it be adjusted to be

  • "z-score": DEFAULT, scaled with the scale() function to produce a relative-to-mean z-score representation

  • NULL: no adjustment, the normal method for all other ditto expression plotting

  • "relative.to.max": divided by the maximum expression value to give percent of max values between [0,1]
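
The three adjustment options can be sketched in base R for a single variable. This is a minimal illustration under the assumption that the adjustment is applied per variable across cells/samples; the vector expr is made up, and dittoSeq's internal code may differ in detail.

```r
# Hypothetical expression values for one gene across six samples
expr <- c(2, 4, 4, 6, 8, 12)

# "z-score": center on the mean and divide by the standard deviation
z <- as.numeric(scale(expr))

# "relative.to.max": divide by the maximum so values fall within [0, 1]
rel <- expr / max(expr)

# NULL: values are used as they are
raw <- expr
```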

do.hover

Logical. Default = FALSE. If set to TRUE the object will be converted to a ggplotly object so that data about individual points will be displayed when you hover your cursor over them. The hover data works best for jitter data representations, so it is recommended to have "jitter" as the last value of the plots input when running using hover.

Note: Currently, incompatible with RidgePlots as plotly does not support the geom.

main

String which sets the plot title.

sub

String which sets the plot subtitle.

ylab

String which sets the y axis label. Default = a combination of the name of the summary function + adjustment + "expression". Set to NULL to remove.

y.breaks

Numeric vector, a set of breaks that should be used as major gridlines. c(break1,break2,break3,etc.).

min, max

Scalars which control the zoom of the plot. These inputs set the minimum / maximum values of the data to show. Default = set based on the limits of the data in vars.

xlab

String which sets the grouping-axis label (=x-axis for box and violin plots, y-axis for ridgeplots). Default is group.by so it defaults to the name of the grouping information. Set to NULL to remove.

x.labels

String vector, c("label1","label2","label3",...) which overrides the names of the samples/groups. NOTE: you need to give at least as many labels as there are discrete values in the group.by data.

x.labels.rotate

Logical which sets whether the labels should be rotated. Default: TRUE for violin and box plots, but FALSE for ridgeplots.

x.reorder

Integer vector. A sequence of numbers, from 1 to the number of groupings, for rearranging the order of x-axis groupings.

Method: Make a first plot without this input. Then, treating the leftmost grouping as index 1 and the rightmost as index n, set x.reorder to these indices in the order that you would like the groupings to appear.
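
The index semantics of x.reorder can be sketched with a base R factor. This only illustrates how the indices map to a new order; the groups vector is hypothetical, and the sketch is not a claim about how dittoSeq reorders groupings internally.

```r
# Hypothetical groupings; the factor levels give the initial left-to-right order
groups <- factor(c("d0", "d3", "d6", "d9", "d0", "d9"))
levels(groups)

# Desired order: d9, d0, d3, d6, i.e. original indices 4, 1, 2, 3
x.reorder <- c(4, 1, 2, 3)
reordered <- factor(groups, levels = levels(groups)[x.reorder])
levels(reordered)
```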

color.panel

String vector which sets the colors to draw from for plot fills. Default = dittoColors().

colors

Integer vector, the indexes / order, of colors from color.panel to actually use. (Provides an alternative to directly modifying color.panel.)
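
How colors indexes into color.panel can be sketched in base R. The palette below is a hypothetical stand-in for the output of dittoColors(), used only so the sketch runs without dittoSeq loaded.

```r
# Hypothetical palette standing in for dittoColors()
color.panel <- c("orange", "skyblue", "green4", "yellow", "blue")

# Use the 3rd color first, then the 1st, then the 2nd
colors <- c(3, 1, 2)
color.panel[colors]
```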

theme

A ggplot theme which will be applied before dittoSeq adjustments. Default = theme_classic(). See https://ggplot2.tidyverse.org/reference/ggtheme.html for other options and ideas.

jitter.size

Scalar which sets the size of the jitter shapes.

jitter.width

Scalar that sets the width/spread of the jitter in the x direction. Ignored in ridgeplots.

jitter.color

String which sets the color of the jitter shapes

boxplot.width

Scalar which sets the width/spread of the boxplot in the x direction

boxplot.color

String which sets the color of the lines of the boxplot

boxplot.show.outliers

Logical, whether outliers should be included in the boxplot. Default is FALSE when there is a jitter plotted, TRUE if there is no jitter.

boxplot.fill

Logical, whether the boxplot should be filled in or not. Known bug: when boxplot fill is turned off, outliers do not render.

vlnplot.lineweight

Scalar which sets the thickness of the line that outlines the violin plots.

vlnplot.width

Scalar which sets the width/spread of the violin plots in the x direction

vlnplot.scaling

String which sets how the widths of the violin plots are set in relation to each other. Options are "area", "count", and "width". If the default is not right for your data, I recommend trying "width". For a detailed explanation of each, see geom_violin.

ridgeplot.lineweight

Scalar which sets the thickness of the ridgeplot outline.

ridgeplot.scale

Scalar which sets the distance/overlap between ridgeplots. A value of 1 means the tallest density curve just touches the baseline of the next higher one. Higher numbers lead to greater overlap. Default = 1.25

add.line

Numeric value(s) where one or more lines should be added

line.linetype

String which sets the type of line for add.line. Defaults to "dashed", but any ggplot linetype will work.

line.color

String that sets the color(s) of the add.line line(s)

legend.show

Logical. Whether the legend should be displayed. Default = TRUE.

legend.title

String or NULL, sets the title for the main legend which includes colors and data representations. This input is set to NULL by default.

data.out

Logical. When set to TRUE, changes the output, from the plot alone, to a list containing the plot (p) and data (data).

Note: plotly conversion is turned off in the data.out = TRUE setting, but hover.data is still calculated.

Details

Generally, this function will output a dittoPlot, grouped by sample, age, cluster, etc., where each data point represents the summary (typically the mean), across each group, of an individual variable's expression. Variables can be genes or metadata.

The data for each element of vars is obtained. When elements are genes/features, assay and slot are utilized to determine which expression data to use, and adjustment determines if and how the expression data might be adjusted.

By default, a z-score adjustment is applied to all gene/feature vars. Note that this adjustment is applied before cells/samples subsetting.

x-axis groupings are then determined using group.by, and data for each variable is summarized using the summary.fxn.

Finally, data is plotted with the data representation types in plots.
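
The steps above can be sketched in base R on made-up data. This is an illustration of the described order of operations (adjust, subset, then summarize per group), not dittoSeq's actual implementation; the names expr, group, keep, and summarized are all hypothetical.

```r
set.seed(1)
# Hypothetical data: 3 genes x 8 samples, plus a grouping for each sample
expr <- matrix(rpois(24, 20), nrow = 3,
               dimnames = list(paste0("gene", 1:3), paste0("sample", 1:8)))
group <- rep(c("d0", "d3"), each = 4)
keep <- c(TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE)  # plays the role of cells.use

# 1. z-score each gene across ALL samples (adjustment happens before subsetting)
z <- t(scale(t(expr)))

# 2. Subset to the requested samples
z <- z[, keep, drop = FALSE]
group <- group[keep]

# 3. Summarize each gene within each group (here with mean)
summarized <- sapply(split(seq_along(group), group),
                     function(idx) apply(z[, idx, drop = FALSE], 1, mean))
summarized  # rows = genes (the plotted data points), columns = groups
```

Each column of summarized corresponds to one x-axis grouping, and each row contributes one data point per grouping.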

Value

A ggplot or plotly object where continuous data, grouped by sample, age, cluster, etc., is shown on either the y-axis by a violin plot, boxplot, and/or jittered points, or on the x-axis by a ridgeplot with or without jittered points.

Alternatively when data.out=TRUE, a list containing the plot ("p") and the underlying data as a dataframe ("data").

Alternatively when do.hover = TRUE, a plotly converted version of the plot where additional data will be displayed when the cursor is hovered over jitter points.

Plot Customization

The plots argument determines the types of data representation that will be generated, as well as their order from back to front. Options are "jitter", "boxplot", "vlnplot", and "ridgeplot". Each plot type has specific associated options which are controlled by variables that start with their associated string, ex: jitter.size.

Inclusion of "ridgeplot" overrides boxplot and violin plot and changes the plot to be horizontal.

Author(s)

Daniel Bunis

See Also

dittoPlot and multi_dittoPlot for plotting of single or multiple expression and metadata vars, each as separate plots, on a per cell/sample basis.

Examples

# dittoSeq handles bulk and single-cell data quite similarly.
# The SingleCellExperiment object structure is used for both,
# but all functions can be used similarly directly on Seurat
# objects as well.

##########
### Generate some random data
##########
# Zero-inflated Expression
nsamples <- 60
exp <- rpois(1000*nsamples, 20)
exp[sample(c(TRUE,TRUE,FALSE),1000*nsamples, TRUE)] <- 0
exp <- matrix(exp, ncol=nsamples)
colnames(exp) <- paste0("sample", seq_len(ncol(exp)))
rownames(exp) <- paste0("gene", seq_len(nrow(exp)))
logexp <- log2(exp + 1)

# Metadata
conds <- factor(rep(c("condition1", "condition2"), each=nsamples/2))
timept <- rep(c("d0", "d3", "d6", "d9"), each = 15)
genome <- rep(c(rep(TRUE,7),rep(FALSE,8)), 4)
grps <- sample(c("A","B","C","D"), nsamples, TRUE)

# We can add these directly during import, or after.
myscRNA <- importDittoBulk(x = list(counts = exp, logcounts = logexp),
    metadata = data.frame(conditions = conds, timepoint = timept,
        SNP = genome, groups = grps))

# Pick a set of genes
genes <- getGenes(myscRNA)[1:30]

dittoPlotVarsAcrossGroups(
    myscRNA, genes, group.by = "timepoint")

# Color can be controlled separately from grouping with 'color.by'
#   Just note: all groupings must map to a single color.
dittoPlotVarsAcrossGroups(myscRNA, genes, "timepoint",
    color.by = "conditions")

# To change it to have the violin plot in the back, a jitter on
#  top of that, and a white boxplot with no fill in front:
dittoPlotVarsAcrossGroups(myscRNA, genes, "timepoint", "conditions",
    plots = c("vlnplot","jitter","boxplot"),
    boxplot.color = "white", boxplot.fill = FALSE)

## Data can be summarized in other ways by changing the summary.fxn input.
#  Often, it makes sense to turn off the z-score adjustment in such cases.
#  median
dittoPlotVarsAcrossGroups(myscRNA, genes, "timepoint", "conditions",
    summary.fxn = median,
    adjustment = NULL)
#  Percent non-zero expression
percent <- function(x) {sum(x!=0)/length(x)}
dittoPlotVarsAcrossGroups(myscRNA, genes, "timepoint", "conditions",
    summary.fxn = percent,
    adjustment = NULL)

# To investigate the identities of outlier genes, we can turn on hovering
# (if the plotly package is available)
if (requireNamespace("plotly", quietly = TRUE)) {
    dittoPlotVarsAcrossGroups(
        myscRNA, genes, "timepoint", "conditions",
        do.hover = TRUE)
}


[Package dittoSeq version 1.0.2 Index]