dittoScatterPlot {dittoSeq} | R Documentation |
Show RNAseq data overlaid on a scatter plot
dittoScatterPlot(
    object,
    x.var,
    y.var,
    color.var = NULL,
    shape.by = NULL,
    split.by = NULL,
    extra.vars = NULL,
    cells.use = NULL,
    show.others = FALSE,
    size = 1,
    opacity = 1,
    color.panel = dittoColors(),
    colors = seq_along(color.panel),
    split.nrow = NULL,
    split.ncol = NULL,
    assay.x = .default_assay(object),
    slot.x = .default_slot(object),
    adjustment.x = NULL,
    assay.y = .default_assay(object),
    slot.y = .default_slot(object),
    adjustment.y = NULL,
    assay.color = .default_assay(object),
    slot.color = .default_slot(object),
    adjustment.color = NULL,
    assay.extra = .default_assay(object),
    slot.extra = .default_slot(object),
    adjustment.extra = NULL,
    do.hover = FALSE,
    hover.data = NULL,
    hover.assay = .default_assay(object),
    hover.slot = .default_slot(object),
    hover.adjustment = NULL,
    shape.panel = c(16, 15, 17, 23, 25, 8),
    rename.color.groups = NULL,
    rename.shape.groups = NULL,
    min.color = "#F0E442",
    max.color = "#0072B2",
    min = NULL,
    max = NULL,
    xlab = x.var,
    ylab = y.var,
    main = "make",
    sub = NULL,
    theme = theme_bw(),
    legend.show = TRUE,
    legend.color.title = color.var,
    legend.color.size = 5,
    legend.color.breaks = waiver(),
    legend.color.breaks.labels = waiver(),
    legend.shape.title = shape.by,
    legend.shape.size = 5,
    data.out = FALSE
)
object |
A Seurat or SingleCellExperiment object |
x.var, y.var |
Single string giving a gene or metadata that will be used for the x- and y-axis of the scatterplot. Note: must be continuous. Alternatively, can be a directly supplied numeric vector of length equal to the total number of cells/samples in object. |
color.var |
Single string giving a gene or metadata that will set the color of cells/samples in the plot. Alternatively, can be a directly supplied numeric or string vector, or a factor, of length equal to the total number of cells/samples in object. |
shape.by |
Single string giving a metadata (Note: must be discrete) that will set the shape of cells/samples in the plot. Alternatively, can be a directly supplied string vector or a factor of length equal to the total number of cells/samples in object. |
split.by |
1 or 2 strings naming discrete metadata to use for splitting the cells/samples into multiple plots with ggplot faceting. When 2 metadata are named, c(row,col), the first is used for rows and the second for columns of the resulting grid. When 1 metadata is named, shape control can be achieved with split.nrow and split.ncol. |
extra.vars |
String vector providing names of any extra metadata to be stashed in the dataframe supplied to ggplot(data). Useful for making custom alterations after dittoSeq plot generation. |
cells.use |
String vector of cells'/samples' names which should be included. Alternatively, a Logical vector, the same length as the number of cells in the object, which sets which cells to include.
For the typically easier logical method, provide |
show.others |
Logical. FALSE by default. Whether other (non-target) cells should be shown in the background in light gray. |
size |
Number which sets the size of data points. Default = 1. |
opacity |
Number between 0 and 1. Great for when you have MANY overlapping points, this sets how solid the points should be: 1 = not see-through at all. 0 = invisible. Default = 1. (In terms of typical ggplot variables, = alpha) |
color.panel |
String vector which sets the colors to draw from. |
colors |
Integer vector, the indexes / order, of colors from color.panel to actually use |
split.nrow, split.ncol |
Integers which set the dimensions of faceting/splitting when a single metadata is given to split.by. |
assay.x, assay.y, assay.color, assay.extra, slot.x, slot.y, slot.color, slot.extra, adjustment.x, adjustment.y, adjustment.color, adjustment.extra |
assay, slot, and adjustment set which data to use when the axes, coloring, or extra.vars are based on expression/counts data. |
do.hover |
Logical which controls whether the object will be converted to a plotly object so that data about individual points will be displayed when you hover your cursor over them. |
hover.data |
String vector of gene and metadata names, example: |
hover.assay, hover.slot, hover.adjustment |
Similar to the x, y, color, and extra versions, when showing expression data upon hover, these set what data will be shown. |
shape.panel |
Vector of integers corresponding to ggplot shapes which sets what shapes to use.
When discrete groupings are supplied by shape.by, these are the shapes used for them.
Note: Unfortunately, shapes can be hard to see when points are on top of each other, and they are processed more slowly by the brain. For these reasons, even as a color blind person myself writing this code, I recommend use of colors for variables with many discrete values. |
rename.color.groups, rename.shape.groups |
String vector containing new names for the identities of the color or shape overlay groups. |
min.color |
color for lowest values of var/min. Default = yellow |
max.color |
color for highest values of var/max. Default = blue |
min, max |
Numbers which set the values associated with the minimum and maximum colors. |
xlab, ylab |
Strings which set the labels for the axes. To remove, set to NULL. |
main |
String, sets the plot title.
A default title is automatically generated if based on |
sub |
String, sets the plot subtitle. |
theme |
A ggplot theme which will be applied before dittoSeq adjustments.
Default = theme_bw() |
legend.show |
Logical. Whether any legend should be displayed. Default = TRUE. |
legend.color.title, legend.shape.title |
Strings which set the title for the color or shape legends. |
legend.color.size, legend.shape.size |
Numbers representing the size at which shapes should be plotted in the color and shape legends (for discrete variable plotting). Default = 5. *Enlarging the icons in the colors legend is incredibly helpful for making colors more distinguishable by color blind individuals. |
legend.color.breaks |
Numeric vector which sets the discrete values to show in the color-scale legend for continuous data. |
legend.color.breaks.labels |
String vector, with the same length as legend.color.breaks, giving the labels to show for those breaks. |
data.out |
Logical. When set to Note: |
This function creates a dataframe with X, Y, color, shape, and faceting data determined by x.var, y.var, color.var, shape.by, and split.by. Any extra gene or metadata requested with extra.vars is added as well. For expression/counts data, assay, slot, and adjustment inputs (.x, .y, and .color) can be used to change which data is used, and whether it should be adjusted in some way.
Next, if a set of cells or samples to use is indicated with the cells.use input, then the dataframe is split into Target_data and Others_data based on subsetting by the target cells/samples.
Finally, a scatter plot is created using these dataframes. Non-target cells are colored in gray if show.others=TRUE, and target cell data is displayed on top, colored and shaped based on the color.var- and shape.by-associated data. If split.by was used, the plot will be split into a matrix of panels based on the associated groupings.
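The target/non-target split described above can be sketched in base R. This is only an illustration, not dittoSeq's internal code: the helper `split_cells`, the toy data frame `df`, and the sample names are all hypothetical. It shows that a vector of names and a logical vector, the two forms accepted by cells.use, select the same rows.

```r
# Toy per-sample data frame; row names stand in for cell/sample names.
set.seed(1)
df <- data.frame(
    X = runif(6),
    Y = runif(6),
    color = rep(c("A", "B"), each = 3),
    row.names = paste0("sample", 1:6))

# Hypothetical helper: split rows into target and non-target sets.
# 'cells.use' may be a character vector of names or a logical vector.
split_cells <- function(data, cells.use = NULL) {
    all_cells <- rownames(data)
    if (is.null(cells.use)) {
        # No subsetting requested: every cell is a target.
        cells.use <- all_cells
    } else if (is.logical(cells.use)) {
        stopifnot(length(cells.use) == nrow(data))
        cells.use <- all_cells[cells.use]
    }
    is_target <- all_cells %in% cells.use
    list(
        Target_data = data[is_target, , drop = FALSE],
        Others_data = data[!is_target, , drop = FALSE])
}

by_name <- split_cells(df, cells.use = c("sample1", "sample2", "sample3"))
by_logical <- split_cells(df, cells.use = df$color == "A")

# Both input styles give the same split.
identical(by_name, by_logical)
nrow(by_name$Target_data)
nrow(by_name$Others_data)
```

In a plot built from such a split, the Others_data rows would be drawn first in gray and the Target_data rows drawn on top, which is the layering that show.others controls.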
A ggplot scatterplot where colored dots and/or shapes represent individual cells/samples. X and Y axes can be gene expression, numeric metadata, or manually supplied values.
Alternatively, if data.out=TRUE, a list containing three slots is output: the plot (named 'p'), a data.table containing the underlying data for target cells (named 'Target_data'), and a data.table containing the underlying data for non-target cells (named 'Others_data').
Alternatively, if do.hover is set to TRUE, the plot is converted from ggplot to plotly, and cell/sample information, determined by the hover.data input, is retrieved, added to the dataframe, and displayed upon hovering the cursor over the plot.
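As a brief sketch of the data.out behavior, the call below requests the list output and inspects it. It assumes dittoSeq is installed and reuses the `myRNA` example object and the "gene1", "gene2", and "groups" names from this page's examples; the element names are inspected with `names()` rather than assumed.

```r
library(dittoSeq)

# Build the example 'myRNA' object used in this page's examples.
example(importDittoBulk, echo = FALSE)

# Request the plot plus its underlying data instead of the plot alone.
out <- dittoScatterPlot(
    myRNA, x.var = "gene1", y.var = "gene2",
    color.var = "groups",
    data.out = TRUE)

# Inspect what came back rather than assuming element names.
names(out)
head(out$Target_data)
```

The returned data can then be reused for custom plotting or checked directly, for example to confirm which columns were created for the color overlay.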
size and opacity can be used to adjust the size and transparency of the data points.
Colors used can be adjusted with color.panel and/or colors for discrete data, or min, max, min.color, and max.color for continuous data.
Shapes used can be adjusted with shape.panel.
Color and shape labels can be changed using rename.color.groups and rename.shape.groups.
Titles and axes labels can be adjusted with the main, sub, xlab, ylab, legend.color.title, and legend.shape.title arguments.
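A short sketch combining several of these adjustments in one call is given here. It assumes dittoSeq is installed and reuses the `myRNA` example object from this page's examples; the specific colors, sizes, and label strings are arbitrary illustrative choices.

```r
library(dittoSeq)

# Build the example 'myRNA' object used in this page's examples.
example(importDittoBulk, echo = FALSE)

p <- dittoScatterPlot(
    myRNA, x.var = "gene1", y.var = "gene2",
    color.var = "gene3",                 # continuous color overlay
    size = 2, opacity = 0.7,             # larger, slightly see-through points
    min.color = "grey90", max.color = "darkred",
    main = "gene1 vs gene2", sub = "colored by gene3",
    xlab = "gene1 expression", ylab = "gene2 expression",
    legend.color.title = "gene3")
p
```

Because color.var is continuous here, min.color and max.color apply; for a discrete color.var, color.panel and colors would be the relevant inputs instead.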
Legends can also be adjusted in other ways, using variables that all start with "legend." for easy tab completion lookup.
Daniel Bunis
getGenes and getMetas to see what the x.var, y.var, color.var, shape.by, and hover.data options are.
dittoDimPlot for making very similar data representations, but where dimensionality reduction (PCA, t-SNE, UMAP, etc.) dimensions are the scatterplot axes.
# dittoSeq handles bulk and single-cell data quite similarly.
# The SingleCellExperiment object structure is used for both,
# but all functions can be used similarly directly on Seurat
# objects as well.

example(importDittoBulk, echo = FALSE)
myRNA

# Mock up some nCount_RNA and nFeature_RNA metadata
#  == the default way to extract
myRNA$nCount_RNA <- runif(60,200,1000)
myRNA$nFeature_RNA <- myRNA$nCount_RNA*runif(60,0.95,1.05)
# and also percent.mito metadata
myRNA$percent.mito <- sample(c(runif(50,0,0.05),runif(10,0.05,0.2)))

dittoScatterPlot(
    myRNA, x.var = "nCount_RNA", y.var = "nFeature_RNA")

# Shapes or colors can be overlaid representing discrete metadata
#   or (only colors) continuous metadata / expression data by providing
#   metadata or gene names to 'color.var' and 'shape.by'
dittoScatterPlot(
    myRNA, x.var = "nCount_RNA", y.var = "nFeature_RNA",
    color.var = "percent.mito")
dittoScatterPlot(
    myRNA, x.var = "gene1", y.var = "gene2",
    color.var = "groups",
    shape.by = "SNP",
    size = 3)
dittoScatterPlot(
    myRNA, x.var = "gene1", y.var = "gene2",
    color.var = "gene3")

# Data can be "split" or faceted by a discrete variable as well.
dittoScatterPlot(
    myRNA, x.var = "gene1", y.var = "gene2",
    split.by = "timepoint") # single split.by element
dittoScatterPlot(
    myRNA, x.var = "gene1", y.var = "gene2",
    split.by = c("groups","SNP")) # row and col split.by elements
# OR with 'extra.vars' plus manually faceting for added control
dittoDimPlot(myRNA, "gene1",
    extra.vars = c("SNP")) +
    facet_wrap("SNP", ncol = 1, strip.position = "left")

# Note: scatterplots like this can be very useful for dataset QC, especially
#   with percentage of reads coming from genes as the color overlay.