SingleCellSignalR

User’s Guide

Simon Cabello-Aguilar1, Jacques Colinge1

1 Institut de Recherche en Cancérologie de Montpellier, Inserm, Montpellier, France ; Institut régional du Cancer Montpellier, Montpellier, France ; Université de Montpellier, Montpellier, France


Introduction

This guide provides an overview of the SingleCellSignalR package, a comprehensive framework to obtain cellular network maps from scRNA-seq data. SingleCellSignalR comes with a complete pipeline that integrates existing methods to cluster individual cell transcriptomes and identify cell subpopulations, as well as novel algorithms specific to cellular networks. More advanced users can substitute their own logic or alternative tools at various stages of data processing. SingleCellSignalR also maps the internal networks of cell subpopulations that are linked to genes of interest, by integrating regulated KEGG and Reactome pathways with the ligands and receptors involved in inferred cell-cell interactions. The cellular networks can be exported as text files and graphML objects to be further explored with Cytoscape (www.cytoscape.org), yEd (www.yworks.com), or similar software tools.


Quick Start

Regardless of the chosen scRNA-seq platform, deep or shallow, data come as a table of read or unique molecular identifier (UMI) counts, with one column per individual cell and one row per gene. Initial processing is required to prepare such data for subsequent analysis. For the sake of convenience we provide a generic solution, though users can easily substitute their own computations. Gene names (HUGO symbols) are provided in the first column of the table.

Each analysis is organized around a working directory (or project folder):

The file containing the read counts should be placed in the working directory.
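As a minimal sketch of the expected layout, the base R snippet below creates a project folder, writes a toy count table with gene symbols in the first column and one column per cell, and reads it back. The folder name, the file name, the counts, and the tab-delimited format are all assumptions made for illustration; they are not prescribed by the package.

```r
# Hypothetical project folder created under the session's temporary directory
project.dir <- file.path(tempdir(), "scsr_project")
dir.create(project.dir, showWarnings = FALSE)

# Toy count table: gene symbols in the first column, one column per cell
counts <- data.frame(
  gene  = c("CD3E", "CD19", "LYZ"),
  cell1 = c(5, 0, 0),
  cell2 = c(0, 7, 1),
  cell3 = c(0, 0, 12)
)
counts.file <- file.path(project.dir, "toy_counts.txt")
write.table(counts, counts.file, sep = "\t", quote = FALSE, row.names = FALSE)

# Read the table back, using the first column as row names
toy <- as.matrix(read.table(counts.file, header = TRUE, sep = "\t", row.names = 1))
print(dim(toy))  # genes x cells
```

In a real project, setwd() would point to the project folder so that relative file names resolve there.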

Data processing can then start:

The data_prepare() function eliminates non-expressed genes before performing read count normalization.
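The two steps described above can be sketched in base R as follows. This is not the code of data_prepare(): the scaling to the median library size and the log transform are a generic normalization chosen for illustration, and the toy matrix and gene names are hypothetical.

```r
# Toy raw counts: 4 genes x 3 cells, with one gene never detected
counts <- matrix(c(10, 0, 3,
                    0, 0, 0,
                    2, 8, 0,
                    1, 1, 5),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(c("geneA", "geneB", "geneC", "geneD"),
                                 c("cell1", "cell2", "cell3")))

# Step 1: drop genes with zero counts in every cell
expressed <- counts[rowSums(counts) > 0, , drop = FALSE]

# Step 2: scale each cell to the median library size, then log-transform
# (a generic normalization chosen for illustration only)
lib.size <- colSums(expressed)
scaled <- sweep(expressed, 2, lib.size / median(lib.size), "/")
norm.data <- log1p(scaled)
print(round(norm.data, 2))
```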

Normalized data are submitted to a clustering algorithm to identify cell subpopulations:

#> 4 clusters detected
#> cluster 1 -> 292 cells
#> cluster 2 -> 1 cells
#> cluster 3 -> 99 cells
#> cluster 4 -> 8 cells

We set the method argument to simlr, which caused the SIMLR() function of the SIMLR package [1] to be used. The SIMLR_Estimate_Number_of_Clusters() function determined the number of clusters, between 2 and n (n=10 above).

Next, genes that are differentially expressed in each cluster compared to the others are identified using the cluster_analysis() function, which relies on edgeR [2]. A result table is automatically created in the cluster-analysis folder:
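To make the "one cluster versus the others" comparison concrete, here is a minimal base R sketch. It deliberately swaps edgeR for a per-gene Wilcoxon rank-sum test so that it runs without extra packages; the function one_vs_rest(), the toy data, and the gene names are hypothetical and do not reflect how cluster_analysis() is implemented.

```r
set.seed(1)
# Toy normalized data: 3 genes x 20 cells, two clusters of 10 cells
cluster <- rep(c(1, 2), each = 10)
data <- rbind(
  up_in_1 = c(rnorm(10, mean = 5), rnorm(10, mean = 1)),
  up_in_2 = c(rnorm(10, mean = 1), rnorm(10, mean = 5)),
  flat    = rnorm(20, mean = 3)
)

# Compare cluster k against all remaining cells, gene by gene
one_vs_rest <- function(data, cluster, k) {
  in.k <- cluster == k
  res <- data.frame(
    gene = rownames(data),
    diff = rowMeans(data[, in.k, drop = FALSE]) - rowMeans(data[, !in.k, drop = FALSE]),
    p.value = apply(data, 1, function(x) wilcox.test(x[in.k], x[!in.k])$p.value)
  )
  res$fdr <- p.adjust(res$p.value, method = "BH")  # Benjamini-Hochberg correction
  res[order(res$p.value), ]
}

tab <- one_vs_rest(data, cluster, k = 1)
print(tab)
```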

Once the preliminary steps illustrated above are completed, SingleCellSignalR can be used to generate cellular interaction lists using the cell_signaling() function:
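The idea behind scoring a ligand/receptor pair between two clusters can be illustrated with a short base R sketch. It assumes a regularized product score of the form sqrt(l*r) / (mu + sqrt(l*r)), where l and r are the average ligand and receptor expression in the sending and receiving clusters and mu is the mean of the normalized matrix. Whether cell_signaling() computes exactly this quantity is not established by this guide; the function lr_score() and the gene names LIG1 and REC1 are hypothetical.

```r
# Toy normalized expression: 2 genes x 6 cells, hypothetical gene names
data <- rbind(
  LIG1 = c(4, 5, 6, 0, 0, 1),   # candidate ligand
  REC1 = c(0, 1, 0, 3, 4, 5)    # candidate receptor
)
cluster <- c(1, 1, 1, 2, 2, 2)

# Score a ligand in a sending cluster against a receptor in a receiving cluster
lr_score <- function(data, cluster, ligand, receptor, from, to) {
  l <- mean(data[ligand, cluster == from])
  r <- mean(data[receptor, cluster == to])
  mu <- mean(data)                       # global scaling term
  sqrt(l * r) / (mu + sqrt(l * r))       # bounded between 0 and 1
}

s12 <- lr_score(data, cluster, "LIG1", "REC1", from = 1, to = 2)
s21 <- lr_score(data, cluster, "LIG1", "REC1", from = 2, to = 1)
print(c(from1to2 = s12, from2to1 = s21))
```

The score is directional: the ligand is highly expressed in cluster 1 and the receptor in cluster 2, so the 1 to 2 direction scores higher than the reverse.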

An intercellular network can also be generated to map the overall ligand/receptor interactions invoking the inter_network() function:

At this point the intercellular network has been generated and exported in text and graphML formats in the networks folder.

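For readers unfamiliar with the two export formats, the following base R sketch writes a toy edge list both as a tab-delimited text file and as a minimal GraphML document. The node identifiers, the file names, and the folder location are hypothetical, and the files written by inter_network() may carry additional attributes.

```r
# Hypothetical edge list: ligand in one cluster -> receptor in another
edges <- data.frame(
  source = c("cluster1.LIG1", "cluster2.LIG2"),
  target = c("cluster2.REC1", "cluster1.REC2"),
  stringsAsFactors = FALSE
)

out.dir <- file.path(tempdir(), "networks")
dir.create(out.dir, showWarnings = FALSE)

# Tab-delimited text export
txt.file <- file.path(out.dir, "toy_network.txt")
write.table(edges, txt.file, sep = "\t", quote = FALSE, row.names = FALSE)

# Minimal GraphML export: one <node> per gene, one directed <edge> per pair
# (identifiers are assumed to need no XML escaping)
nodes <- unique(c(edges$source, edges$target))
graphml <- c(
  '<?xml version="1.0" encoding="UTF-8"?>',
  '<graphml xmlns="http://graphml.graphdrawing.org/xmlns">',
  '  <graph id="G" edgedefault="directed">',
  sprintf('    <node id="%s"/>', nodes),
  sprintf('    <edge source="%s" target="%s"/>', edges$source, edges$target),
  '  </graph>',
  '</graphml>'
)
gml.file <- file.path(out.dir, "toy_network.graphml")
writeLines(graphml, gml.file)
```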
A summary of the interactions between cell clusters can be output in the form of a chord diagram by the visualize_interactions() function:

This function will create a plot in the R plot window.

The details of the interactions between two clusters, for example clusters 1 and 2, can also be shown in the plot window with the visualize_interactions() function. Note that in the example below we ask for the display of two pairs of cell clusters: pair 1, which contains the interactions from cluster 1 to cluster 2, and pair 4, which contains those from cluster 2 to cluster 1. (names(signal) returns the cell cluster names in each pair; see the details of the visualize_interactions() function.)
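The pair numbering can be reproduced with a few lines of base R. The sketch assumes that ordered pairs are enumerated sender by sender (1 to 2, 1 to 3, 1 to 4, 2 to 1, ...) and that no pair is dropped; if a pair is missing from the result list, the positions shift, so names(signal) remains the authoritative reference. The helper pair_index() is hypothetical.

```r
# Enumerate ordered pairs of clusters: 1->2, 1->3, 1->4, 2->1, ...
n.clusters <- 4
pairs <- expand.grid(to = seq_len(n.clusters), from = seq_len(n.clusters))
pairs <- pairs[pairs$from != pairs$to, c("from", "to")]
rownames(pairs) <- NULL

# Look up the position of a given ordered pair
pair_index <- function(from, to) which(pairs$from == from & pairs$to == to)

print(pair_index(1, 2))
print(pair_index(2, 1))
```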

These plots can be saved as PDF files in the images folder using the write.in argument of the visualize_interactions() function.



Examples of use

The functions of the SingleCellSignalR package have many arguments that can be changed by the user to fit their needs (see the Reference Manual for more details). Furthermore, several handy functions that were not illustrated above are provided to generate additional plots or reports.


Exploiting the cell_classifier clustering

After running the example in the Quick Start section, the user can define cell clusters from the output of the cell_classifier() function. The demo data set is a subset of the 10x PBMC dataset [3], i.e. immune cells. The t-SNE map calculated with the clustering() function will also be used. For this example we set the plot.details argument to TRUE to monitor the choice of the threshold on gene signature scores.
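The role of a threshold on gene signature scores can be illustrated with a simplified base R sketch: each cell receives one score per signature, and a cell type is assigned only when the best score is high enough. The scoring rule, the threshold value, and the signature lists below are assumptions made for illustration; cell_classifier() may score and threshold differently.

```r
# Toy normalized data: 4 marker genes x 5 cells
data <- rbind(
  CD3E  = c(5, 4, 0, 0, 2),
  CD2   = c(4, 5, 0, 1, 2),
  CD19  = c(0, 0, 5, 4, 2),
  CD79A = c(0, 1, 4, 5, 2)
)
colnames(data) <- paste0("cell", 1:5)

# Hypothetical signatures: one character vector of marker genes per cell type
signatures <- list("T-cells" = c("CD3E", "CD2"), "B-cells" = c("CD19", "CD79A"))

# Score = mean expression of each signature, rescaled so scores sum to 1 per cell
scores <- sapply(signatures, function(g) colMeans(data[g, , drop = FALSE]))
scores <- scores / rowSums(scores)

# Assign a cell type only when the best score exceeds the threshold
threshold <- 0.7
best <- colnames(scores)[max.col(scores, ties.method = "first")]
label <- ifelse(apply(scores, 1, max) > threshold, best, "undefined")
names(label) <- rownames(scores)
print(label)
```

The last cell expresses both signatures equally, so neither score passes the threshold and the cell is labelled "undefined".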

Let us use the cell clustering obtained with the cell_classifier() function. Although “undefined” cells may be interesting in some cases, here they form a heterogeneous cluster because they represent cells that seem to be in a transition between two states (“T-cells” and “Cytotoxic cells”, or “Neutrophils” and “Macrophages”, see heatmap above). We discard these cells.
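Discarding a group of cells means filtering the expression matrix and the cluster assignment consistently. The base R sketch below shows the pattern on toy data; the variable cell.type and its labels are hypothetical stand-ins for the classifier output, whose actual structure is not described in this guide.

```r
# Toy inputs: 3 genes x 6 cells and one hypothetical label per cell
data <- matrix(1:18, nrow = 3,
               dimnames = list(paste0("gene", 1:3), paste0("cell", 1:6)))
cell.type <- c("T-cells", "undefined", "B-cells", "T-cells", "undefined", "B-cells")

# Keep only the cells with a defined type, in both the matrix and the labels
keep <- cell.type != "undefined"
data <- data[, keep, drop = FALSE]
cell.type <- cell.type[keep]

# Recode the remaining labels as integer cluster ids plus a vector of names
c.names <- unique(cell.type)
cluster <- match(cell.type, c.names)
print(table(c.names[cluster]))
```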

Then the analysis can be carried on.

Once the cluster analysis is done, the cell_signaling() and inter_network() functions can be used.

signal <- cell_signaling(data = data, genes = genes, cluster = cluster, c.names = c.names, write = FALSE)
#> No such file as table_dge_T-cells.txt in the cluster-analysis folder
#> No such file as table_dge_B-cells.txt in the cluster-analysis folder
#> No such file as table_dge_Macrophages.txt in the cluster-analysis folder
#> No such file as table_dge_Cytotoxic cells.txt in the cluster-analysis folder
#> No such file as table_dge_Neutrophils.txt in the cluster-analysis folder
#> Paracrine signaling: 
#> Checking for signaling between cell types
#> 10 interactions from T-cells to B-cells
#> 20 interactions from T-cells to Macrophages
#> 20 interactions from T-cells to Cytotoxic cells
#> 24 interactions from T-cells to Neutrophils
#> 5 interactions from B-cells to T-cells
#> 26 interactions from B-cells to Macrophages
#> 19 interactions from B-cells to Cytotoxic cells
#> 29 interactions from B-cells to Neutrophils
#> 2 interactions from Macrophages to T-cells
#> 11 interactions from Macrophages to B-cells
#> 20 interactions from Macrophages to Cytotoxic cells
#> 1 interactions from Macrophages to Neutrophils
#> 2 interactions from Cytotoxic cells to T-cells
#> 4 interactions from Cytotoxic cells to B-cells
#> 21 interactions from Cytotoxic cells to Macrophages
#> 21 interactions from Cytotoxic cells to Neutrophils
#> 7 interactions from Neutrophils to T-cells
#> 12 interactions from Neutrophils to B-cells
#> 7 interactions from Neutrophils to Macrophages
#> 29 interactions from Neutrophils to Cytotoxic cells

inter.net <- inter_network(data = data, signal = signal, genes = genes, cluster = cluster, write = FALSE)
#> Doing T-cells and B-cells ... OK
#> Doing T-cells and Macrophages ... OK
#> Doing T-cells and Cytotoxic cells ... OK
#> Doing T-cells and Neutrophils ... OK
#> Doing B-cells and T-cells ... OK
#> Doing B-cells and Macrophages ... OK
#> Doing B-cells and Cytotoxic cells ... OK
#> Doing B-cells and Neutrophils ... OK
#> Doing Macrophages and T-cells ... OK
#> Doing Macrophages and B-cells ... OK
#> Doing Macrophages and Cytotoxic cells ... OK
#> Doing Macrophages and Neutrophils ... OK
#> Doing Cytotoxic cells and T-cells ... OK
#> Doing Cytotoxic cells and B-cells ... OK
#> Doing Cytotoxic cells and Macrophages ... OK
#> Doing Cytotoxic cells and Neutrophils ... OK
#> Doing Neutrophils and T-cells ... OK
#> Doing Neutrophils and B-cells ... OK
#> Doing Neutrophils and Macrophages ... OK
#> Doing Neutrophils and Cytotoxic cells ... OK

If we take a look at signal[[6]] (or signal[["B-cells-Macrophages"]])

We may be interested in the genes that participate in pathways with a receptor of interest inside a cluster of interest, for instance ASGR1 in “Macrophages”.

Now, let us take an overview of the signaling between the cell types.

Let us dig deeper and look at the signaling between “T-cells” and “B-cells”, for example.

The following command will save these plots in the images folder.


Marker analysis on a cancer dataset

For this example we use the scRNA-seq dataset from Tirosh et al. [4]. We use only the data from patient 80.

Remark: the zero rate is lower than in the previous example, which reflects the deeper sequencing.

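For reference, a zero rate can be computed directly from a count table in base R. The sketch assumes that the zero rate is the fraction of entries equal to zero; the toy matrix is hypothetical.

```r
# Toy count matrix: 3 genes x 4 cells
counts <- matrix(c(0, 2, 0, 0,
                   5, 0, 0, 1,
                   0, 0, 3, 0),
                 nrow = 3, byrow = TRUE)

# Zero rate: fraction of entries in the table that are equal to zero
zero.rate <- mean(counts == 0)
print(zero.rate)
```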
We know that this dataset is composed of melanoma cells and their microenvironment; we hence define our markers table using the markers() function.

Let us perform the clustering. For this example, we set the method argument to “kmeans” and the n argument to 12.
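As background on the k-means option, the base R sketch below clusters toy cells with stats::kmeans() directly. It does not go through the clustering() function, and it uses three simulated groups rather than the 12 clusters requested above; the data and seed are arbitrary.

```r
set.seed(42)
# Toy normalized data: 2 genes x 30 cells drawn from three well-separated groups
centers <- c(0, 10, 20)
data <- rbind(
  gene1 = rnorm(30, mean = rep(centers, each = 10)),
  gene2 = rnorm(30, mean = rep(centers, each = 10))
)

# kmeans() clusters rows, so cells must be transposed into rows
km <- kmeans(t(data), centers = 3, nstart = 10)
cluster <- km$cluster
print(table(cluster))
```

Because k-means starts from random centers, several restarts (nstart) and a fixed seed make the result more stable and reproducible.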

Now we take advantage of the markers argument of the cluster_analysis() function using my.markers obtained above with the markers() function.

We can see that clusters 2 and 5 are well defined: they are, respectively, cancer-associated fibroblasts (CAFs) and melanoma cells. Cluster 6 is also clearly composed of endothelial cells. Clusters 1 and 3 are immune cells, but the clustering did not succeed in sorting them correctly. Cluster 4 contains only 6 cells, which do not seem to be homogeneous, and we decide to remove them.
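Removing one cluster requires dropping its cells from both the data matrix and the cluster vector, then renumbering the remaining clusters so that the ids stay consecutive. A minimal base R sketch on hypothetical toy data:

```r
# Toy inputs: 2 genes x 8 cells with cluster ids from 1 to 6
data <- matrix(seq_len(16), nrow = 2,
               dimnames = list(c("gene1", "gene2"), paste0("cell", 1:8)))
cluster <- c(1, 2, 3, 4, 4, 5, 6, 6)

# Drop the cells of cluster 4 from both the matrix and the cluster vector
keep <- cluster != 4
data <- data[, keep, drop = FALSE]
cluster <- cluster[keep]

# Shift higher ids down so that cluster numbers stay consecutive
cluster[cluster > 4] <- cluster[cluster > 4] - 1
print(cluster)
```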

Then we can name our clusters manually before pursuing the analysis.

Remark: the names of the dge tables in the cluster-analysis folder must be changed according to the cluster names (c.names).

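Such a renaming can be scripted in base R. The sketch below assumes that the tables follow the table_dge_<name>.txt pattern seen in the messages printed by cell_signaling() and that the default names are "cluster 1", "cluster 2", and so on; the folder location and the new names are hypothetical.

```r
# Hypothetical cluster-analysis folder with default table names
dge.dir <- file.path(tempdir(), "cluster-analysis")
dir.create(dge.dir, showWarnings = FALSE)
old.names <- c("cluster 1", "cluster 2")
c.names <- c("Melanoma", "CAF")     # hypothetical manual names
old.files <- file.path(dge.dir, paste0("table_dge_", old.names, ".txt"))
invisible(file.create(old.files))   # empty placeholder tables for the demo

# Rename each table so that its suffix matches the corresponding c.names entry
new.files <- file.path(dge.dir, paste0("table_dge_", c.names, ".txt"))
renamed <- file.rename(old.files, new.files)
print(basename(new.files))
```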
And now visualize!

Remark: in the chord diagrams above, the “specific” interactions are highlighted with a thick black line.

Let us look at one of these specific interactions using the expression_plot_2() function.


Thank you for reading this guide and for using SingleCellSignalR.


References

  1. Wang B, Zhu J, Pierson E, Ramazzotti D, Batzoglou S. Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning. Nat Methods. 2017;14:414-6.

  2. McCarthy DJ, Chen Y, Smyth GK. Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Res. 2012;40:4288-97.

  3. 8k PBMCs from a Healthy Donor [Internet]. 2017. Available from: https://support.10xgenomics.com/single-cell-gene-expression/datasets/2.1.0/pbmc8k

  4. Tirosh I, Izar B, Prakadan SM, Wadsworth MH, Treacy D, Trombetta JJ, et al. Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq. Science. 2016;352:189-96.