Introduction
Installation
SpidermiRquery: Searching network
SpidermiRdownload: Downloading network data
SpidermiRprepare: Preparing the data
- SpidermiRprepare_NET: Prepare matrix of gene network with Ensembl Gene ID, and gene symbols
SpidermiRanalyze: Analyze data from network data
Features databases SpidermiR:
References

Introduction

Biological systems are composed of multiple layers of dynamic interaction networks. These networks can be decomposed into, for example, co-expression, physical interaction, co-localization, genetic interaction, pathway, and shared protein domain networks.

GeneMania provides an enormous collection of data sets for interaction network studies (Warde-Farley D, Donaldson S, Comes O, Zuberi K, Badrawi R, and others 2010). The data, drawn from different databases, can be accessed and downloaded through a web portal. Currently, however, there is no R package to query and download these data.

An important regulatory mechanism in these networks involves microRNAs (miRNAs). miRNAs are involved in various cellular functions, such as differentiation, proliferation, and tumourigenesis. However, our understanding of the processes regulated by miRNAs is currently limited, and integrating miRNA data into these networks enables a comprehensive genome-scale analysis of miRNA regulatory networks. At present, GeneMania does not integrate information on miRNAs and their interactions in the network.

SpidermiR allows the user to query, download, and prepare network data (e.g. from GeneMania), to integrate this information with miRNA data, and to analyze the downloaded data, all within a single R package. This technical report gives a short overview of the essential SpidermiR methods and their application.

Installation

To install SpidermiR, use the code below.

if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install("SpidermiR")

`SpidermiRquery`: Searching network

You can easily search GeneMania data using the SpidermiRquery functions.

`SpidermiRquery_species`: Searching by species

The user can query the species supported by GeneMania, using the function SpidermiRquery_species:

org<-SpidermiRquery_species(species)

The list of species is shown below:

Table 1: List of species
Index	Species
1	Arabidopsis_thaliana
2	Caenorhabditis_elegans
3	Danio_rerio
4	Drosophila_melanogaster
5	Escherichia_coli
6	Homo_sapiens
7	Mus_musculus
8	Rattus_norvegicus
9	Saccharomyces_cerevisiae

`SpidermiRquery_networks_type`: Searching by network categories

The user can query the network types supported by GeneMania for a specific species, using the function SpidermiRquery_networks_type. The species is selected by an index into the output of SpidermiRquery_species (e.g. organismID=org[6,] is the input for Homo_sapiens, and organismID=org[9,] is the input for Saccharomyces_cerevisiae).

net_type<-SpidermiRquery_networks_type(organismID=org[9,])

The list of network categories in Saccharomyces cerevisiae is shown below:

## [1] "Co-localization"        "Genetic Interactions"   "Physical Interactions" 
## [4] "Predicted"              "Co-expression"          "Other"                 
## [7] "Shared protein domains"
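Hard-coded row numbers such as org[9,] are easy to get wrong. The following is a minimal sketch of looking the index up by name instead. It assumes that org behaves like a one-column data frame of species names in the order of Table 1; org_mock and species_index are hypothetical names introduced here so that the snippet runs without the package or network access.

```r
# Stand-in for the object returned by SpidermiRquery_species (assumed shape:
# a one-column data frame whose rows follow Table 1).
org_mock <- data.frame(
  species = c("Arabidopsis_thaliana", "Caenorhabditis_elegans", "Danio_rerio",
              "Drosophila_melanogaster", "Escherichia_coli", "Homo_sapiens",
              "Mus_musculus", "Rattus_norvegicus", "Saccharomyces_cerevisiae"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the row index for a species name, or stop
# with an informative error if the name is not listed.
species_index <- function(org, name) {
  idx <- match(name, org[, 1])
  if (is.na(idx)) stop("Species not found: ", name)
  idx
}

yeast_idx <- species_index(org_mock, "Saccharomyces_cerevisiae")
human_idx <- species_index(org_mock, "Homo_sapiens")
print(c(yeast = yeast_idx, human = human_idx))
```

If the assumption about the shape of org holds, org[species_index(org, "Homo_sapiens"),] could then replace org[6,].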

`SpidermiRquery_spec_networks`: Searching by species, and network categories

You can filter the search by species, using the organism ID (reported above), and by network category. The network category is selected with one of the following parameters:

COexp Co-expression
PHint Physical_interactions
COloc Co-localization
GENint Genetic_interactions
PATH Pathway
SHpd Shared_protein_domains
pred predicted

net_shar_prot<-SpidermiRquery_spec_networks(organismID = org[9,],
                                    network = "SHpd")

The output of this step is the list of databases from which the data are collected. An example is shown below (for shared protein domains in Saccharomyces_cerevisiae, data are collected from INTERPRO and PFAM):

## [1] "http://genemania.org/data/current/Saccharomyces_cerevisiae/Shared_protein_domains.INTERPRO.txt"
## [2] "http://genemania.org/data/current/Saccharomyces_cerevisiae/Shared_protein_domains.PFAM.txt"
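The returned URLs encode the species, the network category, and the source database. As a rough sketch, these parts can be pulled out with base R string functions. It assumes that file names follow the pattern category.database.txt, as in the two URLs above; file names containing additional dots would need different handling. The object name net_sources is introduced here for illustration only.

```r
# URLs copied from the output above.
urls <- c(
  "http://genemania.org/data/current/Saccharomyces_cerevisiae/Shared_protein_domains.INTERPRO.txt",
  "http://genemania.org/data/current/Saccharomyces_cerevisiae/Shared_protein_domains.PFAM.txt"
)

# Split each URL on "/": the last element is the file name and the
# one before it is the species directory.
path_parts <- strsplit(urls, "/", fixed = TRUE)
species    <- vapply(path_parts, function(p) p[length(p) - 1], character(1))
file_names <- vapply(path_parts, function(p) p[length(p)], character(1))

# Drop the extension, then split <category>.<database> on the dot.
name_parts <- strsplit(sub("\\.txt$", "", file_names), ".", fixed = TRUE)

net_sources <- data.frame(
  species  = species,
  category = vapply(name_parts, `[`, character(1), 1),
  database = vapply(name_parts, `[`, character(1), 2),
  stringsAsFactors = FALSE
)
print(net_sources)
```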

`SpidermiRdownload`: Downloading network data

In this step, the user can download the data previously queried.

`SpidermiRdownload_net`: Download network

The user can download the data (previously queried) with SpidermiRdownload_net.

out_net<-SpidermiRdownload_net(net_shar_prot)

## [1] "Downloading: http://genemania.org/data/current/Saccharomyces_cerevisiae/Shared_protein_domains.INTERPRO.txt ... reference n. 1 of 2"
## [1] "Downloading: http://genemania.org/data/current/Saccharomyces_cerevisiae/Shared_protein_domains.PFAM.txt ... reference n. 2 of 2"

The list returned by SpidermiRdownload_net is shown below:

## List of 2
##  $ :'data.frame':    58612 obs. of  3 variables:
##   ..$ Gene_A: chr [1:58612] "Q0050" "Q0050" "Q0055" "Q0050" ...
##   ..$ Gene_B: chr [1:58612] "Q0055" "Q0060" "Q0060" "Q0065" ...
##   ..$ Weight: num [1:58612] 0.27 0.048 0.12 0.048 0.12 0.17 0.048 0.12 0.17 0.17 ...
##  $ :'data.frame':    25587 obs. of  3 variables:
##   ..$ Gene_A: chr [1:25587] "Q0050" "Q0060" "Q0060" "Q0065" ...
##   ..$ Gene_B: chr [1:25587] "Q0055" "Q0065" "Q0070" "Q0070" ...
##   ..$ Weight: num [1:25587] 1 0.21 0.21 0.21 0.11 0.11 0.11 0.081 0.081 0.081 ...
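Because the result is a plain list of data frames with identical columns, the networks can be stacked into one table for inspection. The sketch below uses a small mock (out_net_mock, a hypothetical name) built from the first rows of the str() output above. The labels INTERPRO and PFAM are an assumption based on the order of the queried URLs; the shown output does not indicate that the real list is named.

```r
# Mock of the first rows of each downloaded network (values taken from the
# str() output above); the real list holds 58612 and 25587 rows.
out_net_mock <- list(
  data.frame(Gene_A = c("Q0050", "Q0050", "Q0055", "Q0050"),
             Gene_B = c("Q0055", "Q0060", "Q0060", "Q0065"),
             Weight = c(0.27, 0.048, 0.12, 0.048),
             stringsAsFactors = FALSE),
  data.frame(Gene_A = c("Q0050", "Q0060", "Q0060", "Q0065"),
             Gene_B = c("Q0055", "Q0065", "Q0070", "Q0070"),
             Weight = c(1, 0.21, 0.21, 0.21),
             stringsAsFactors = FALSE)
)
# Assumed labels, following the order of the queried URLs.
names(out_net_mock) <- c("INTERPRO", "PFAM")

# Stack the networks and record which database each edge came from.
combined <- do.call(rbind, lapply(names(out_net_mock), function(src) {
  df <- out_net_mock[[src]]
  df$Source <- src
  df
}))
rownames(combined) <- NULL

print(combined)
print(table(combined$Source))
```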

`SpidermiRdownload_miRNAprediction`: Downloading miRNA predicted data target

The user can download predicted miRNA-gene interactions from four databases (DIANA, Miranda, PicTar and TargetScan) using miRNAtap (Pajak M, Simpson TI 2019).

mirna<-c('hsa-miR-567','hsa-miR-566')
SpidermiRdownload_miRNAprediction(mirna_list=mirna)

`SpidermiRdownload_miRNAvalidate`: Downloading miRNA validated data target

The user can download validated miRNA-gene interactions from miRTAR and miRwalk (Jiang Q, Wang Y, Hao Y, Juan L, Teng M, Zhang X, Li M, Wang G, Liu Y. 2009) (Dweep H, Sticht C, Pandey P, Gretz N. 2011).

list_validated<-SpidermiRdownload_miRNAvalidate(validated)

`SpidermiRdownload_miRNAextra_cir`: Download Extracellular Circulating microRNAs

The user can download extracellular circulating miRNAs from the miRandola database.

list_circ<-SpidermiRdownload_miRNAextra_cir(miRNAextra_cir)

`SpidermiRprepare`: Preparing the data

`SpidermiRprepare_NET`: Prepare matrix of gene network with Ensembl Gene ID, and gene symbols

SpidermiRprepare_NET reads the network data returned by SpidermiRdownload_net and prepares them for downstream analysis. In particular, it builds a matrix of the gene network that maps Ensembl Gene IDs to gene symbols. Gene symbols are needed to integrate miRNA data.

geneSymb_net<-SpidermiRprepare_NET(organismID = org[9,],
                                    data = out_net)

## [1] "Preprocessing of the network n. 1 of 2"
## [1] "Preprocessing of the network n. 2 of 2"

The network with gene symbols is shown below:

Table 2: Shared protein domains network
Gene_A	Gene_B	Weight	gene_symbolA	gene_symbolB
Q0050	Q0055	0.27	NP_009310.1	NP_009309.1
Q0050	Q0060	0.05	NP_009310.1	NP_009308.2
Q0055	Q0060	0.12	NP_009309.1	NP_009308.2
Q0050	Q0065	0.05	NP_009310.1	NP_009307.2
Q0055	Q0065	0.12	NP_009309.1	NP_009307.2
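The mapping step amounts to looking up both endpoints of every edge in an identifier table. The sketch below illustrates that kind of lookup with values copied from Table 2. It is not how SpidermiRprepare_NET is implemented; id_map is a hypothetical stand-in for whatever annotation source the package uses.

```r
# Edges and identifier pairs copied from the first rows of Table 2.
edges <- data.frame(
  Gene_A = c("Q0050", "Q0050", "Q0055"),
  Gene_B = c("Q0055", "Q0060", "Q0060"),
  Weight = c(0.27, 0.05, 0.12),
  stringsAsFactors = FALSE
)
# Hypothetical lookup table: names are network IDs, values are the mapped IDs.
id_map <- c(Q0050 = "NP_009310.1", Q0055 = "NP_009309.1", Q0060 = "NP_009308.2")

# Index the named vector by each endpoint; unmapped IDs would become NA.
edges$gene_symbolA <- unname(id_map[edges$Gene_A])
edges$gene_symbolB <- unname(id_map[edges$Gene_B])
print(edges)
```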

`SpidermiRanalyze`: Analyze data from network data

`SpidermiRanalyze_direct_net`: Searching by biomarkers of interest with direct interaction

Starting from a set of biomarkers of interest (BI) given by the user (genes, miRNAs, or both), this function finds sub-networks that include all direct interactions involving at least one of the BI.

biomark_of_interest<-c("hsa-let-7a","CDC34","hsa-miR-27a","PEX7","EPT1","FOX","hsa-miR-5a")
miRNA_NET<-data.frame(V1=c('hsa-let-7a','CASP3','BRCA','hsa-miR-7a','hsa-miR-5a','SMAD','SOX'),
                      V2=c('CASP3','TAMOXIFEN','MYC','PTEN','FOX','HIF1','P53'),
                      stringsAsFactors=FALSE)
GIdirect_net<-SpidermiRanalyze_direct_net(data=miRNA_NET,BI=biomark_of_interest)

## [1] "CDC34 is not in the network or please check the correct name"
## [1] "hsa-miR-27a is not in the network or please check the correct name"
## [1] "PEX7 is not in the network or please check the correct name"
## [1] "EPT1 is not in the network or please check the correct name"

The data frame returned by SpidermiRanalyze_direct_net, GIdirect_net, is shown below:

## 'data.frame':    2 obs. of  2 variables:
##  $ V1: chr  "hsa-let-7a" "hsa-miR-5a"
##  $ V2: chr  "CASP3" "FOX"
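To make the selection rule concrete, the same result can be reproduced with a few lines of base R. This is only an illustration of the rule "keep every edge with at least one endpoint in BI", not the package's implementation; the names missing_bi and direct_edges are introduced here.

```r
biomark_of_interest <- c("hsa-let-7a", "CDC34", "hsa-miR-27a", "PEX7",
                         "EPT1", "FOX", "hsa-miR-5a")
miRNA_NET <- data.frame(
  V1 = c("hsa-let-7a", "CASP3", "BRCA", "hsa-miR-7a", "hsa-miR-5a", "SMAD", "SOX"),
  V2 = c("CASP3", "TAMOXIFEN", "MYC", "PTEN", "FOX", "HIF1", "P53"),
  stringsAsFactors = FALSE
)

# Biomarkers that never appear as a node in the network.
nodes <- unique(c(miRNA_NET$V1, miRNA_NET$V2))
missing_bi <- setdiff(biomark_of_interest, nodes)

# Keep edges where at least one endpoint is a biomarker of interest.
keep <- miRNA_NET$V1 %in% biomark_of_interest |
  miRNA_NET$V2 %in% biomark_of_interest
direct_edges <- miRNA_NET[keep, ]
rownames(direct_edges) <- NULL

print(missing_bi)
print(direct_edges)
```

The four names in missing_bi are the ones reported as "not in the network" above, and direct_edges holds the same two rows as GIdirect_net.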

`SpidermiRanalyze_direct_subnetwork`: Network composed by only the nodes in a set of biomarkers of interest

Starting from BI, this function finds sub-networks that include only the direct interactions in which both nodes belong to BI.

subnet<-SpidermiRanalyze_direct_subnetwork(data=miRNA_NET,BI=biomark_of_interest)
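Read literally, the rule keeps an edge only when both of its endpoints are biomarkers of interest. A base R sketch of that reading is given below; it follows the description in the text and may differ in detail from what the package returns. The names bi, net and bi_only_edges are introduced here for illustration.

```r
bi <- c("hsa-let-7a", "CDC34", "hsa-miR-27a", "PEX7", "EPT1", "FOX", "hsa-miR-5a")
net <- data.frame(
  V1 = c("hsa-let-7a", "CASP3", "BRCA", "hsa-miR-7a", "hsa-miR-5a", "SMAD", "SOX"),
  V2 = c("CASP3", "TAMOXIFEN", "MYC", "PTEN", "FOX", "HIF1", "P53"),
  stringsAsFactors = FALSE
)

# Keep edges whose two endpoints are both biomarkers of interest.
bi_only_edges <- net[net$V1 %in% bi & net$V2 %in% bi, ]
rownames(bi_only_edges) <- NULL
print(bi_only_edges)
```

Under this reading, only the hsa-miR-5a / FOX edge survives, because CASP3 (the partner of hsa-let-7a) is not in the BI list.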

`SpidermiRanalyze_subnetwork_neigh`: Network composed of the nodes in the list of BI and all the edges among this bunch of nodes.

Starting from BI, this function finds sub-networks including all direct and indirect interactions involving at least one of BI.

GIdirect_net_neigh<-SpidermiRanalyze_subnetwork_neigh(data=miRNA_NET,BI=biomark_of_interest)
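The description does not say how far "indirect" reaches. One possible reading, sketched below in base R, is a one-step expansion: collect the BI together with their direct neighbours, then keep every edge that touches this enlarged node set. This is an assumption made for illustration and is not taken from the package source; bi, net, neighbourhood and expanded_edges are names introduced here.

```r
bi <- c("hsa-let-7a", "CDC34", "hsa-miR-27a", "PEX7", "EPT1", "FOX", "hsa-miR-5a")
net <- data.frame(
  V1 = c("hsa-let-7a", "CASP3", "BRCA", "hsa-miR-7a", "hsa-miR-5a", "SMAD", "SOX"),
  V2 = c("CASP3", "TAMOXIFEN", "MYC", "PTEN", "FOX", "HIF1", "P53"),
  stringsAsFactors = FALSE
)

# Step 1: edges that touch a biomarker of interest directly.
seed_edges <- net[net$V1 %in% bi | net$V2 %in% bi, ]

# Step 2: the BI plus every node reached through those edges.
neighbourhood <- unique(c(bi, seed_edges$V1, seed_edges$V2))

# Step 3: every edge that touches the enlarged node set.
expanded_edges <- net[net$V1 %in% neighbourhood | net$V2 %in% neighbourhood, ]
rownames(expanded_edges) <- NULL
print(expanded_edges)
```

Compared with the direct-interaction rule, this reading additionally picks up the CASP3 / TAMOXIFEN edge, since CASP3 is a neighbour of hsa-let-7a.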

`SpidermiRanalyze_degree_centrality`: Ranking degree centrality genes

This function finds the number of direct neighbours of a node in a network and allows the selection of those nodes with a number of direct neighbours higher than a selected cut-off.

top10_cent_gene<-SpidermiRanalyze_degree_centrality(miRNA_NET)
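Degree centrality itself is simple to compute by hand, which can help when checking a result. The sketch below counts how often each node appears in either column of the edge list and keeps the nodes above a cut-off. It assumes that each row is one undirected edge with no duplicates; the cut-off value and the names node_degree, cutoff and hubs are illustrative and are not the package's defaults.

```r
miRNA_NET <- data.frame(
  V1 = c("hsa-let-7a", "CASP3", "BRCA", "hsa-miR-7a", "hsa-miR-5a", "SMAD", "SOX"),
  V2 = c("CASP3", "TAMOXIFEN", "MYC", "PTEN", "FOX", "HIF1", "P53"),
  stringsAsFactors = FALSE
)

# Degree of a node = number of edges in which it appears as either endpoint.
node_degree <- sort(table(c(miRNA_NET$V1, miRNA_NET$V2)), decreasing = TRUE)

# Illustrative cut-off: keep nodes with more than one direct neighbour.
cutoff <- 1
hubs <- names(node_degree)[node_degree > cutoff]

print(node_degree)
print(hubs)
```

In this toy network, CASP3 is the only node with two neighbours (hsa-let-7a and TAMOXIFEN); every other node has exactly one.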

`Features databases SpidermiR`:

Features of databases integrated in SpidermiR are:

Table 3: Features
CATEGORY	EXTERNAL DATABASE	VERSION	LAST UPDATE	LINK
Gene network	GeneMania	Current	2017	http://genemania.org/data/current/
Validated miRNA-target	miRwalk	miRwalk2	2015	http://zmf.umm.uni-heidelberg.de/apps/zmf/mirwalk2/downloads/vtm/hsa-vtm-gene.rdata.zip
	miRTarBase	miRTarBase 7	2017	http://mirtarbase.mbc.nctu.edu.tw/cache/download/7.0/miRTarBase_SE_WR.xls
Predicted miRNA-target	DIANA	DIANA- 5.0	2013	https://bioconductor.org/packages/release/bioc/html/miRNAtap.html
	Miranda	N/A	2010	https://bioconductor.org/packages/release/bioc/html/miRNAtap.html
	PicTar	N/A	N/A	https://bioconductor.org/packages/release/bioc/html/miRNAtap.html
	TargetScan	TargetScan7.1	2016	https://bioconductor.org/packages/release/bioc/html/miRNAtap.html
Extracellular Circulating microRNAs	miRandola	miRandola v 02/2017	2017	http://mirandola.iit.cnr.it/download/miRandola_version_02_2017.txt
Pharmaco-miR	DGIdb	N/A	2018	http://dgidb.org/data/interactions.tsv
	MATADOR	N/A	N/A	http://matador.embl.de/media/download/matador.tsv.gz

Session Information

sessionInfo()

## R version 4.1.0 (2021-05-18)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.2 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.13-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.13-bioc/R/lib/libRlapack.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] parallel  stats4    stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
## [1] SpidermiR_1.22.1     miRNAtap_1.26.0      AnnotationDbi_1.54.1
## [4] IRanges_2.26.0       S4Vectors_0.30.0     Biobase_2.52.0      
## [7] BiocGenerics_0.38.0  BiocStyle_2.20.2    
## 
## loaded via a namespace (and not attached):
##   [1] shadowtext_0.0.8            fastmatch_1.1-0            
##   [3] BiocFileCache_2.0.0         plyr_1.8.6                 
##   [5] igraph_1.2.6                lazyeval_0.2.2             
##   [7] splines_4.1.0               BiocParallel_1.26.0        
##   [9] usethis_2.0.1               GenomeInfoDb_1.28.0        
##  [11] ggplot2_3.3.4               digest_0.6.27              
##  [13] htmltools_0.5.1.1           GOSemSim_2.18.0            
##  [15] viridis_0.6.1               GO.db_3.13.0               
##  [17] gdata_2.18.0                fansi_0.5.0                
##  [19] magrittr_2.0.1              memoise_2.0.0              
##  [21] remotes_2.4.0               graphlayouts_0.7.1         
##  [23] Biostrings_2.60.1           readr_1.4.0                
##  [25] matrixStats_0.59.0          R.utils_2.10.1             
##  [27] enrichplot_1.12.1           prettyunits_1.1.1          
##  [29] jpeg_0.1-8.1                colorspace_2.0-1           
##  [31] blob_1.2.1                  rvest_1.0.0                
##  [33] rappdirs_0.3.3              ggrepel_0.9.1              
##  [35] xfun_0.24                   dplyr_1.0.6                
##  [37] callr_3.7.0                 crayon_1.4.1               
##  [39] RCurl_1.98-1.3              jsonlite_1.7.2             
##  [41] scatterpie_0.1.6            ape_5.5                    
##  [43] miRNAtap.db_0.99.10         glue_1.4.2                 
##  [45] polyclip_1.10-0             gtable_0.3.0               
##  [47] zlibbioc_1.38.0             XVector_0.32.0             
##  [49] DelayedArray_0.18.0         pkgbuild_1.2.0             
##  [51] scales_1.1.1                DOSE_3.18.0                
##  [53] DBI_1.1.1                   Rcpp_1.0.6                 
##  [55] viridisLite_0.4.0           progress_1.2.2             
##  [57] tidytree_0.3.4              bit_4.0.4                  
##  [59] sqldf_0.4-11                htmlwidgets_1.5.3          
##  [61] httr_1.4.2                  fgsea_1.18.0               
##  [63] gplots_3.1.1                RColorBrewer_1.1-2         
##  [65] ellipsis_0.3.2              farver_2.1.0               
##  [67] pkgconfig_2.0.3             XML_3.99-0.6               
##  [69] R.methodsS3_1.8.1           sass_0.4.0                 
##  [71] dbplyr_2.1.1                utf8_1.2.1                 
##  [73] reshape2_1.4.4              tidyselect_1.1.1           
##  [75] rlang_0.4.11                munsell_0.5.0              
##  [77] tools_4.1.0                 visNetwork_2.0.9           
##  [79] cachem_1.0.5                downloader_0.4             
##  [81] cli_2.5.0                   gsubfn_0.7                 
##  [83] generics_0.1.0              RSQLite_2.2.7              
##  [85] devtools_2.4.2              evaluate_0.14              
##  [87] stringr_1.4.0               fastmap_1.1.0              
##  [89] yaml_2.2.1                  ggtree_3.0.2               
##  [91] processx_3.5.2              org.Hs.eg.db_3.13.0        
##  [93] knitr_1.33                  bit64_4.0.5                
##  [95] fs_1.5.0                    tidygraph_1.2.0            
##  [97] caTools_1.18.2              purrr_0.3.4                
##  [99] ggraph_2.0.5                KEGGREST_1.32.0            
## [101] TCGAbiolinks_2.20.0         nlme_3.1-152               
## [103] R.oo_1.24.0                 aplot_0.0.6                
## [105] DO.db_2.9                   xml2_1.3.2                 
## [107] biomaRt_2.48.1              MAGeCKFlute_1.12.0         
## [109] compiler_4.1.0              rstudioapi_0.13            
## [111] filelock_1.0.2              curl_4.3.1                 
## [113] png_0.1-7                   testthat_3.0.3             
## [115] treeio_1.16.1               tweenr_1.0.2               
## [117] tibble_3.1.2                bslib_0.2.5.1              
## [119] stringi_1.6.2               highr_0.9                  
## [121] ps_1.6.0                    TCGAbiolinksGUI.data_1.12.0
## [123] desc_1.3.0                  lattice_0.20-44            
## [125] Matrix_1.3-4                vctrs_0.3.8                
## [127] networkD3_0.4               pillar_1.6.1               
## [129] lifecycle_1.0.0             BiocManager_1.30.16        
## [131] jquerylib_0.1.4             cowplot_1.1.1              
## [133] data.table_1.14.0           bitops_1.0-7               
## [135] patchwork_1.1.1             qvalue_2.24.0              
## [137] GenomicRanges_1.44.0        R6_2.5.0                   
## [139] latticeExtra_0.6-29         bookdown_0.22              
## [141] KernSmooth_2.23-20          gridExtra_2.3              
## [143] sessioninfo_1.1.1           MASS_7.3-54                
## [145] gtools_3.9.2                assertthat_0.2.1           
## [147] pkgload_1.2.1               chron_2.3-56               
## [149] SummarizedExperiment_1.22.0 proto_1.0.0                
## [151] rprojroot_2.0.2             withr_2.4.2                
## [153] GenomeInfoDbData_1.2.6      hms_1.1.0                  
## [155] clusterProfiler_4.0.0       grid_4.1.0                 
## [157] tidyr_1.1.3                 rvcheck_0.1.8              
## [159] rmarkdown_2.9               MatrixGenerics_1.4.0       
## [161] ggforce_0.3.3

References

Dweep H, Sticht C, Pandey P, Gretz N. 2011. “miRWalk - Database: Prediction of Possible miRNA Binding Sites by ‘Walking’ the Genes of 3 Genomes.”

Jiang Q, Wang Y, Hao Y, Juan L, Teng M, Zhang X, Li M, Wang G, Liu Y. 2009. “miR2Disease: A Manually Curated Database for microRNA Deregulation in Human Disease.”

Pajak M, Simpson TI. 2019. “miRNAtap: microRNA Targets - Aggregated Predictions.”

Warde-Farley D, Donaldson S, Comes O, Zuberi K, Badrawi R, and others. 2010. “The GeneMANIA Prediction Server: Biological Network Integration for Gene Prioritization and Predicting Gene Function.”

Working with SpidermiR package

2021-06-17
