LINKER_runPhase1 {TraRe}    R Documentation

Phase I: module generation

Description

Runs the first phase of the LINKER method, which generates K modules of similarly expressed target genes and relates each module to a linear combination of very few regulators, according to the selected model. The helper functions documented on this page are:

LINKER_init() runs k-means on the training set to generate an initial set of clusters containing drivers and target genes.

LINKER_ReassignGenesToClusters() reassigns genes based on the closest match to the new regulatory programs. This function is called inside LINKER_run(), so running it on its own is not recommended.

LINKER_corrClust() loops over two steps: learning the regulatory program of each module and reassigning genes.

LINKER_extract_modules() extracts all the modules, their genes and the relevant information.

LINKER_EvaluateTestSet() fits the selected model to the test data.

LINKER_LearnRegulatoryPrograms() learns the regulatory program of the modules.

Usage

LINKER_runPhase1(
  lognorm_est_counts,
  target_filtered_idx,
  regulator_filtered_idx,
  nassay = 1,
  regulator = "regulator",
  NrModules,
  Lambda = 1e-04,
  alpha = 1 - 1e-06,
  pmax = 10,
  mode = "VBSR",
  used_method = "MEAN",
  NrCores = 1,
  corrClustNrIter = 100,
  Nr_bootstraps = 1
)

LINKER_init(
  MA_matrix_Var,
  RegulatorData,
  NrModules,
  NrCores = 3,
  corrClustNrIter = 21,
  Parameters
)

LINKER_ReassignGenesToClusters(
  Data,
  RegulatorData,
  Beta,
  Clusters,
  NrCores = 1
)

LINKER_corrClust(LINKERinit)

LINKER_extract_modules(results)

LINKER_EvaluateTestSet(
  LINKERresults,
  MA_Data_TestSet,
  RegulatorData_TestSet,
  used_method = "MEAN"
)

LINKER_LearnRegulatoryPrograms(
  Data,
  Clusters,
  RegulatorData,
  Lambda,
  alpha,
  pmax,
  mode,
  used_method = "MEAN",
  NrCores = 1
)

Arguments

lognorm_est_counts

Matrix of log-normalized estimated counts of the gene expression data (Nr Genes x Nr samples)

target_filtered_idx

Index array of the target genes in the lognorm_est_counts matrix. Used when lognorm_est_counts is not a SummarizedExperiment object.

regulator_filtered_idx

Index array of the regulator genes in the lognorm_est_counts matrix. Used when lognorm_est_counts is not a SummarizedExperiment object.

nassay

If a SummarizedExperiment object is passed as lognorm_est_counts, name or index of the assay containing the desired matrix. Default: 1 (first item in the assay list).

regulator

If a SummarizedExperiment object is passed as lognorm_est_counts, name of the rowData() variable used to build target_filtered_idx and regulator_filtered_idx. This variable must be one for driver genes and zero for target genes. Default: 'regulator'.

NrModules

Number of modules to look for a priori (note that the final number of modules discovered may differ from this value). The signatures under Usage set no default, so a value must be supplied.

Lambda

Lambda parameter for the LASSO models. Default: 1e-04 in LINKER_runPhase1().

alpha

Alpha parameter for the LASSO models. Default: 1 - 1e-06 in LINKER_runPhase1().

pmax

Maximum number of regulators wanted. Default: 10 in LINKER_runPhase1().

mode

Chosen method(s) to link module eigengenes to regulators. The available options are 'VBSR', 'LASSOmin', 'LASSO1se' and 'LM'. Default: 'VBSR'.

used_method

Method selected for use. Default set to MEAN.

NrCores

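Number of computer cores for the parallel parts of the method. Note that the parallelization is NOT initialized in any of the functions. The default differs per function (see Usage): 1 in LINKER_runPhase1(), LINKER_ReassignGenesToClusters() and LINKER_LearnRegulatoryPrograms(), and 3 in LINKER_init().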

corrClustNrIter

Number of iterations for the Phase I part of the method. Default: 100 in LINKER_runPhase1() and 21 in LINKER_init().

Nr_bootstraps

Number of bootstraps of Phase I. Default: 1.

MA_matrix_Var

Matrix of log-normalized estimated counts of the gene expression data, centered and scaled, containing only the train samples.

RegulatorData

Expression matrix containing only the regulators of the train samples.

Parameters

List of parameters containing lambda, pmax, alpha, mode and used method.

Data

Matrix of log-normalized estimated counts of the gene expression data, centered and scaled, containing only the train samples.

Beta

Coefficient on which the decision of reassigning genes is based.

Clusters

Clusters generated by the LINKER_init() function.

LINKERinit

Initialization object obtained from LINKER_init().

results

Matrix of log-normalized estimated counts of the gene expression data (Nr Genes x Nr samples).

LINKERresults

List containing the number of clusters, the regulatory program, the names of the regulators and of all genes, and the module membership.

MA_Data_TestSet

Matrix of log-normalized estimated counts of the gene expression data, centered and scaled, containing only the test samples.

RegulatorData_TestSet

Expression matrix containing only the regulators of the test samples.

Value

igraph object containing the modules, each with its related drivers and targets, within bootstraps.

Examples


   ## This example is very similar to the one for the `LINKER_run()` function.
   ## Again, we join driver and target genes to create the working dataset.

   drivers <- readRDS(system.file('extdata', 'tfs_linker_example.rds', package='TraRe'))
   targets <- readRDS(system.file('extdata', 'targets_linker_example.rds', package='TraRe'))

   lognorm_est_counts <- rbind(drivers,targets)
   ## We create the index for drivers and targets in the log-normalized gene expression matrix.

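   ## Descriptive names are used instead of T, which would mask R's shorthand for TRUE.
   n_regulators <- 60
   n_targets <- 200

   regulator_filtered_idx <- seq_len(n_regulators)
   target_filtered_idx <- n_regulators + seq_len(n_targets)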

   ## We recommend using the default values of the function.
   ## For the sake of time, we select faster (and worse) ones here.

   linkeroutput <- LINKER_runPhase1(lognorm_est_counts,target_filtered_idx=target_filtered_idx,
                                    regulator_filtered_idx=regulator_filtered_idx, NrModules=2,
                                    mode='LASSOmin',NrCores=2, corrClustNrIter=10,Nr_bootstraps=1)
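
   ## The block below is only an illustrative sketch of the Phase I idea given in
   ## the Description (k-means initialization, then alternating between learning
   ## one regulatory program per module and reassigning genes). It is NOT the
   ## TraRe implementation: it uses base R only, a plain lm() fit on the module
   ## mean stands in for the 'VBSR'/'LASSO' models, and toy_regs, toy_expr and
   ## toy_clusters are hypothetical names holding simulated data.

   set.seed(1)
   n_samples <- 30

   ## Three toy regulators (rows) measured over n_samples samples (columns).
   toy_regs <- matrix(rnorm(3 * n_samples), nrow = 3,
                      dimnames = list(paste0('reg', 1:3), NULL))

   ## Twenty toy targets: ten follow reg1 and ten follow reg2, plus noise.
   toy_expr <- rbind(
     t(replicate(10, toy_regs['reg1', ] + rnorm(n_samples, sd = 0.3))),
     t(replicate(10, toy_regs['reg2', ] + rnorm(n_samples, sd = 0.3))))
   rownames(toy_expr) <- paste0('target', 1:20)

   ## Center and scale every gene across samples.
   toy_expr <- t(scale(t(toy_expr)))

   ## Initialization: k-means on the target genes.
   toy_clusters <- kmeans(toy_expr, centers = 2, nstart = 5)$cluster

   for (iter in seq_len(5)) {
     ## Step 1: learn one program per module by regressing the module mean
     ## on the regulators and keeping the fitted values.
     fitted_programs <- sapply(sort(unique(toy_clusters)), function(k) {
       module_mean <- colMeans(toy_expr[toy_clusters == k, , drop = FALSE])
       fitted(lm(module_mean ~ t(toy_regs)))
     })

     ## Step 2: reassign each gene to the program it correlates with best.
     new_clusters <- max.col(cor(t(toy_expr), fitted_programs),
                             ties.method = 'first')

     ## Stop once the assignment no longer changes.
     if (all(new_clusters == toy_clusters)) break
     toy_clusters <- new_clusters
   }

   ## Module membership of the toy targets.
   table(toy_clusters)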


[Package TraRe version 1.2.0 Index]