find_cna_driven_gene {iGC} | R Documentation |
The function finds CNA-driven differentially expressed genes and returns the corresponding p-values, false discovery rates, and associated statistics. The result includes three tables, which collect information for gain-driven, loss-driven, and both-driven genes.
find_cna_driven_gene(gene_cna, gene_exp, gain_prop = 0.2, loss_prop = 0.2, progress = TRUE, progress_width = 32, parallel = FALSE)
gene_cna |
Joint CNA table from create_gene_cna. |
gene_exp |
Joint gene expression table from create_gene_exp. |
gain_prop |
Minimum proportion of gain samples for a gene to be considered CNA-gain. Default is 0.2. |
loss_prop |
Minimum proportion of loss samples for a gene to be considered CNA-loss. Default is 0.2. |
progress |
Whether to display a progress bar. Default is TRUE. |
progress_width |
The text width of the progress bar. Default is 32 characters, as given in the usage above. |
parallel |
Whether to enable parallel execution through plyr. A parallel backend must be registered beforehand. Default is FALSE. See the example for more information. |
A gene is considered CNA-gain if the proportion of samples exhibiting gain, that is, the proportion of samples having gain_loss = 1, exceeds the threshold gain_prop. Conversely, a gene is considered CNA-loss if the proportion of samples having gain_loss = -1 exceeds the threshold loss_prop.
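The threshold rule can be illustrated with a minimal base R sketch. The gene, its ten gain/loss calls, and the variable names are hypothetical. The strict `>` comparison follows the word "exceeds" in the description; how the package treats a proportion exactly equal to the threshold, or NA calls, is not stated here and is an assumption of this sketch.

```r
## Hypothetical gain/loss calls for one gene across ten samples:
## 1 = gain, -1 = loss, 0 = neutral.
gain_loss <- c(1, 1, 1, -1, 0, 0, 0, 0, 0, 0)

gain_prop <- 0.2
loss_prop <- 0.2

## Proportion of samples in each state.
prop_gain <- mean(gain_loss == 1)   # 3 of 10 samples
prop_loss <- mean(gain_loss == -1)  # 1 of 10 samples

## Assumption: strict comparison against the threshold.
is_cna_gain <- prop_gain > gain_prop
is_cna_loss <- prop_loss > loss_prop
```

In this sketch the gene would be treated as CNA-gain but not as CNA-loss.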
When performing the t-test, sample grouping depends on whether the analysis scenario is CNA-gain driven or CNA-loss driven. In the CNA-gain driven scenario, samples are split into two groups: the CNA-gain samples and all other samples. In the CNA-loss driven scenario, the two groups are the CNA-loss samples and all other samples. Genes that appear in both scenarios are collected into a third table and excluded from their original tables.
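The grouping for the CNA-gain driven scenario might look like the following base R sketch, which uses the BRCA2 values from the example data. It calls `t.test` with its defaults (Welch's unequal-variance test); which t-test variant and which multiple-testing correction the package applies internally are not stated here, so treat those choices as assumptions.

```r
## BRCA2 expression and gain/loss calls for samples A to E.
expression <- c(A = -0.95, B = 1.72, C = -1.18, D = -1.24, E = 1.01)
gain_loss  <- c(A = 1,     B = -1,   C = 1,     D = 1,     E = 0)

## CNA-gain driven scenario: gain samples versus all other samples.
in_gain <- gain_loss == 1
gain_test <- t.test(expression[in_gain], expression[!in_gain])

gain_test$statistic  # t statistic
gain_test$p.value    # unadjusted p-value
```

The CNA-loss driven scenario would use `gain_loss == -1` to define the first group in the same way. With only one loss sample in this toy gene, a t-test on that split is not possible, so the sketch shows the gain scenario only.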
See the vignette for a thorough usage example of this function.
List of three data.table objects for CNA-driven scenarios: gain, loss, and both, which can be accessed by names: 'gain_driven', 'loss_driven' and 'both'.
require(data.table)

## Create gene_exp and gene_cna manually. The following shows an example
## consisting of 3 genes (BRCA2, TP53, and GNPAT) and 5 samples (A to E).
gene_exp <- data.table(
  GENE = c("BRCA2", "TP53", "GNPAT"),
  A = c(-0.95, 0.89, 0.21),
  B = c(1.72, -0.05, NA),
  C = c(-1.18, 1.15, 2.47),
  D = c(-1.24, -0.07, 1.2),
  E = c(1.01, 0.93, 1.54)
)
gene_cna <- data.table(
  GENE = c("BRCA2", "TP53", "GNPAT"),
  A = c(1, 1, NA),
  B = c(-1, -1, 1),
  C = c(1, -1, 1),
  D = c(1, -1, -1),
  E = c(0, 0, -1)
)

## Find CNA-driven genes
cna_driven_genes <- find_cna_driven_gene(
  gene_cna, gene_exp, progress = FALSE
)

# Gain driven genes
cna_driven_genes$gain_driven

# Loss driven genes
cna_driven_genes$loss_driven

# Gene shown in both gain and loss records
cna_driven_genes$both
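For the parallel argument, plyr dispatches parallel work through a foreach backend, so one must be registered before the call. The following is a minimal sketch, assuming the doParallel package is installed. The choice of doParallel and of two workers is an assumption of this sketch; any other foreach backend, such as doMC, should serve the same purpose.

```r
## Assumption: doParallel is available; check before using it.
backend_ready <- requireNamespace("doParallel", quietly = TRUE)

if (backend_ready) {
  ## Register two workers for plyr to use through foreach.
  doParallel::registerDoParallel(cores = 2)

  ## With iGC loaded and gene_cna / gene_exp from the example in scope,
  ## request parallel execution.
  if (exists("find_cna_driven_gene")) {
    cna_driven_genes <- find_cna_driven_gene(
      gene_cna, gene_exp, progress = FALSE, parallel = TRUE
    )
  }

  ## Release the implicitly created workers.
  doParallel::stopImplicitCluster()
}
```

For a three-gene toy table the overhead of starting workers will likely outweigh any gain; parallel execution is mainly of interest for genome-scale tables.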