avg_probe_exp {multiClust} | R Documentation |
Function to produce a matrix containing the average expression of each gene probe within each sample cluster.
avg_probe_exp(sel.exp, samp_cluster, data_name, cluster_type = "HClust", distance = "euclidean", linkage_type = "ward.D2", probe_rank = "SD_Rank", probe_num_selection = "Fixed_Probe_Num", cluster_num_selection = "Fixed_Clust_Num")
sel.exp |
Object containing the numeric selected gene expression matrix. This object is an output of the probe_ranking function. |
samp_cluster |
Object vector containing the samples and the cluster number they belong to. This is an output of the cluster_analysis function. |
data_name |
String indicating the cancer type and name of the dataset being analyzed. This name will be used to label the sample dendrograms and heatmap files. |
cluster_type |
String indicating the type of clustering method used in the cluster_analysis function. "Kmeans" or "HClust" are the two options. |
distance |
String describing the distance metric used for HClust in the cluster_analysis function. Options are "euclidean", "maximum", "manhattan", "canberra", "binary", or "minkowski". |
linkage_type |
String describing the linkage metric used in the cluster_analysis function. Options include "ward.D2", "average", "complete", "median", "centroid", "single", and "mcquitty". |
probe_rank |
String indicating the feature selection method used in the probe_ranking function. Options include "CV_Rank", "CV_Guided", "SD_Rank", and "Poly". |
probe_num_selection |
String indicating the way in which probes were selected in the number_probes function. Options include "Fixed_Probe_Num", "Percent_Probe_Num", and "Adaptive_Probe_Num". |
cluster_num_selection |
String indicating how the number of clusters was determined in the number_clusters function. Options include "Fixed_Clust_Num" and "Gap_Statistic". |
Returns a matrix containing the mean expression of each probe within each sample cluster. The matrix is also written to a text file.
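The averaging described above can be illustrated with a minimal base R sketch. This is not the package's implementation. The toy matrix `expr`, the named vector `clusters`, and the `Cluster_` column labels are all hypothetical, and the sketch assumes that probes are rows, samples are columns, and the cluster assignment is a vector named by sample. Unlike avg_probe_exp, it does not write a text file.

```r
# Toy expression matrix: 4 probes (rows) x 6 samples (columns).
# All names and values here are made up for illustration.
set.seed(1)
expr <- matrix(rnorm(24), nrow = 4,
               dimnames = list(paste0("probe", 1:4), paste0("sample", 1:6)))

# Hypothetical cluster assignment: one cluster number per sample,
# named by sample (assumed layout, not confirmed by the package docs).
clusters <- c(sample1 = 1, sample2 = 1, sample3 = 2,
              sample4 = 2, sample5 = 3, sample6 = 3)

# Align the assignments with the matrix columns.
clusters <- clusters[colnames(expr)]
cluster_ids <- sort(unique(clusters))

# For each cluster, average every probe over the samples in that cluster.
avg <- sapply(cluster_ids, function(k) {
  rowMeans(expr[, clusters == k, drop = FALSE])
})
colnames(avg) <- paste0("Cluster_", cluster_ids)

# Result: one row per probe, one column per cluster.
print(dim(avg))
```

The `drop = FALSE` argument keeps a single-sample cluster as a one-column matrix, so `rowMeans` still works when a cluster contains only one sample.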
Nathan Lawlor, Alec Fabbri
number_clusters, number_probes, probe_ranking, cluster_analysis
# Produce matrix of average expression of each probe in each cluster

# Load in a data file
data_file <- system.file("extdata", "GSE2034.normalized.expression.txt",
    package="multiClust")
data <- input_file(input=data_file)

# Choose 300 genes to select for
gene_num <- number_probes(input=data_file, data.exp=data, Fixed=300,
    Percent=NULL, Adaptive=NULL)

# Choose the "CV_Rank" method for gene ranking
sel.data <- probe_ranking(input=data_file, probe_number=300,
    probe_num_selection="Fixed_Probe_Num", data.exp=data, method="CV_Rank")

# Choose a fixed cluster number of 3
clust_num <- number_clusters(data.exp=data, Fixed=3, gap_statistic=NULL)

# Call function for Kmeans parameters
kmeans_analysis <- cluster_analysis(sel.exp=sel.data, cluster_type="Kmeans",
    distance=NULL, linkage_type=NULL, gene_distance=NULL,
    num_clusters=3, data_name="GSE2034 Breast", probe_rank="CV_Rank",
    probe_num_selection="Fixed_Probe_Num",
    cluster_num_selection="Fixed_Clust_Num")

# Call function for average matrix expression calculation
# probe_num_selection uses "Fixed_Probe_Num", matching the documented options
avg_matrix <- avg_probe_exp(sel.exp=sel.data, samp_cluster=kmeans_analysis,
    data_name="GSE2034 Breast", cluster_type="Kmeans", distance=NULL,
    linkage_type=NULL, probe_rank="CV_Rank",
    probe_num_selection="Fixed_Probe_Num",
    cluster_num_selection="Fixed_Clust_Num")