getPatientPredictions {netDx} | R Documentation |
Calculates patient-level classification accuracy across train/test splits
getPatientPredictions(predFiles, pheno, plotAccuracy = FALSE)
predFiles |
(char) vector of paths to all test predictions (e.g. 100 files for a design with 100 train/test splits). Alternatively, the user can provide a single directory name and let the function retrieve the prediction files. The expected format is 'rootDir/rngX/predictionResults.txt' |
pheno |
(data.frame) ID=patient ID, STATUS=ground truth (known class label). This table is required to get the master list of all patients, as not every patient is classified in every split. |
plotAccuracy |
(logical) if TRUE, shows fraction of times patient is misclassified, using a dot plot |
Takes all the predictions across the different train/test splits, and for each patient, generates a score indicating how many times they were classified by netDx as belonging to each of the classes. The result is that we get a measure of individual classification accuracy across the different train/test splits.
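The tallying described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the netDx implementation: the toy patients, the class names, and the PRED_CLASS column name are all hypothetical, and no netDx function is called.

```r
# Hypothetical toy data: 4 patients with known class labels.
pheno <- data.frame(
  ID = c("P1", "P2", "P3", "P4"),
  STATUS = c("LumA", "LumA", "other", "other"),
  stringsAsFactors = FALSE
)

# One table per train/test split, listing only the patients tested in
# that split. The PRED_CLASS column name is an assumption for this sketch.
split_preds <- list(
  data.frame(ID = c("P1", "P3"), PRED_CLASS = c("LumA", "LumA"),
             stringsAsFactors = FALSE),
  data.frame(ID = c("P1", "P2", "P4"),
             PRED_CLASS = c("LumA", "other", "other"),
             stringsAsFactors = FALSE),
  data.frame(ID = c("P2", "P3", "P4"),
             PRED_CLASS = c("LumA", "other", "other"),
             stringsAsFactors = FALSE)
)

# Patient-by-split matrix of predicted labels; NA marks splits in which
# the patient was not part of the test set.
pred_labels <- vapply(split_preds, function(p) {
  p$PRED_CLASS[match(pheno$ID, p$ID)]
}, character(nrow(pheno)))
rownames(pred_labels) <- pheno$ID

# For each patient, count how many times each class was predicted.
classes <- sort(unique(pheno$STATUS))
class_counts <- sapply(classes, function(cl) {
  rowSums(pred_labels == cl, na.rm = TRUE)
})

# Fraction of tested splits in which the prediction matched the true label.
pct_correct <- rowSums(pred_labels == pheno$STATUS, na.rm = TRUE) /
  rowSums(!is.na(pred_labels))
```

Using the master patient list from pheno as the row index is what lets patients who were never tested in a given split appear as NA rather than being dropped.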
(list) of length 2. 1) (data.frame) rows are patients, with (length(predFiles)+2) columns. The columns are: ID, REAL_STATUS, predStatus1, ..., predStatusN. Each predStatus column holds the predicted labels for one split (NA if the patient was a training sample for that split). When plotAccuracy is TRUE, a dot plot is drawn as a side effect.
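As a rough illustration of how a table in this layout might be summarized, the sketch below builds a hypothetical data frame with the documented column names (ID, REAL_STATUS, predStatus1, ...) and computes the fraction of tested splits in which each patient was misclassified. The data values and the names n_tested and frac_misclassified are assumptions made for the example; they are not produced by netDx.

```r
# Hypothetical table in the documented layout: 3 patients, 3 splits.
pred_df <- data.frame(
  ID = c("P1", "P2", "P3"),
  REAL_STATUS = c("LumA", "other", "LumA"),
  predStatus1 = c("LumA", NA, "other"),
  predStatus2 = c(NA, "other", "other"),
  predStatus3 = c("LumA", "LumA", NA),
  stringsAsFactors = FALSE
)

# Select the per-split prediction columns by their name prefix.
pred_cols <- grep("^predStatus", colnames(pred_df), value = TRUE)
preds <- as.matrix(pred_df[, pred_cols])

# Number of splits in which each patient was tested (non-NA entries).
n_tested <- rowSums(!is.na(preds))

# Number of tested splits in which the prediction disagreed with the truth.
n_wrong <- rowSums(preds != pred_df$REAL_STATUS, na.rm = TRUE)

summary_df <- data.frame(
  ID = pred_df$ID,
  n_tested = n_tested,
  frac_misclassified = n_wrong / n_tested
)
```

Dividing by n_tested rather than by the total number of splits keeps the fraction comparable between patients who were tested different numbers of times.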
inDir <- system.file("extdata", "example_output", package = "netDx")
data(pheno)
all_rngs <- list.dirs(inDir, recursive = FALSE)
all_pred_files <- unlist(lapply(all_rngs, function(x) {
  paste(x, 'predictionResults.txt', sep = getFileSep())
}))
pred_mat <- getPatientPredictions(all_pred_files, pheno)