rForest {statTarget} | R Documentation |
rForest provides Breiman's random forest algorithm for classification, together with permutation-based variable importance measures (PIMP algorithm).
rForest(file, ntree = 100, times = 100, gDist = TRUE, seed = 123, ...)
file |
A data frame or a 'Stat File' from the statTarget software. |
ntree |
Number of trees to grow. This should not be set to too small a number, to ensure that every input row gets predicted at least a few times. |
times |
The number of permutations for permutation-based variable importance measures. |
gDist |
If gDist is TRUE, the null importance distributions are approximated with Gaussian distributions; otherwise they are approximated with empirical cumulative distributions. |
seed |
Random seed. Fixing it yields the same random draws and therefore reproducible results. |
... |
Additional arguments for functions in the randomForest package. |
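The effect of gDist can be illustrated with a minimal base-R sketch. It is not the package's internal code: the simulated null importances `null_imp`, the observed value `obs_imp`, and all other names are assumptions made purely for illustration.

```r
# Illustrative sketch only: simulated data, hypothetical names.
set.seed(123)

# Pretend these are 100 permuted importance values for one predictor
# (the kind of null sample that times = 100 permutations would give).
null_imp <- rnorm(100, mean = 0.5, sd = 0.1)

# Pretend this is the importance observed on the unpermuted data.
obs_imp <- 0.75

# gDist = TRUE analogue: fit a Gaussian to the null values and take
# the upper-tail probability of the observed importance.
p_gauss <- pnorm(obs_imp, mean = mean(null_imp), sd = sd(null_imp),
                 lower.tail = FALSE)

# gDist = FALSE analogue: use the empirical distribution, i.e. the
# fraction of null values at least as large as the observed one.
p_emp <- mean(null_imp >= obs_imp)

print(c(gaussian = p_gauss, empirical = p_emp))
```

In this sketch the empirical estimate can only move in steps of 1/100 (one over the number of permutations), whereas the Gaussian approximation gives a continuous tail probability.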
Objects Two objects from statTarget_rForest: (1) randomForest (rfModel); (2) PIMPresult (pimpModel).
VarImp The original Gini importance
PerVarImp A matrix in which each row holds the permuted VarImp measures for one predictor variable.
p-value The probability of observing the original VarImp or a larger value, given the fitted null importance distribution.
p.ks.test The p-values of the Kolmogorov-Smirnov tests for each row of PerVarImp.
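How these values relate to one another can be sketched in base R. This is an illustration under stated assumptions, not the implementation used by statTarget or vita: the matrix `per_var_imp`, the vector `var_imp`, and the predictor names `m1` to `m3` are simulated stand-ins for PerVarImp and VarImp.

```r
# Illustrative sketch only: simulated data, hypothetical names.
set.seed(123)
n_pred <- 3    # number of predictors (assumed)
n_perm <- 50   # number of permutations (assumed)

# Stand-in for PerVarImp: one row of permuted importances per predictor.
per_var_imp <- matrix(rnorm(n_pred * n_perm, mean = 1, sd = 0.2),
                      nrow = n_pred,
                      dimnames = list(c("m1", "m2", "m3"), NULL))

# Stand-in for VarImp: importance observed on the unpermuted data.
var_imp <- c(m1 = 1.8, m2 = 1.0, m3 = 1.3)

# p-value per predictor: upper tail of a Gaussian fitted to its null row.
p_value <- sapply(seq_len(n_pred), function(i) {
  x <- per_var_imp[i, ]
  pnorm(unname(var_imp[i]), mean = mean(x), sd = sd(x), lower.tail = FALSE)
})
names(p_value) <- rownames(per_var_imp)

# Kolmogorov-Smirnov test per row: is the null sample compatible with
# a Gaussian that has the row's own mean and standard deviation?
p_ks <- apply(per_var_imp, 1, function(x) {
  ks.test(x, "pnorm", mean(x), sd(x))$p.value
})

print(round(cbind(p_value, p_ks), 4))
```

A small Kolmogorov-Smirnov p-value for a row suggests that the Gaussian approximation fits that predictor's null distribution poorly, in which case the empirical distribution (gDist = FALSE) may be the more cautious choice.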
Hemi Luan, hemi.luan@gmail.com
Altmann A., Tolosi L., Sander O. and Lengauer T. (2010) Permutation importance: a corrected feature importance measure, Bioinformatics 26 (10), 1340-1347.
Ender Celik (2015) vita: Variable Importance Testing Approaches. R package version 1.0.0. https://CRAN.R-project.org/package=vita
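# Locate the example data shipped with statTarget
datpath <- system.file('extdata', package = 'statTarget')
statFile <- paste(datpath, 'data_example.csv', sep = '/')
getFile <- read.csv(statFile, header = TRUE)
# Small ntree and times keep the example fast
rFtest <- rForest(getFile, ntree = 10, times = 5)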