reconsi {reconsi} | R Documentation |
Perform simultaneous inference through collapsed resampling null distributions
reconsi(Y, x = NULL, B = 1000L, test = "wilcox.test", argList = list(), distFun = "pnorm", quantileFun = "qnorm", densFun = NULL, zValues = TRUE, testPargs = list(), z0Quant = pnorm(c(-1, 1)), gridsize = 801L, maxIter = 1000L, tol = 1e-08, center = FALSE, replace = is.null(x), zVals = NULL, estP0args = list(z0quantRange = seq(0.05, 0.45, 0.0125), smooth.df = 3), permZvals = FALSE, smoothObs = TRUE, tieBreakRan = identical(test, "wilcox.test"), warnConvergence = TRUE)
Y |
the matrix of sequencing counts |
x |
a grouping factor. If provided, this grouping factor is permuted. Otherwise a bootstrap procedure is performed |
B |
the number of resampling instances |
test |
Character string, giving the name of the function to test for differential absolute abundance. Must accept the formula interface. Features with tests resulting in observed NA test statistics will be discarded |
argList |
A list of arguments, passed on to the testing function |
distFun, quantileFun, densFun |
the distribution, quantile and density functions of the test statistic, or its name. Must at least accept an argument named 'q', 'p' and 'x' respectively. |
zValues |
A boolean, should test statistics be converted to z-values. See details |
testPargs |
A list of arguments passed on to distFun/quantileFun/densFun |
z0Quant |
A vector of length 2 of quantiles of the null distribution, in between which only null values are expected |
gridsize |
The number of bins for the kernel density estimates |
maxIter |
An integer, the maximum number of iterations in the estimation of the null distribution |
tol |
The tolerance for the infinity norm of the central borders in the iterative procedure |
center |
A boolean, should observations be centered in each group prior to permuations? See details. |
replace |
A boolean. Should resampling occur with replacement (boostrap) or without replacement (permutation) |
zVals |
An optional list of observed (statObs) and resampling (statsPerm) z-values. If supplied, the calculation of the observed and resampling test statistics is skipped and the algorithm proceeds with calculation of the consensus null distribution |
estP0args |
A list of arguments passed on to the estP0 function |
permZvals |
A boolean, should resampling rather than theoretical null distributions be used? |
smoothObs |
A boolean, should the fitted rather than estimated observed distribution be used in the Fdr calculation? |
tieBreakRan |
A boolean, should ties of resampling test statistics be broken randomly? If not, midranks are used |
warnConvergence |
Should a warning be thrown when the estimation of the random null does not converge? |
Efron (2007) centers the observations in each group prior to permutation. As permutations will remove any genuine group differences anyway, we skip this step by default. If zValues = FALSE, the density is fitted on the original test statistics rather than converted to z-values. This unlocks the procedure for test statistics with unknown distributions, but may be numerically less stable.
A list with entries
statsPerm |
Resampling Z-values |
statObs |
Observed Z-values |
weightsStrat |
Weighting strategy used |
densFun,distFun,quantileFun |
Density, distribution and quantile function as given |
testPargs |
Same as given |
weightStrat |
The weighting strategy |
zValues |
z-values |
permZvals |
z-values from resampling null distribution |
cdfValObs |
Cumulative distribution function evaluation of observed test statistics |
Ideally, it would be better to only use unique resamples, to avoid unnecesarry replicated calculations of the same test statistics. Yet this issue is almost alwyas ignored in practice; as the sample size grows it also becomes irrelevant. Notice also that this would require to place weights in case of the bootstrap, as some bootstrap samples are more likely than others.
#Important notice: low number of resamples B necessary to keep # computation time down, but not recommended. Pray set B at 200 or higher. p = 80; n = 20; B = 2e1 x = rep(c(0,1), each = n/2) mat = cbind( matrix(rnorm(n*p/10, mean = 5+x), n, p/10), #DA matrix(rnorm(n*p*9/10, mean = 5), n, p*9/10) #Non DA ) fdrRes = reconsi(mat, x, B = B) fdrRes$p0 #Indeed close to 0.9 estFdr = fdrRes$Fdr #The estimated tail-area false discovery rates. #With another type of test. Need to supply quantile function in this case fdrResLm = reconsi(mat, x, B = B, test = function(x, y){ fit = lm(y~x) c(summary(fit)$coef["x","t value"], fit$df.residual)}, distFun = function(q){pt(q = q[1], df = q[2])}) #With a test statistic without known null distribution(for small samples) fdrResKruskal = reconsi(mat, x, B = B, test = function(x, y){ kruskal.test(y~x)$statistic}, zValues = FALSE) #Provide an additional covariate through the 'argList' argument z = rpois(n , lambda = 2) fdrResLmZ = reconsi(mat, x, B = B, test = function(x, y, z){ fit = lm(y~x+z) c(summary(fit)$coef["x","t value"], fit$df.residual)}, distFun = function(q){pt(q = q[1], df = q[2])}, argList = list(z = z)) #When nog grouping variable is provided, a bootstrap is performed matBoot = cbind( matrix(rnorm(n*p/10, mean = 1), n, p/10), #DA matrix(rnorm(n*p*9/10, mean = 0), n, p*9/10) #Non DA ) fdrResBoot = reconsi(matBoot, B = B, test = function(y, x){testRes = t.test(y, mu = 0, var.equal = TRUE); c(testRes$statistic, testRes$parameter)}, distFun = function(q){pt(q = q[1], df = q[2])}, center = TRUE, replace = TRUE)