stabilityRange {evaluomeR} | R Documentation |
This analysis estimates whether the clustering is meaningfully
affected by small variations in the sample. For each value of k in a given range (k.range),
a clustering is carried out with the algorithm selected by cbi (k-means by default).
The stability index is then the mean of the Jaccard coefficient
values over bs bootstrap replicates. The values lie in the range [0,1],
with the following meaning:
Unstable: [0, 0.60[.
Doubtful: [0.60, 0.75].
Stable: ]0.75, 0.85].
Highly Stable: ]0.85, 1].
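The following base R sketch illustrates the idea behind the index and its interpretation. It is not evaluomeR's implementation: the helper names jaccard and classify_stability, the toy data, and the choice to use each resampled point only once are assumptions made for illustration.

```r
# Jaccard coefficient between two sets of observation indices.
jaccard <- function(a, b) length(intersect(a, b)) / length(union(a, b))

# Hypothetical helper (not part of evaluomeR): label an index in [0, 1]
# according to the intervals listed above.
classify_stability <- function(index) {
  stopifnot(is.numeric(index), all(index >= 0 & index <= 1))
  ifelse(index < 0.60, "Unstable",
  ifelse(index <= 0.75, "Doubtful",
  ifelse(index <= 0.85, "Stable", "Highly Stable")))
}

set.seed(1)  # toy data: two well-separated groups of 50 points each
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 5), ncol = 2))
k <- 2    # number of clusters
bs <- 20  # number of bootstrap replicates

# Reference clustering on the full sample.
base <- kmeans(x, centers = k, nstart = 5)$cluster

scores <- matrix(NA_real_, nrow = bs, ncol = k)
for (b in seq_len(bs)) {
  # Simplification (assumption): points drawn more than once are used once.
  idx <- unique(sample(nrow(x), replace = TRUE))
  boot <- kmeans(x[idx, , drop = FALSE], centers = k, nstart = 5)$cluster
  for (i in seq_len(k)) {
    # Members of original cluster i that appear in this replicate.
    members <- intersect(which(base == i), idx)
    # Best match among the clusters found on the replicate.
    scores[b, i] <- max(sapply(seq_len(k),
                               function(j) jaccard(members, idx[boot == j])))
  }
}

stability_index <- colMeans(scores)  # one value per original cluster
classify_stability(stability_index)
```

Each original cluster receives its own index, so a clustering can contain both stable and unstable clusters for the same k.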
stabilityRange(data, k.range = c(2, 15), bs = 100, cbi = "kmeans", getImages = TRUE, seed = NULL)
data |
A |
k.range |
Concatenation of two positive integers.
The first value |
bs |
Positive integer. Number of bootstrap replicates used for the resampling. |
cbi |
Clusterboot interface name (default: "kmeans"):
"kmeans", "clara", "clara_pam", "hclust", "pamk", "pamk_pam".
Any CBI appended with '_pam' makes use of |
getImages |
Boolean. If TRUE, a plot is displayed. |
seed |
Positive integer. A seed for internal bootstrap. |
An ExperimentList containing the stability and cluster measurements for 2 to k clusters.
Milligan GW, Cheng R (1996). “Measuring the influence of individual data points in a cluster analysis.” Journal of Classification, 13(2), 315–335.
Jaccard P (1901). “Distribution de la flore alpine dans le bassin des Dranses et dans quelques regions voisines.” Bull Soc Vaudoise Sci Nat, 37, 241–272.
# Using example data from our package
data("ontMetrics")
result <- stabilityRange(ontMetrics, k.range=c(2,3))