stability {evaluomeR} | R Documentation |
This analysis estimates whether the clustering is meaningfully
affected by small variations in the sample. First, a clustering using the
k-means algorithm is carried out. The value of k
can be provided by the user.
Then, the stability index is computed as the mean of the Jaccard
coefficient values over bs
bootstrap replicates. The values lie in the range [0,1]
and are interpreted as follows:
Unstable: [0, 0.60[.
Doubtful: [0.60, 0.75].
Stable: ]0.75, 0.85].
Highly Stable: ]0.85, 1].
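The procedure described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper names jaccard, classify_stability and bootstrap_stability are hypothetical, plain stats::kmeans is used for every clustering, and each original cluster is scored against its best-matching cluster in every bootstrap replicate, in the spirit of the clusterboot approach.

```r
# Jaccard coefficient between two sets of observation indices.
jaccard <- function(a, b) {
  length(intersect(a, b)) / length(union(a, b))
}

# Map stability indices to the interpretation bands listed above.
# (hypothetical helper, not part of evaluomeR)
classify_stability <- function(index) {
  stopifnot(is.numeric(index), all(index >= 0 & index <= 1))
  ifelse(index < 0.60, "Unstable",
  ifelse(index <= 0.75, "Doubtful",
  ifelse(index <= 0.85, "Stable", "Highly Stable")))
}

# Mean Jaccard coefficient per original cluster over bs bootstrap replicates.
# (hypothetical helper; assumes k-means for both original and resampled data)
bootstrap_stability <- function(x, k, bs = 100, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  x <- as.matrix(x)
  n <- nrow(x)
  original <- kmeans(x, centers = k, nstart = 10)$cluster
  scores <- matrix(NA_real_, nrow = bs, ncol = k)
  for (b in seq_len(bs)) {
    # Resample the observations with replacement and cluster the replicate.
    idx <- sample.int(n, n, replace = TRUE)
    boot <- kmeans(x[idx, , drop = FALSE], centers = k, nstart = 10)$cluster
    for (j in seq_len(k)) {
      # Members of original cluster j that were drawn into this replicate.
      orig_members <- unique(idx[original[idx] == j])
      if (length(orig_members) == 0) next
      # Score cluster j by its best-matching bootstrap cluster.
      scores[b, j] <- max(vapply(seq_len(k), function(m) {
        jaccard(orig_members, unique(idx[boot == m]))
      }, numeric(1)))
    }
  }
  colMeans(scores, na.rm = TRUE)
}

# Toy data: two well-separated groups of 20 points each.
set.seed(1)
toy <- rbind(matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
             matrix(rnorm(40, mean = 5, sd = 0.3), ncol = 2))
toy_index <- bootstrap_stability(toy, k = 2, bs = 20, seed = 42)
toy_label <- classify_stability(toy_index)
```

The nested ifelse calls are used instead of cut() because the bands do not all close on the same side: 0.60 belongs to Doubtful, while 0.75 and 0.85 belong to the lower of their two neighbouring bands.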
stability(data, k = 5, bs = 100, cbi = "kmeans", getImages = TRUE, seed = NULL)
data |
A |
k |
Positive integer. Number of clusters, in the range [2,15]. |
bs |
Positive integer. Number of bootstrap replicates used for the resampling. |
cbi |
Clusterboot interface name (default: "kmeans"):
"kmeans", "clara", "clara_pam", "hclust", "pamk", "pamk_pam".
Any CBI appended with '_pam' makes use of |
getImages |
Boolean. If TRUE, a plot is displayed. |
seed |
Positive integer. A seed for internal bootstrap. |
An ExperimentList
containing the stability and cluster measurements
for k clusters.
Milligan GW, Cheng R (1996). “Measuring the influence of individual data points in a cluster analysis.” Journal of classification, 13(2), 315–335.
Jaccard P (1901). “Distribution de la flore alpine dans le bassin des Dranses et dans quelques regions voisines.” Bull Soc Vaudoise Sci Nat, 37, 241–272.
# Using example data from our package
data("ontMetrics")
result <- stability(ontMetrics, k=6, getImages=TRUE)