Perform CLARANS clustering — clustering

Performs CLARANS clustering in either a hard or a fuzzy way. The function can be called with a predefined dissimilarity metric or with a self-defined distance function.

Usage

clustering_clarans(
  data,
  clusters = 5,
  metric = "euclidean",
  type = "hard",
  num_local = 5,
  max_neighbors = 100,
  cores = 1,
  seed = 1234,
  m = 1.5,
  verbose = 1,
  ...
)

Arguments

data: data.frame to be clustered
clusters: Number of clusters. Defaults to 5.
metric: A character string specifying a predefined dissimilarity metric (such as "euclidean" or "manhattan"), or a self-defined dissimilarity function. Defaults to "euclidean". The value is passed as the method argument to proxy::dist; see ?proxy::dist for full details.
type: One of c("hard","fuzzy"), specifying the type of clustering to be performed. Defaults to "hard".
num_local: Number of clustering iterations. Defaults to 5. (pam or vegclust)
max_neighbors: Maximum number of randomized medoid searches with each cluster. Defaults to 100.
cores: Number of cores used for computation. cores > 1 implies a parallel call. Defaults to 1.
seed: Random number seed. Defaults to 1234.
m: Fuzziness exponent (only used for type = "fuzzy"), which has to be a numeric value of at least 1. Defaults to 1.5.
verbose: Can be set to integers between 0 and 2 to control the level of detail of the printed diagnostic messages. Higher numbers lead to more detailed messages. Defaults to 1.
...: Additional arguments passed to the main clustering algorithm and to proxy::dist for the calculation of the distance matrix (pam or vegclust)
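
The arguments above can be combined as in the following minimal sketch. It assumes that clustering_clarans is exported by a package named fuzzyclara (an assumption based on the class of the returned object) and that a self-defined metric is a function of two numeric vectors, which is the form proxy::dist accepts. The names dist_manhattan and toy are hypothetical and only serve the illustration.

```r
# Hypothetical self-defined dissimilarity: Manhattan distance between two
# numeric vectors, written as a two-argument function for proxy::dist.
dist_manhattan <- function(x, y) {
  sum(abs(x - y))
}

# Small illustrative data set with two well-separated groups.
set.seed(1)
toy <- data.frame(
  a = c(rnorm(20, mean = 0), rnorm(20, mean = 5)),
  b = c(rnorm(20, mean = 0), rnorm(20, mean = 5))
)

# Assumption: the function lives in (and is exported by) 'fuzzyclara'.
has_clarans <- requireNamespace("fuzzyclara", quietly = TRUE) &&
  "clustering_clarans" %in% getNamespaceExports("fuzzyclara")

if (has_clarans) {
  # Hard clustering with a predefined metric.
  res_hard <- fuzzyclara::clustering_clarans(
    data = toy, clusters = 2, metric = "manhattan", type = "hard",
    num_local = 2, max_neighbors = 20, seed = 1234, verbose = 0
  )

  # Fuzzy clustering with the self-defined function and fuzziness exponent m.
  res_fuzzy <- fuzzyclara::clustering_clarans(
    data = toy, clusters = 2, metric = dist_manhattan, type = "fuzzy",
    num_local = 2, max_neighbors = 20, seed = 1234, m = 1.5, verbose = 0
  )

  # Both results are documented to be objects of class 'fuzzyclara'.
  print(class(res_hard))
  print(class(res_fuzzy))
}
```

Small values of num_local and max_neighbors are used here only to keep the sketch fast; they are not recommended settings.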

Value

An object of class fuzzyclara.

Details

If the clustering is run on multiple cores, the verbose messages are written to the file clustering_progress.log (if verbose > 0).
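
A parallel call and a look at the progress file might be sketched as follows. This is a rough illustration under assumptions: the function is taken to be exported by a package named fuzzyclara, the log file is assumed to be created in the current working directory, and read_progress is a hypothetical helper that is not part of the package.

```r
# Hypothetical helper: return the lines of a progress file, or an empty
# character vector if the file does not exist.
read_progress <- function(path = "clustering_progress.log") {
  if (file.exists(path)) readLines(path) else character(0)
}

# Assumption: the function lives in (and is exported by) 'fuzzyclara'.
has_clarans <- requireNamespace("fuzzyclara", quietly = TRUE) &&
  "clustering_clarans" %in% getNamespaceExports("fuzzyclara")

if (has_clarans) {
  set.seed(1)
  toy <- data.frame(a = rnorm(40), b = rnorm(40))

  # cores > 1 triggers a parallel call; with verbose > 0 the messages go
  # to clustering_progress.log instead of the console.
  res <- fuzzyclara::clustering_clarans(
    data = toy, clusters = 2, cores = 2, verbose = 1
  )

  # Print whatever progress messages were logged (assumed location).
  writeLines(read_progress())
}
```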

References

Ng, R. T., and Han, J. (2002). CLARANS: A method for clustering objects for spatial data mining. IEEE Transactions on Knowledge and Data Engineering, 14(5), 1003–1016. doi:10.1109/tkde.2002.1033770.