
Performs one local iteration of the CLARANS clustering algorithm, producing either a hard or a fuzzy clustering. The dissimilarity can be specified as a common predefined metric or as a self-defined distance function.

Usage

clustering_local(
  data,
  sample_local,
  clusters = 5,
  metric = "euclidean",
  max_neighbors = 100,
  type = "hard",
  m = 1.5,
  verbose = 1,
  verbose_toLogFile = FALSE,
  ...
)

Arguments

data

data.frame to be clustered

sample_local

list containing information on pairs of medoids and non-medoids tested for swapping as well as starting medoids for the algorithm

clusters

Number of clusters. Defaults to 5.

metric

A character specifying a predefined dissimilarity metric (like "euclidean" or "manhattan") or a self-defined dissimilarity function. Defaults to "euclidean". The value is passed as the method argument to proxy::dist, so see ?proxy::dist for full details.
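As an illustration of what a self-defined dissimilarity function can look like, the following sketch writes Manhattan distance by hand and hands it to proxy::dist as method. The name my_manhattan and the toy matrix x are hypothetical, and the sketch only shows the proxy::dist side; it does not call clustering_local itself.

```r
# Hypothetical self-defined dissimilarity: takes two numeric vectors
# (two observations) and returns a single non-negative number.
my_manhattan <- function(a, b) sum(abs(a - b))

# Toy data: three observations with two variables each (assumed example).
x <- matrix(c(0, 0,
              3, 4,
              1, 1), ncol = 2, byrow = TRUE)

# proxy::dist accepts a function as 'method'; guarded in case the
# proxy package is not installed.
if (requireNamespace("proxy", quietly = TRUE)) {
  d_custom <- proxy::dist(x, method = my_manhattan)
  print(as.matrix(d_custom))
}
```

A function of this shape could then presumably be supplied as metric in place of a character string such as "euclidean".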

max_neighbors

Maximum number of randomized medoid searches with each cluster (only if algorithm = "clarans"). Defaults to 100.
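To make the role of a neighbor limit concrete, here is a minimal base R sketch of a CLARANS-style local search in the spirit of Ng and Han (2002). It is not the package's implementation: the helpers total_cost and local_search are hypothetical, the swap candidates are drawn at random inside the loop rather than taken from sample_local, and resetting the counter after an improving swap is an assumption of this sketch.

```r
# Cost of a medoid set: each observation contributes its distance
# to the nearest medoid.
total_cost <- function(dmat, medoids) {
  sum(apply(dmat[, medoids, drop = FALSE], 1, min))
}

# Hypothetical local search: try random medoid/non-medoid swaps and stop
# after max_neighbors consecutive swaps that bring no improvement.
local_search <- function(dmat, medoids, max_neighbors = 100) {
  best_cost <- total_cost(dmat, medoids)
  tried <- 0
  while (tried < max_neighbors) {
    non_medoids <- setdiff(seq_len(nrow(dmat)), medoids)
    out_pos <- sample.int(length(medoids), 1)                   # medoid to drop
    in_obs  <- non_medoids[sample.int(length(non_medoids), 1)]  # replacement
    candidate <- medoids
    candidate[out_pos] <- in_obs
    cand_cost <- total_cost(dmat, candidate)
    if (cand_cost < best_cost) {
      medoids   <- candidate   # accept the improving swap
      best_cost <- cand_cost
      tried     <- 0           # assumption: counter restarts after a move
    } else {
      tried <- tried + 1
    }
  }
  list(medoids = medoids, cost = best_cost)
}

# Toy data: two well-separated groups of 20 points each.
set.seed(1)
pts  <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
              matrix(rnorm(40, mean = 5), ncol = 2))
dmat <- as.matrix(dist(pts))
res  <- local_search(dmat, medoids = c(1, 2), max_neighbors = 50)
```

A larger max_neighbors makes the search more thorough at the price of more cost evaluations.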

type

One of c("hard","fuzzy"), specifying the type of clustering to be performed. Defaults to "hard".

m

Fuzziness exponent (only for type = "fuzzy"), which has to be a numeric of minimum 1. Defaults to 1.5.
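The effect of the fuzziness exponent can be sketched with the standard fuzzy c-means membership formula, in which an observation's membership in a cluster is proportional to its distance raised to the power -2 / (m - 1). This is an assumption made for illustration: the helper fuzzy_memberships is hypothetical, the computation actually performed by the package (via vegclust) may differ, and the formula as written requires m > 1 and strictly positive distances.

```r
# Hypothetical helper: rows of d are observations, columns are clusters,
# entries are distances to the cluster medoids.
fuzzy_memberships <- function(d, m) {
  stopifnot(m > 1, all(d > 0))  # the exponent below is undefined for m = 1
  w <- d^(-2 / (m - 1))         # inverse-distance weights
  w / rowSums(w)                # normalise so that each row sums to 1
}

# Toy distances of three observations to two medoids (assumed example).
d <- matrix(c(1, 2,
              2, 2,
              4, 1), ncol = 2, byrow = TRUE)

u_sharp <- fuzzy_memberships(d, m = 1.5)  # close to a hard assignment
u_soft  <- fuzzy_memberships(d, m = 3)    # memberships spread more evenly
```

Values of m close to 1 push the memberships toward a hard assignment, while larger values distribute them more evenly across clusters.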

verbose

Can be set to integers between 0 and 2 to control the level of detail of the printed diagnostic messages. Higher numbers lead to more detailed messages. Defaults to 1.

verbose_toLogFile

If TRUE, the diagnostic messages are printed to a log file clustering_progress.log. Defaults to FALSE.

...

Additional arguments passed to the main clustering algorithm (pam or vegclust)

Value

Clustering solution for data sample

References

Ng, R. T., and Han, J. (2002). CLARANS: A method for clustering objects for spatial data mining. IEEE Transactions on Knowledge and Data Engineering, 14(5), 1003–1016. doi:10.1109/tkde.2002.1033770.