R. A. Jarvis and E. A. Patrick. 1973. Clustering Using a Similarity Measure Based on Shared Near Neighbors. IEEE Trans. Comput. 22, 11 (November 1973), 1025-1034. DOI=http://dx.doi.org/10.1109/T-C.1973.223640
- Step 1: For each point of the data set, list the k nearest neighbors.
- Step 2: Set up an integer label table of length n, with each entry initially set to the first entry of the corresponding neighborhood row.
- Step 3: All possible pairs of neighborhood rows are tested in the following manner. Replace both label entries by the smaller of the two existing entries if both zeroth neighbors (the points being tested) are found in both neighborhood rows and at least kt neighbor matches exist between the two rows (kt is referred to as the similarity threshold). Also, replace all appearances of the higher label (throughout the entire label table) with the lower label if the above test is successful.
- Step 4: The clusters under the k, kt selections are now indicated by identical labeling of the points belonging to the clusters.
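The four steps above can be sketched in base R as follows. This is a minimal illustration, not the paper's reference implementation. The function name `jp_sketch` and its argument names are hypothetical. It finds neighbors by brute force with `dist()` so that it runs without extra packages. It also assumes that neighbor matches are counted over the full neighborhood rows, including the two points being tested.

```r
# Hypothetical helper illustrating Steps 1-4; quadratic in n, for small data only.
jp_sketch <- function(x, k, kt) {
  x <- as.matrix(x)
  n <- nrow(x)
  stopifnot(k >= 1, k < n)

  # Step 1: neighborhood rows. Column 1 holds the point itself (the
  # "zeroth neighbor"), followed by the indices of its k nearest neighbors.
  d <- as.matrix(dist(x))
  diag(d) <- Inf                      # exclude the point from its own neighbor search
  nn <- matrix(0L, nrow = n, ncol = k + 1)
  for (i in seq_len(n)) {
    nn[i, ] <- c(i, order(d[i, ])[seq_len(k)])
  }

  # Step 2: label table, initialised to the first entry of each row.
  labels <- nn[, 1]

  # Step 3: test all pairs of rows and merge labels when the test succeeds.
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      mutual <- (i %in% nn[j, ]) && (j %in% nn[i, ])
      # Assumption: matches are counted over the full rows (points included).
      shared <- length(intersect(nn[i, ], nn[j, ]))
      if (mutual && shared >= kt) {
        lo <- min(labels[i], labels[j])
        hi <- max(labels[i], labels[j])
        labels[labels == hi] <- lo    # replace every appearance of the higher label
      }
    }
  }

  # Step 4: identical labels indicate clusters; renumber them 1, 2, ...
  as.integer(factor(labels))
}

# Usage on two well-separated groups of three points on a line.
toy <- matrix(c(0, 1, 2, 100, 101, 102), ncol = 1)
jp_sketch(toy, k = 2, kt = 3)
# returns 1 1 1 2 2 2
```

The pairwise loop makes the cost grow quadratically with the number of points, so this sketch is only meant to make the label-merging rule concrete on small inputs.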
library("dbscan") # for kNN search
library("mlbench") # for data
Create Spirals Data
Spirals <- mlbench.spirals(500, 1, 0.05)
x <- Spirals$x
Alternatively, try clusters with different densities
# x <- rbind(
# matrix(rnorm(n = 2*100, mean = -1, sd = .2), ncol = 2),
# matrix(rnorm(n = 2*200, mean = 1, sd = 1), ncol = 2)