Clusters

Clustering Algorithms Comparison

Compare different clustering algorithms on classic datasets. The datasets shown here are commonly used to demonstrate the strengths and weaknesses of different clustering approaches.

Algorithm Comparison Notes

K-Means

Assumes roughly spherical clusters of similar size, so it struggles with non-convex shapes such as rings or crescents. The number of clusters must be chosen in advance.

DBSCAN

Handles non-convex shapes well and labels outliers as noise. Results depend on two parameters that need tuning: the neighborhood radius (eps) and the minimum number of neighbors (minSamples).

OPTICS

Extension of DBSCAN that handles clusters of varying density. It orders the points and records a reachability distance for each one; clusters are then extracted from the resulting reachability plot.

Mean Shift

Finds clusters by shifting points towards the modes (density peaks) of the data distribution. The number of clusters is not specified in advance; it follows from the chosen bandwidth.

HDBSCAN

Hierarchical extension of DBSCAN that works well with varying densities and hierarchical cluster structures.

Clusters.KMeans

constructor(n_clusters: number = 2, opt_ratio: number = 0.05, initCenters?: number[][], max_iter: number = 30)

Parameter	Type	Default
n_clusters	number	2
opt_ratio	number	0.05
initCenters	number[][]	undefined
max_iter	number	30

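// Four 2-D samples, one row per sample.
const X = [
    [0, 0],
    [0.5, 0],
    [0.5, 1],
    [1, 1],
];
// One weight per sample, in the same order as the rows of X.
const sampleWeights = [3, 1, 1, 3];
// Explicit starting centers, one per cluster.
const initCenters = [[0, 0], [1, 1]];

// n_clusters = 2, opt_ratio = 0.05; max_iter keeps its default of 30.
const kmeans = new KMeans(2, 0.05, initCenters);

// Fit on X using the given sample weights.
const result = kmeans.fitPredict(X, sampleWeights);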
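
The example above passes per-sample weights, but this page does not say how they enter the computation. A common convention, assumed here, is that each center moves to the weighted mean of its assigned points. The sketch below illustrates that idea only; `weightedKMeans` and `squaredDistance` are hypothetical names, not part of the library, and `opt_ratio` is not modelled.

```typescript
// Hypothetical helper: squared Euclidean distance between two points.
function squaredDistance(a: number[], b: number[]): number {
    let sum = 0;
    for (let i = 0; i < a.length; i++) {
        const diff = a[i] - b[i];
        sum += diff * diff;
    }
    return sum;
}

// Assumed weighted Lloyd loop: assign each point to its nearest center,
// then move each center to the weighted mean of its assigned points.
function weightedKMeans(
    X: number[][],
    weights: number[],
    initCenters: number[][],
    maxIter = 30
): { labels: number[]; centers: number[][] } {
    let centers = initCenters.map((c) => c.slice());
    let labels: number[] = X.map(() => -1);

    for (let iter = 0; iter < maxIter; iter++) {
        // Assignment step.
        const newLabels = X.map((x) => {
            let best = 0;
            for (let k = 1; k < centers.length; k++) {
                if (squaredDistance(x, centers[k]) < squaredDistance(x, centers[best])) best = k;
            }
            return best;
        });

        // Update step: weighted mean per cluster (empty clusters keep their center).
        const sums = centers.map((c) => c.map(() => 0));
        const totals = centers.map(() => 0);
        X.forEach((x, i) => {
            const k = newLabels[i];
            totals[k] += weights[i];
            x.forEach((v, d) => {
                sums[k][d] += weights[i] * v;
            });
        });
        centers = centers.map((c, k) =>
            totals[k] > 0 ? sums[k].map((s) => s / totals[k]) : c
        );

        // Stop once the assignments no longer change.
        const unchanged = newLabels.every((l, i) => l === labels[i]);
        labels = newLabels;
        if (unchanged) break;
    }
    return { labels, centers };
}

const weighted = weightedKMeans(
    [[0, 0], [0.5, 0], [0.5, 1], [1, 1]],
    [3, 1, 1, 3],
    [[0, 0], [1, 1]]
);
console.log(weighted.labels, weighted.centers);
// labels: [0, 0, 1, 1]; centers: [[0.125, 0], [0.875, 1]]
```

With weights of 3 on the two outer points, each center ends up closer to the heavily weighted sample than the unweighted midpoint (0.25) would be.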

Clusters.DBScan

constructor(eps: number = 0.5, minSamples: number = 5, distanceType: Distance.IDistanceType = 'euclidiean')

fitPredict(samplesX: number[][]): number[] returns cluster labels for samples. Noise points are marked as -1.

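// makeCircles is a dataset helper assumed to be in scope; it is not documented on this page.
const X = makeCircles(20, 20, 1, 5);
// eps = 0.6, minSamples = 3; distanceType keeps its default.
const dbscan = new DBScan(0.6, 3);
const labels = dbscan.fitPredict(X);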
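
To make the roles of eps and minSamples concrete, here is a compact sketch of the textbook DBSCAN procedure. It is not the library's `DBScan` class: `dbscanSketch` and `euclidean` are hypothetical names, only Euclidean distance is supported, and the point itself is assumed to count towards minSamples.

```typescript
// Hypothetical helper: Euclidean distance between two points.
function euclidean(a: number[], b: number[]): number {
    let sum = 0;
    for (let i = 0; i < a.length; i++) {
        const diff = a[i] - b[i];
        sum += diff * diff;
    }
    return Math.sqrt(sum);
}

function dbscanSketch(X: number[][], eps: number, minSamples: number): number[] {
    const UNVISITED = -2;
    const NOISE = -1;
    const labels: number[] = X.map(() => UNVISITED);

    // Indices of all points within eps of point i (the point itself included).
    const neighbors = (i: number): number[] => {
        const out: number[] = [];
        for (let j = 0; j < X.length; j++) {
            if (euclidean(X[i], X[j]) <= eps) out.push(j);
        }
        return out;
    };

    let cluster = 0;
    for (let i = 0; i < X.length; i++) {
        if (labels[i] !== UNVISITED) continue;
        const seeds = neighbors(i);
        if (seeds.length < minSamples) {
            labels[i] = NOISE; // may still be claimed later as a border point
            continue;
        }
        labels[i] = cluster;
        // Grow the cluster: the seed list gets longer whenever a core point is reached.
        for (let q = 0; q < seeds.length; q++) {
            const j = seeds[q];
            if (labels[j] === NOISE) labels[j] = cluster; // border point
            if (labels[j] !== UNVISITED) continue;
            labels[j] = cluster;
            const more = neighbors(j);
            if (more.length >= minSamples) {
                for (const m of more) seeds.push(m);
            }
        }
        cluster++;
    }
    return labels;
}

// Two tight groups and one isolated point.
const samples = [[0, 0], [0, 0.1], [0.1, 0], [5, 5], [5, 5.1], [5.1, 5], [10, 0]];
const sketchLabels = dbscanSketch(samples, 0.5, 3);
console.log(sketchLabels); // [0, 0, 0, 1, 1, 1, -1]
```

The isolated point has fewer than minSamples neighbors within eps and is not reachable from any core point, so it keeps the noise label -1.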

Clusters.HDBScan

constructor(
    min_cluster_size: number = 5,
    min_samples: number | null = null,
    cluster_selection_epsilon: number = 0.5,
    metric: Distance.IDistanceType = 'euclidiean'
)

fitPredict(samplesX: number[][]): number[] returns cluster labels. Noise points are marked as -1.

This is a simplified implementation that internally calls DBSCAN using cluster_selection_epsilon as the eps parameter.

const hdb = new HDBScan(5, null, 0.6);
const labels = hdb.fitPredict(X);

Clusters.MeanShift

constructor(
    bandwidth: number = 1,
    max_iter: number = 300,
    distanceType: Distance.IDistanceType = 'euclidiean'
)

Methods:

fitPredict(samplesX: number[][]): number[]
getCentroids(): number[][]

const ms = new MeanShift(2);
const labels = ms.fitPredict(X);
const centers = ms.getCentroids();
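
The effect of bandwidth can be seen in a small flat-kernel sketch: every sample is repeatedly moved to the mean of the data within one bandwidth of it, and samples that converge to nearby modes share a label. This is an illustration under those assumptions, not the library's `MeanShift` implementation; `meanShiftSketch` and `distance` are hypothetical names.

```typescript
// Hypothetical helper: Euclidean distance between two points.
function distance(a: number[], b: number[]): number {
    let sum = 0;
    for (let i = 0; i < a.length; i++) {
        const diff = a[i] - b[i];
        sum += diff * diff;
    }
    return Math.sqrt(sum);
}

function meanShiftSketch(
    X: number[][],
    bandwidth: number,
    maxIter = 300
): { labels: number[]; centroids: number[][] } {
    // Move a copy of every sample towards the local mean until it stops at a mode.
    const modes = X.map((start) => {
        let p = start.slice();
        for (let iter = 0; iter < maxIter; iter++) {
            const near = X.filter((x) => distance(x, p) <= bandwidth);
            if (near.length === 0) break;
            const mean = p.map((_, d) => near.reduce((s, x) => s + x[d], 0) / near.length);
            const moved = distance(mean, p);
            p = mean;
            if (moved < 1e-6) break;
        }
        return p;
    });

    // Modes that landed within one bandwidth of an existing centroid reuse its label.
    const centroids: number[][] = [];
    const labels = modes.map((m) => {
        for (let k = 0; k < centroids.length; k++) {
            if (distance(centroids[k], m) < bandwidth) return k;
        }
        centroids.push(m);
        return centroids.length - 1;
    });
    return { labels, centroids };
}

const blobs = [[0, 0], [0.2, 0], [0, 0.2], [5, 5], [5.2, 5], [5, 5.2]];
const shifted = meanShiftSketch(blobs, 1);
console.log(shifted.labels, shifted.centroids);
// labels: [0, 0, 0, 1, 1, 1]; two centroids, one per blob
```

A larger bandwidth merges more modes and yields fewer clusters; a smaller one splits the data into more clusters.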

Clusters.OPTICS

interface OPTICSOptions {
    min_samples?: number;
    max_eps?: number;
    metric?: Distance.IDistanceType;
    p?: number;
    eps?: number;
}
constructor(options: OPTICSOptions = {})

fitPredict(samplesX: number[][]): number[] returns cluster labels. Noise points are marked as -1.

const optics = new OPTICS({ eps: 0.5, min_samples: 5 });
const labels = optics.fitPredict(X);
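
The reachability plot mentioned in the comparison notes can be illustrated with a short sketch of the ordering step. It follows the usual textbook formulation (core distance is the distance to the min_samples-th nearest neighbor, counting the point itself) and is not taken from the library's source; `opticsOrdering` and `pointDistance` are hypothetical names, and cluster extraction is left out.

```typescript
// Hypothetical helper: Euclidean distance between two points.
function pointDistance(a: number[], b: number[]): number {
    let sum = 0;
    for (let i = 0; i < a.length; i++) {
        const diff = a[i] - b[i];
        sum += diff * diff;
    }
    return Math.sqrt(sum);
}

function opticsOrdering(
    X: number[][],
    minSamples: number,
    maxEps = Infinity
): { order: number[]; reachability: number[] } {
    const n = X.length;
    const reach: number[] = X.map(() => Infinity);
    const processed: boolean[] = X.map(() => false);
    const order: number[] = [];

    // Core distance: distance to the minSamples-th nearest neighbor (self included),
    // or Infinity when that neighbor lies beyond maxEps.
    const core = X.map((p) => {
        const sorted = X.map((q) => pointDistance(p, q)).sort((a, b) => a - b);
        const c = sorted.length >= minSamples ? sorted[minSamples - 1] : Infinity;
        return c <= maxEps ? c : Infinity;
    });

    for (let step = 0; step < n; step++) {
        // Next point: the unprocessed one with the smallest reachability so far.
        let p = -1;
        for (let i = 0; i < n; i++) {
            if (!processed[i] && (p === -1 || reach[i] < reach[p])) p = i;
        }
        processed[p] = true;
        order.push(p);
        if (core[p] === Infinity) continue; // not a core point, nothing to update

        // Lower the reachability of unprocessed neighbors reached through p.
        for (let o = 0; o < n; o++) {
            if (processed[o]) continue;
            const d = pointDistance(X[p], X[o]);
            if (d > maxEps) continue;
            const candidate = Math.max(core[p], d);
            if (candidate < reach[o]) reach[o] = candidate;
        }
    }
    return { order, reachability: order.map((i) => reach[i]) };
}

// Two groups on a line, separated by a gap of 8.
const line = [[0, 0], [1, 0], [2, 0], [10, 0], [11, 0], [12, 0]];
const plot = opticsOrdering(line, 2);
console.log(plot.order, plot.reachability);
// order: [0, 1, 2, 3, 4, 5]; reachability: [Infinity, 1, 1, 8, 1, 1]
```

Plotting reachability in this order gives valleys for dense regions and a peak (here, 8) at the jump between them, which is what cluster extraction works from.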

Clusters.kmeansPlusPlus

kmeansPlusPlus(
    X: number[][],
    n_clusters: number,
    sampleWeight?: number[],
    randomState: () => number = Math.random
): { centers: number[][]; indices: number[] }

This utility initializes cluster centers using the k-means++ strategy.

const { centers } = kmeansPlusPlus(X, 3);
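
For readers unfamiliar with the strategy, the sketch below shows the basic k-means++ seeding rule: the first center is drawn uniformly, and each later center is drawn with probability proportional to its squared distance from the nearest center chosen so far. It is a simplified illustration, not the library function: `kmeansPlusPlusSketch` and `sqDistance` are hypothetical names, and sample weights are ignored.

```typescript
// Hypothetical helper: squared Euclidean distance between two points.
function sqDistance(a: number[], b: number[]): number {
    let sum = 0;
    for (let i = 0; i < a.length; i++) {
        const diff = a[i] - b[i];
        sum += diff * diff;
    }
    return sum;
}

function kmeansPlusPlusSketch(
    X: number[][],
    nClusters: number,
    random: () => number = Math.random
): { centers: number[][]; indices: number[] } {
    // First center: a uniformly random sample.
    const first = Math.min(X.length - 1, Math.floor(random() * X.length));
    const indices: number[] = [first];
    // closest[i] is the squared distance from X[i] to its nearest chosen center.
    const closest = X.map((x) => sqDistance(x, X[first]));

    while (indices.length < nClusters) {
        const total = closest.reduce((s, d) => s + d, 0);
        // Draw the next index with probability proportional to closest[i].
        let r = random() * total;
        let next = X.length - 1;
        for (let i = 0; i < X.length; i++) {
            r -= closest[i];
            if (r < 0) {
                next = i;
                break;
            }
        }
        indices.push(next);
        X.forEach((x, i) => {
            closest[i] = Math.min(closest[i], sqDistance(x, X[next]));
        });
    }
    return { centers: indices.map((i) => X[i].slice()), indices };
}

// A fixed "random" source makes the run reproducible.
const seeded = kmeansPlusPlusSketch([[0, 0], [0, 1], [10, 10], [10, 11]], 2, () => 0.5);
console.log(seeded.indices, seeded.centers);
// indices: [2, 0]; centers: [[10, 10], [0, 0]]
```

Because far-away points are favored, the second center lands in the other group rather than next to the first one, which is the point of the strategy.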
