Clusters
API reference for Clusters
Clustering Algorithms Comparison
Compare different clustering algorithms on classic datasets. The datasets shown here are commonly used to demonstrate the strengths and weaknesses of different clustering approaches.
Algorithm Comparison Notes
K-Means
Assumes roughly spherical (convex, isotropic) clusters, so it struggles with non-convex shapes. Works well when clusters are compact and similar in size.
DBSCAN
Handles non-convex shapes well and labels outliers as noise. Requires tuning the neighborhood radius (eps) and the minimum number of neighbors (minSamples).
OPTICS
Extension of DBSCAN that works with varying densities. Creates a reachability plot for cluster extraction.
Mean Shift
Finds clusters by iteratively shifting points towards the modes (density peaks) of the data distribution. The number of clusters is not specified up front; it follows from the chosen bandwidth.
HDBSCAN
Hierarchical extension of DBSCAN that works well with varying densities and hierarchical cluster structures.
Clusters.KMeans
constructor(n_clusters: number = 2, opt_ratio: number = 0.05, initCenters?: number[][], max_iter: number = 30)
props
name | type | default value |
---|---|---|
n_clusters | number | 2 |
opt_ratio | number | 0.05 |
initCenters | number[][] | undefined |
max_iter | number | 30 |
const X = [
[0, 0],
[0.5, 0],
[0.5, 1],
[1, 1],
];
const sampleWeights = [3, 1, 1, 3];
const initCenters = [[0, 0], [1, 1]];
const kmeans = new KMeans(2, 0.05, initCenters);
const result = kmeans.fitPredict(X, sampleWeights);
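To make the role of initCenters and the sample weights concrete, here is a minimal sketch of weighted Lloyd iterations. It is not the library's KMeans source: the names kmeansSketch and squaredDistance are hypothetical, opt_ratio is not modeled, and the stopping rule (stop when assignments no longer change) is an assumption.

```typescript
// Illustrative sketch of weighted Lloyd iterations; not the library's KMeans source.
function squaredDistance(a: number[], b: number[]): number {
  let sum = 0;
  for (let d = 0; d < a.length; d++) {
    const diff = a[d] - b[d];
    sum += diff * diff;
  }
  return sum;
}

function kmeansSketch(
  X: number[][],
  initCenters: number[][],
  sampleWeights: number[] = X.map(() => 1),
  maxIter: number = 30
): { labels: number[]; centers: number[][] } {
  let centers = initCenters.map((c) => [...c]);
  let labels: number[] = X.map(() => -1);

  for (let iter = 0; iter < maxIter; iter++) {
    // Assignment step: each sample joins its nearest center.
    const nextLabels = X.map((x) => {
      let best = 0;
      for (let k = 1; k < centers.length; k++) {
        if (squaredDistance(x, centers[k]) < squaredDistance(x, centers[best])) best = k;
      }
      return best;
    });
    const changed = nextLabels.some((label, i) => label !== labels[i]);
    labels = nextLabels;
    if (!changed) break; // assignments are stable, so the centers are too

    // Update step: each center moves to the weighted mean of its members.
    centers = centers.map((oldCenter, k) => {
      const sum: number[] = oldCenter.map(() => 0);
      let totalWeight = 0;
      X.forEach((x, i) => {
        if (labels[i] !== k) return;
        totalWeight += sampleWeights[i];
        x.forEach((value, d) => {
          sum[d] += sampleWeights[i] * value;
        });
      });
      // An empty cluster keeps its previous center.
      return totalWeight > 0 ? sum.map((s) => s / totalWeight) : oldCenter;
    });
  }
  return { labels, centers };
}

const sketchResult = kmeansSketch(
  [[0, 0], [0.5, 0], [0.5, 1], [1, 1]],
  [[0, 0], [1, 1]],
  [3, 1, 1, 3]
);
console.log(sketchResult.labels);  // [0, 0, 1, 1]
console.log(sketchResult.centers); // [[0.125, 0], [0.875, 1]]
```

The heavier weights on the two outer points pull each center towards them, which is why the centers land at x = 0.125 and x = 0.875 rather than at 0.25 and 0.75.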
Clusters.DBScan
constructor(eps: number = 0.5, minSamples: number = 5, distanceType: Distance.IDistanceType = 'euclidiean')
fitPredict(samplesX: number[][]): number[]
Returns cluster labels for the samples. Noise points are marked as -1.
const X = makeCircles(20, 20, 1, 5);
const dbscan = new DBScan(0.6, 3);
const labels = dbscan.fitPredict(X);
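The following sketch shows how eps and the minimum sample count interact in a textbook DBSCAN, and where the -1 noise label comes from. It is a simplified illustration, not the library's DBScan source: dbscanSketch is a hypothetical name, only Euclidean distance is handled, and counting a point as its own neighbor is an assumption.

```typescript
// Illustrative textbook DBSCAN; not the library's DBScan source.
function euclideanDistance(a: number[], b: number[]): number {
  let sum = 0;
  for (let d = 0; d < a.length; d++) {
    const diff = a[d] - b[d];
    sum += diff * diff;
  }
  return Math.sqrt(sum);
}

function dbscanSketch(X: number[][], eps: number, minSamples: number): number[] {
  const UNVISITED = -2;
  const NOISE = -1;
  const labels: number[] = X.map(() => UNVISITED);

  // Indices of all points within eps of point i (assumption: i counts as its own neighbor).
  const neighborsOf = (i: number): number[] => {
    const result: number[] = [];
    for (let j = 0; j < X.length; j++) {
      if (euclideanDistance(X[i], X[j]) <= eps) result.push(j);
    }
    return result;
  };

  let cluster = 0;
  for (let i = 0; i < X.length; i++) {
    if (labels[i] !== UNVISITED) continue;
    const seeds = neighborsOf(i);
    if (seeds.length < minSamples) {
      labels[i] = NOISE; // may still be claimed later as a border point
      continue;
    }
    labels[i] = cluster;
    // Grow the cluster outwards from the core point i.
    const queue = [...seeds];
    while (queue.length > 0) {
      const j = queue.shift()!;
      if (labels[j] === NOISE) labels[j] = cluster; // border point: joins but does not expand
      if (labels[j] !== UNVISITED) continue;
      labels[j] = cluster;
      const reachable = neighborsOf(j);
      if (reachable.length >= minSamples) queue.push(...reachable); // j is itself a core point
    }
    cluster++;
  }
  return labels;
}

const sketchPoints = [
  [0, 0], [0, 0.1], [0.1, 0], [0.1, 0.1], // dense group A
  [5, 5], [5, 5.1], [5.1, 5], [5.1, 5.1], // dense group B
  [10, 0],                                // isolated point
];
console.log(dbscanSketch(sketchPoints, 0.3, 3)); // [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

Raising eps far enough (for example to 8) merges both groups into a single cluster, which is why the parameter usually needs tuning per dataset.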
Clusters.HDBScan
constructor(
min_cluster_size: number = 5,
min_samples: number | null = null,
cluster_selection_epsilon: number = 0.5,
metric: Distance.IDistanceType = 'euclidiean'
)
fitPredict(samplesX: number[][]): number[]
Returns cluster labels. Noise points are marked as -1.
This is a simplified implementation that internally calls DBSCAN using cluster_selection_epsilon as the eps parameter.
const hdb = new HDBScan(5, null, 0.6);
const labels = hdb.fitPredict(X);
Clusters.MeanShift
constructor(
bandwidth: number = 1,
max_iter: number = 300,
distanceType: Distance.IDistanceType = 'euclidiean'
)
Methods:
fitPredict(samplesX: number[][]): number[]
getCentroids(): number[][]
const ms = new MeanShift(2);
const labels = ms.fitPredict(X);
const centers = ms.getCentroids();
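As a rough illustration of what the bandwidth controls, the sketch below runs a flat-kernel mean shift: each point repeatedly moves to the mean of the samples within bandwidth of it, and converged positions that end up close together are merged into one centroid. This is an assumption-laden sketch, not the library's MeanShift source; meanShiftSketch, the flat kernel, the 1e-6 convergence threshold, and the merge rule are all hypothetical choices.

```typescript
// Illustrative flat-kernel mean shift; not the library's MeanShift source.
function meanShiftSketch(
  X: number[][],
  bandwidth: number,
  maxIter: number = 300
): { labels: number[]; centroids: number[][] } {
  const dist = (a: number[], b: number[]): number => {
    let sum = 0;
    for (let d = 0; d < a.length; d++) sum += (a[d] - b[d]) * (a[d] - b[d]);
    return Math.sqrt(sum);
  };
  const mean = (points: number[][]): number[] =>
    points[0].map((_, d) => points.reduce((s, p) => s + p[d], 0) / points.length);

  // Shift every sample towards the local mode of the data.
  const modes = X.map((start) => {
    let current = [...start];
    for (let iter = 0; iter < maxIter; iter++) {
      const windowPoints = X.filter((x) => dist(x, current) <= bandwidth);
      if (windowPoints.length === 0) break;
      const next = mean(windowPoints);
      const shift = dist(next, current);
      current = next;
      if (shift < 1e-6) break; // assumed convergence threshold
    }
    return current;
  });

  // Merge modes that converged to (nearly) the same place; each sample gets its mode's index.
  const centroids: number[][] = [];
  const labels = modes.map((mode) => {
    let k = centroids.findIndex((c) => dist(c, mode) < bandwidth);
    if (k === -1) {
      centroids.push(mode);
      k = centroids.length - 1;
    }
    return k;
  });
  return { labels, centroids };
}

const shiftResult = meanShiftSketch(
  [[1, 1], [1.2, 0.9], [0.9, 1.1], [8, 8], [8.1, 7.9], [7.9, 8.2]],
  2
);
console.log(shiftResult.labels);           // [0, 0, 0, 1, 1, 1]
console.log(shiftResult.centroids.length); // 2
```

A larger bandwidth widens each window and tends to produce fewer clusters; a bandwidth large enough to span both groups would collapse them into one.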
Clusters.OPTICS
interface OPTICSOptions {
min_samples?: number;
max_eps?: number;
metric?: Distance.IDistanceType;
p?: number;
eps?: number;
}
constructor(options: OPTICSOptions = {})
fitPredict(samplesX: number[][]): number[]
Returns cluster labels. Noise points are marked as -1.
const optics = new OPTICS({ eps: 0.5, min_samples: 5 });
const labels = optics.fitPredict(X);
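The reachability plot that OPTICS builds can be sketched as follows. The function computes a processing order and a reachability distance per sample; a jump in reachability along that order marks a boundary between clusters. This is a simplified O(n²) illustration under stated assumptions, not the library's OPTICS source: opticsOrderingSketch is a hypothetical name, only Euclidean distance is used, the point itself counts towards minSamples, and label extraction is left out.

```typescript
// Illustrative OPTICS ordering and reachability; not the library's OPTICS source.
function opticsOrderingSketch(
  X: number[][],
  minSamples: number,
  maxEps: number = Infinity
): { ordering: number[]; reachability: number[] } {
  const dist = (a: number[], b: number[]): number => {
    let sum = 0;
    for (let d = 0; d < a.length; d++) sum += (a[d] - b[d]) * (a[d] - b[d]);
    return Math.sqrt(sum);
  };
  const n = X.length;

  // Core distance: distance to the minSamples-th nearest neighbor (assumption: self included).
  const coreDistance = X.map((p) => {
    const sorted = X.map((q) => dist(p, q)).sort((a, b) => a - b);
    const d = sorted.length >= minSamples ? sorted[minSamples - 1] : Infinity;
    return d <= maxEps ? d : Infinity;
  });

  const reachability: number[] = X.map(() => Infinity);
  const processed: boolean[] = X.map(() => false);
  const ordering: number[] = [];

  for (let step = 0; step < n; step++) {
    // Next point: the unprocessed one with the smallest reachability (ties go to the lowest index).
    let p = -1;
    for (let i = 0; i < n; i++) {
      if (!processed[i] && (p === -1 || reachability[i] < reachability[p])) p = i;
    }
    processed[p] = true;
    ordering.push(p);
    if (coreDistance[p] === Infinity) continue; // not a core point, so it updates nobody

    // Tighten the reachability of every unprocessed neighbor of p.
    for (let q = 0; q < n; q++) {
      if (processed[q]) continue;
      const d = dist(X[p], X[q]);
      if (d > maxEps) continue;
      reachability[q] = Math.min(reachability[q], Math.max(coreDistance[p], d));
    }
  }
  return { ordering, reachability };
}

const opticsPoints = [
  [0, 0], [0, 0.1], [0.1, 0], [0.1, 0.1],
  [5, 5], [5, 5.1], [5.1, 5], [5.1, 5.1],
];
const opticsSketch = opticsOrderingSketch(opticsPoints, 3);
// Reachability values listed in processing order form the reachability plot.
console.log(opticsSketch.ordering.map((i) => opticsSketch.reachability[i]));
```

In this toy data the plot stays near 0.1 inside each group and spikes once, at the first point visited in the second group; cutting the plot at a threshold between those values separates the two clusters.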
Clusters.kmeansPlusPlus
kmeansPlusPlus(
X: number[][],
n_clusters: number,
sampleWeight?: number[],
randomState: () => number = Math.random
): { centers: number[][]; indices: number[] }
This utility initializes cluster centers using the k-means++ strategy.
const { centers } = kmeansPlusPlus(X, 3);
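For readers unfamiliar with the strategy, here is a minimal sketch of k-means++ seeding: the first center is drawn uniformly, and each later center is drawn with probability proportional to its squared distance from the nearest center chosen so far. It is an illustration rather than the library's implementation; kmeansPlusPlusSketch is a hypothetical name, sample weights are omitted, and the injectable random function only mirrors the randomState parameter in spirit.

```typescript
// Illustrative k-means++ seeding; not the library's kmeansPlusPlus source.
function kmeansPlusPlusSketch(
  X: number[][],
  nClusters: number,
  random: () => number = Math.random
): { centers: number[][]; indices: number[] } {
  const sqDist = (a: number[], b: number[]): number => {
    let sum = 0;
    for (let d = 0; d < a.length; d++) sum += (a[d] - b[d]) * (a[d] - b[d]);
    return sum;
  };

  // First center: uniform draw over the samples.
  const indices: number[] = [Math.floor(random() * X.length)];
  // nearest[i] is the squared distance from X[i] to its closest chosen center.
  const nearest = X.map((x) => sqDist(x, X[indices[0]]));

  while (indices.length < nClusters) {
    const total = nearest.reduce((s, d) => s + d, 0);
    let target = random() * total;
    // Fallback for degenerate input (for example all points identical): take the farthest point.
    let pick = nearest.indexOf(Math.max(...nearest));
    // Walk the cumulative distribution; far-away points occupy more of it.
    for (let i = 0; i < nearest.length; i++) {
      target -= nearest[i];
      if (target < 0) {
        pick = i;
        break;
      }
    }
    indices.push(pick);
    X.forEach((x, i) => {
      nearest[i] = Math.min(nearest[i], sqDist(x, X[pick]));
    });
  }
  return { centers: indices.map((i) => [...X[i]]), indices };
}

const seedPoints = [[0, 0], [0, 1], [10, 10], [10, 11], [20, 0], [20, 1]];
const seeds = kmeansPlusPlusSketch(seedPoints, 3);
console.log(seeds.indices, seeds.centers);
```

Because already-chosen points have a squared distance of zero, they take up no room in the cumulative distribution and are not drawn again unless the data is degenerate. The returned centers can be passed as initial centers to a k-means run.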