Clusters
Learn how to use Clusters algorithms in @kanaries/ml for JavaScript and TypeScript machine learning projects.
Clustering Algorithms Comparison
Compare different clustering algorithms on classic datasets. The datasets shown here are commonly used to demonstrate the strengths and weaknesses of different clustering approaches.
Algorithm Comparison Notes
K-Means
Assumes circular clusters and struggles with non-convex shapes. Works well when clusters are spherical and similar in size.
DBSCAN
Excellent for non-convex shapes and handles noise well. Requires tuning the eps (neighborhood radius) and minSamples parameters.
OPTICS
Extension of DBSCAN that works with varying densities. Creates a reachability plot for cluster extraction.
Mean Shift
Finds clusters by shifting points towards modes of the data distribution. Automatically determines cluster count.
HDBSCAN
Hierarchical extension of DBSCAN that works well with varying densities and hierarchical cluster structures.
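To see why these notes matter, it helps to have a dataset where the cluster centers coincide. The sketch below builds two concentric rings, a common demonstration shape. `makeTwoRings` is a hypothetical helper written for this illustration; it is not part of @kanaries/ml and is unrelated to the `makeCircles` call used later on this page.

```typescript
// Hypothetical helper (not part of @kanaries/ml): points evenly spaced on two concentric rings.
function makeTwoRings(
  nPerRing: number,
  innerRadius = 1,
  outerRadius = 5
): { X: number[][]; y: number[] } {
  const X: number[][] = [];
  const y: number[] = [];
  [innerRadius, outerRadius].forEach((radius, ring) => {
    for (let i = 0; i < nPerRing; i++) {
      const angle = (2 * Math.PI * i) / nPerRing;
      X.push([radius * Math.cos(angle), radius * Math.sin(angle)]);
      y.push(ring); // ground-truth ring index, useful only for checking results
    }
  });
  return { X, y };
}

// 20 points per ring, 40 points in total.
const rings = makeTwoRings(20);
```

Both rings share the centroid (0, 0), so a centroid-based method such as K-Means has no way to separate them, while a density-based method can follow each ring as long as its radius parameter is smaller than the gap between the rings.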
Clusters.KMeans
constructor(n_clusters: number = 2, opt_ratio: number = 0.05, initCenters?: number[][], max_iter: number = 30)

| props name | type | default value |
|---|---|---|
| n_clusters | number | 2 |
| opt_ratio | number | 0.05 |
| initCenters | number[][] | undefined |
| max_iter | number | 30 |
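The table does not describe how these parameters interact. As a rough mental model only, the sketch below runs a weighted Lloyd iteration from given initial centers with an iteration cap. `weightedKMeans` and `sqDist` are hypothetical names; this is not the library's implementation, and `opt_ratio` is left out because its meaning is not documented on this page.

```typescript
// Squared Euclidean distance between two points of equal dimension.
function sqDist(a: number[], b: number[]): number {
  let s = 0;
  for (let i = 0; i < a.length; i++) s += (a[i] - b[i]) ** 2;
  return s;
}

// Hypothetical sketch of weighted k-means (Lloyd's algorithm), not the @kanaries/ml code.
function weightedKMeans(
  X: number[][],
  weights: number[],
  initCenters: number[][],
  maxIter = 30
): { labels: number[]; centers: number[][] } {
  let centers = initCenters.map((c) => [...c]);
  let labels: number[] = new Array(X.length).fill(-1);
  for (let iter = 0; iter < maxIter; iter++) {
    // Assignment step: each sample goes to its nearest center.
    const next = X.map((x) => {
      let best = 0;
      for (let k = 1; k < centers.length; k++) {
        if (sqDist(x, centers[k]) < sqDist(x, centers[best])) best = k;
      }
      return best;
    });
    const changed = next.some((l, i) => l !== labels[i]);
    labels = next;
    if (!changed) break; // assignments are stable, so the centers are too
    // Update step: each center becomes the weighted mean of its members.
    centers = centers.map((c, k) => {
      const sum: number[] = new Array(c.length).fill(0);
      let total = 0;
      X.forEach((x, i) => {
        if (labels[i] !== k) return;
        total += weights[i];
        x.forEach((v, d) => (sum[d] += weights[i] * v));
      });
      return total > 0 ? sum.map((v) => v / total) : c; // keep an empty cluster's center
    });
  }
  return { labels, centers };
}

const demo = weightedKMeans(
  [[0, 0], [0.5, 0], [0.5, 1], [1, 1]],
  [3, 1, 1, 3],
  [[0, 0], [1, 1]]
);
// demo.labels  → [0, 0, 1, 1]
// demo.centers → [[0.125, 0], [0.875, 1]]
```

The heavier samples pull each center toward themselves: without weights the first center would sit at x = 0.25 rather than 0.125.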
const X = [
[0, 0],
[0.5, 0],
[0.5, 1],
[1, 1],
];
const sampleWeights = [3, 1, 1, 3];
const initCenters = [[0, 0], [1, 1]];
const kmeans = new KMeans(2, 0.05, initCenters);
const result = kmeans.fitPredict(X, sampleWeights);
Clusters.DBScan
constructor(eps: number = 0.5, minSamples: number = 5, distanceType: Distance.IDistanceType = 'euclidiean')
fitPredict(samplesX: number[][]): number[] returns cluster labels for samples. Noise points are marked as -1.
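To make the roles of eps, minSamples, and the -1 noise label concrete, here is a minimal textbook DBSCAN sketch. `dbscanSketch` and `euclidean` are hypothetical names, and the code assumes that a point counts as its own neighbor; the library may differ in that detail and in others.

```typescript
function euclidean(a: number[], b: number[]): number {
  return Math.sqrt(a.reduce((s, v, i) => s + (v - b[i]) ** 2, 0));
}

// Hypothetical textbook DBSCAN, not the @kanaries/ml implementation.
function dbscanSketch(X: number[][], eps: number, minSamples: number): number[] {
  const UNVISITED = -2;
  const NOISE = -1;
  const labels: number[] = new Array(X.length).fill(UNVISITED);
  // Indices of all points within eps of point i (including i itself).
  const neighbors = (i: number): number[] =>
    X.map((_, j) => j).filter((j) => euclidean(X[i], X[j]) <= eps);

  let cluster = 0;
  for (let i = 0; i < X.length; i++) {
    if (labels[i] !== UNVISITED) continue;
    const seeds = neighbors(i);
    if (seeds.length < minSamples) {
      labels[i] = NOISE; // may later be claimed as a border point
      continue;
    }
    labels[i] = cluster;
    const queue = seeds.filter((j) => j !== i);
    while (queue.length > 0) {
      const j = queue.shift() as number;
      if (labels[j] === NOISE) labels[j] = cluster; // border point: joins, does not expand
      if (labels[j] !== UNVISITED) continue;
      labels[j] = cluster;
      const jn = neighbors(j);
      if (jn.length >= minSamples) queue.push(...jn); // core point: keep expanding
    }
    cluster++;
  }
  return labels;
}

const dbDemo = dbscanSketch(
  [[0, 0], [0, 0.5], [0.5, 0], [0.5, 0.5], [5, 5], [5, 5.5], [5.5, 5], [5.5, 5.5], [20, 20]],
  0.8,
  3
);
// dbDemo → [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The isolated point at (20, 20) has only itself within eps, so it never reaches minSamples and stays labeled -1.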
const X = makeCircles(20, 20, 1, 5);
const dbscan = new DBScan(0.6, 3);
const labels = dbscan.fitPredict(X);
Clusters.HDBScan
constructor(
min_cluster_size: number = 5,
min_samples: number | null = null,
cluster_selection_epsilon: number = 0.5,
metric: Distance.IDistanceType = 'euclidiean'
)
fitPredict(samplesX: number[][]): number[] returns cluster labels. Noise points are marked as -1.
This is a simplified implementation that internally calls DBSCAN using cluster_selection_epsilon as the eps parameter.
const hdb = new HDBScan(5, null, 0.6);
const labels = hdb.fitPredict(X);
Clusters.MeanShift
constructor(
bandwidth: number = 1,
max_iter: number = 300,
distanceType: Distance.IDistanceType = 'euclidiean'
)
Methods:
- fitPredict(samplesX: number[][]): number[]
- getCentroids(): number[][]
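As an illustration of what bandwidth and max_iter control, the sketch below implements a flat-kernel mean shift: every point is repeatedly moved to the mean of the samples within `bandwidth` of it, and the converged positions are merged into centroids. `meanShiftSketch` is a hypothetical name, and the kernel, tolerance, and merging rule are assumptions rather than the library's documented behavior.

```typescript
// Hypothetical flat-kernel mean shift, not the @kanaries/ml implementation.
function meanShiftSketch(
  X: number[][],
  bandwidth: number,
  maxIter = 300
): { labels: number[]; centroids: number[][] } {
  const dist = (a: number[], b: number[]): number =>
    Math.sqrt(a.reduce((s, v, i) => s + (v - b[i]) ** 2, 0));

  // Shift every sample uphill until it stops moving or maxIter is reached.
  const modes = X.map((start) => {
    let p = [...start];
    for (let iter = 0; iter < maxIter; iter++) {
      const near = X.filter((x) => dist(x, p) <= bandwidth);
      if (near.length === 0) break; // defensive guard against rounding issues
      const mean = p.map((_, d) => near.reduce((s, x) => s + x[d], 0) / near.length);
      const moved = dist(mean, p);
      p = mean;
      if (moved < 1e-6) break; // assumed convergence tolerance
    }
    return p;
  });

  // Merge modes that ended up closer than one bandwidth apart.
  const centroids: number[][] = [];
  const labels = modes.map((m) => {
    let k = centroids.findIndex((c) => dist(c, m) < bandwidth);
    if (k === -1) {
      centroids.push(m);
      k = centroids.length - 1;
    }
    return k;
  });
  return { labels, centroids };
}

const msDemo = meanShiftSketch(
  [[0, 0], [0.2, 0], [0, 0.2], [5, 5], [5.2, 5], [5, 5.2]],
  2
);
// msDemo.labels → [0, 0, 0, 1, 1, 1], with two centroids near (0.067, 0.067) and (5.067, 5.067)
```

No cluster count is passed in: the number of centroids falls out of the bandwidth, which is why a bandwidth that is too large merges everything and one that is too small fragments the data.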
const ms = new MeanShift(2);
const labels = ms.fitPredict(X);
const centers = ms.getCentroids();
Clusters.OPTICS
interface OPTICSOptions {
min_samples?: number;
max_eps?: number;
metric?: Distance.IDistanceType;
p?: number;
eps?: number;
}
constructor(options: OPTICSOptions = {})
fitPredict(samplesX: number[][]): number[] returns cluster labels. Noise points are marked as -1.
const optics = new OPTICS({ eps: 0.5, min_samples: 5 });
const labels = optics.fitPredict(X);
Clusters.kmeansPlusPlus
kmeansPlusPlus(
X: number[][],
n_clusters: number,
sampleWeight?: number[],
randomState: () => number = Math.random
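): { centers: number[][]; indices: number[] }
This utility initializes cluster centers using the k-means++ strategy, which picks each new center with probability proportional to its squared distance from the nearest center already chosen.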
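The seeding idea can be sketched in a few lines. `kmeansPlusPlusSketch` below is a hypothetical, unweighted version written for illustration; it mirrors the return shape of the signature above but ignores `sampleWeight`, and it is not the library's code.

```typescript
// Hypothetical unweighted k-means++ seeding, not the @kanaries/ml implementation.
function kmeansPlusPlusSketch(
  X: number[][],
  nClusters: number,
  random: () => number = Math.random
): { centers: number[][]; indices: number[] } {
  const sqDist = (a: number[], b: number[]): number =>
    a.reduce((s, v, i) => s + (v - b[i]) ** 2, 0);

  // First center: a uniformly random sample.
  const indices: number[] = [Math.floor(random() * X.length)];
  // Squared distance from every sample to its nearest chosen center.
  const minSq = X.map((x) => sqDist(x, X[indices[0]]));

  while (indices.length < nClusters) {
    // Draw the next center with probability proportional to minSq.
    const total = minSq.reduce((s, v) => s + v, 0);
    let r = random() * total;
    let pick = X.length - 1; // fallback if rounding leaves r non-negative
    for (let i = 0; i < X.length; i++) {
      r -= minSq[i];
      if (r < 0) {
        pick = i;
        break;
      }
    }
    indices.push(pick);
    X.forEach((x, i) => {
      minSq[i] = Math.min(minSq[i], sqDist(x, X[pick]));
    });
  }
  return { centers: indices.map((i) => [...X[i]]), indices };
}

const seedPoints = [[0, 0], [0, 1], [10, 10], [10, 11]];
const seeds = kmeansPlusPlusSketch(seedPoints, 2);
```

Already-chosen samples have a squared distance of zero, so they cannot be drawn again in the normal case, and far-away samples are strongly favored. Passing a seeded generator as `random` makes the result reproducible.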
const { centers } = kmeansPlusPlus(X, 3);
How to use the Clusters module in real projects
The Clusters module groups unlabeled data so you can segment users, sessions, or events without requiring ground-truth labels.
Selection checklist
- Use KMeans when you expect compact cluster centers and can predefine cluster count.
- Use density-based methods (DBSCAN, OPTICS, HDBScan) when clusters have irregular shapes or noisy outliers.
- Validate with silhouette scores or downstream business metrics such as campaign lift by segment.
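The silhouette check in the list above can be computed directly from samples and labels. The sketch below is a plain O(n²) version; `silhouetteScore` is a hypothetical helper, not a function this page documents for @kanaries/ml. It assumes Euclidean distance and treats every distinct label as a cluster, so points labeled -1 by the density-based methods should be filtered out first.

```typescript
// Hypothetical mean silhouette coefficient; singleton clusters contribute 0.
function silhouetteScore(X: number[][], labels: number[]): number {
  const dist = (a: number[], b: number[]): number =>
    Math.sqrt(a.reduce((s, v, i) => s + (v - b[i]) ** 2, 0));
  if (new Set(labels).size < 2) {
    throw new Error('silhouette needs at least two clusters');
  }
  let total = 0;
  for (let i = 0; i < X.length; i++) {
    // Accumulate distances from sample i to the members of every cluster.
    const sums = new Map<number, { sum: number; count: number }>();
    for (let j = 0; j < X.length; j++) {
      if (i === j) continue;
      const entry = sums.get(labels[j]) ?? { sum: 0, count: 0 };
      entry.sum += dist(X[i], X[j]);
      entry.count += 1;
      sums.set(labels[j], entry);
    }
    const own = sums.get(labels[i]);
    if (!own) continue; // sample is alone in its cluster
    const a = own.sum / own.count; // mean distance within its own cluster
    let b = Infinity; // mean distance to the nearest other cluster
    sums.forEach((entry, k) => {
      if (k !== labels[i]) b = Math.min(b, entry.sum / entry.count);
    });
    const denom = Math.max(a, b);
    total += denom > 0 ? (b - a) / denom : 0;
  }
  return total / X.length;
}

const scorePoints = [[0, 0], [0, 1], [10, 0], [10, 1]];
const goodScore = silhouetteScore(scorePoints, [0, 0, 1, 1]); // compact, well-separated clusters
const badScore = silhouetteScore(scorePoints, [0, 1, 0, 1]); // clusters that cut across the gap
```

Scores close to 1 indicate tight, well-separated clusters, and negative scores indicate samples that sit closer to another cluster than to their own. Here `goodScore` is about 0.90 and `badScore` is negative.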
Common implementation workflow
- Start from a simple baseline in this module and evaluate on a holdout split.
- Compare at least one alternative algorithm from this module before locking production defaults.
- Pair model quality metrics with runtime constraints (latency, memory, bundle size).
Common search intents
- javascript clustering library
- kmeans javascript tutorial
- density clustering in typescript
FAQ
What problem does Clusters solve in JavaScript machine learning projects?
Clusters helps teams implement production-ready ML workflows in browser and Node.js environments with a familiar scikit-learn-style API.
When should I choose one Clusters algorithm over another?
Choose the algorithm that best matches your data shape, noise level, and runtime constraints. Benchmark against at least one alternative in the same module before finalizing defaults.
Can I run Clusters in both browser and Node.js with @kanaries/ml?
Yes. @kanaries/ml is designed for JavaScript and TypeScript runtimes across browser applications, server-side Node.js services, and edge-friendly workflows.