@kanaries/ml

Clusters

Learn how to use Clusters algorithms in @kanaries/ml for JavaScript and TypeScript machine learning projects.

Clustering Algorithms Comparison

Compare different clustering algorithms on classic datasets. The datasets shown here are commonly used to demonstrate the strengths and weaknesses of different clustering approaches.

Algorithm Comparison Notes

K-Means

Assumes circular clusters and struggles with non-convex shapes. Works well when clusters are spherical and similar in size.

DBSCAN

Excellent for non-convex shapes and handles noise well. Requires tuning the neighborhood radius (eps) and the minimum neighbor count (minSamples).

OPTICS

Extension of DBSCAN that works with varying densities. Creates a reachability plot for cluster extraction.

Mean Shift

Finds clusters by shifting points towards modes of the data distribution. Automatically determines cluster count.

HDBSCAN

Hierarchical extension of DBSCAN that works well with varying densities and hierarchical cluster structures.

Clusters.KMeans

constructor (n_clusters: number = 2, opt_ratio: number = 0.05, initCenters?: number[][], max_iter: number = 30)
props name     type          default value
n_clusters     number        2
opt_ratio      number        0.05
initCenters    number[][]    undefined
max_iter       number        30

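import { Clusters } from '@kanaries/ml';

// Assumes KMeans is reached through the Clusters namespace, as the heading above suggests.
const { KMeans } = Clusters;

// Four 2-D samples; the two outer points carry three times the weight of the inner ones.
const X = [
    [0, 0],
    [0.5, 0],
    [0.5, 1],
    [1, 1],
];
const sampleWeights = [3, 1, 1, 3];
const initCenters = [[0, 0], [1, 1]];

// n_clusters = 2, opt_ratio = 0.05, explicit starting centers; max_iter keeps its default of 30.
const kmeans = new KMeans(2, 0.05, initCenters);

// Fit on X using the per-sample weights.
const result = kmeans.fitPredict(X, sampleWeights);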
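
To make the roles of the sample weights and initial centers more concrete, here is a standalone sketch of weighted Lloyd iterations on the same four points. It is an illustration only, not the @kanaries/ml implementation: the function name weightedKMeansSketch is hypothetical, opt_ratio is not modeled, and the stopping rule (labels stop changing) is an assumption.

```typescript
// Standalone sketch of weighted k-means (Lloyd iterations).
// NOT the @kanaries/ml implementation; names and stopping rule are assumptions.
function squaredDistance(a: number[], b: number[]): number {
    let sum = 0;
    for (let d = 0; d < a.length; d++) sum += (a[d] - b[d]) ** 2;
    return sum;
}

function weightedKMeansSketch(
    X: number[][],
    weights: number[],
    initCenters: number[][],
    maxIter = 30
): { labels: number[]; centers: number[][] } {
    let centers = initCenters.map((c) => [...c]);
    let labels: number[] = new Array(X.length).fill(-1);

    for (let iter = 0; iter < maxIter; iter++) {
        // Assignment step: each sample goes to its nearest center.
        const next = X.map((x) => {
            let best = 0;
            for (let k = 1; k < centers.length; k++) {
                if (squaredDistance(x, centers[k]) < squaredDistance(x, centers[best])) best = k;
            }
            return best;
        });

        // Update step: each center becomes the weighted mean of its members.
        const sums = centers.map((c) => c.map(() => 0));
        const totals = centers.map(() => 0);
        X.forEach((x, i) => {
            const k = next[i];
            totals[k] += weights[i];
            x.forEach((v, d) => {
                sums[k][d] += weights[i] * v;
            });
        });
        // An empty cluster keeps its previous center.
        centers = centers.map((c, k) =>
            totals[k] > 0 ? sums[k].map((s) => s / totals[k]) : c
        );

        // Stop once the assignment no longer changes.
        const changed = next.some((l, i) => l !== labels[i]);
        labels = next;
        if (!changed) break;
    }
    return { labels, centers };
}

const samples = [[0, 0], [0.5, 0], [0.5, 1], [1, 1]];
const weightedResult = weightedKMeansSketch(samples, [3, 1, 1, 3], [[0, 0], [1, 1]]);
console.log(weightedResult.labels, weightedResult.centers);
```

In this sketch the heavier outer points pull each center toward themselves: the centers settle at [0.125, 0] and [0.875, 1] rather than at the unweighted midpoints [0.25, 0] and [0.75, 1]. The library's own output may differ in detail.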

Clusters.DBScan

constructor(eps: number = 0.5, minSamples: number = 5, distanceType: Distance.IDistanceType = 'euclidiean')

fitPredict(samplesX: number[][]): number[] returns cluster labels for samples. Noise points are marked as -1.

const X = makeCircles(20, 20, 1, 5);
const dbscan = new DBScan(0.6, 3);
const labels = dbscan.fitPredict(X);
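
The effect of eps and minSamples, and the meaning of the -1 noise label, can be seen in a small standalone sketch of the textbook DBSCAN procedure. This is not the library's code: dbscanSketch is a hypothetical name, only Euclidean distance is supported, and counting a point as its own neighbor is an assumption borrowed from scikit-learn's convention.

```typescript
// Standalone sketch of textbook DBSCAN; NOT the @kanaries/ml implementation.
function euclidean(a: number[], b: number[]): number {
    return Math.sqrt(a.reduce((s, v, i) => s + (v - b[i]) ** 2, 0));
}

function dbscanSketch(X: number[][], eps: number, minSamples: number): number[] {
    const UNVISITED = -2;
    const NOISE = -1;
    const labels: number[] = new Array(X.length).fill(UNVISITED);

    // Assumption: a point counts as its own neighbor.
    const neighborsOf = (i: number): number[] =>
        X.map((_, j) => j).filter((j) => euclidean(X[i], X[j]) <= eps);

    let cluster = 0;
    for (let i = 0; i < X.length; i++) {
        if (labels[i] !== UNVISITED) continue;
        const seeds = neighborsOf(i);
        if (seeds.length < minSamples) {
            labels[i] = NOISE; // may later be claimed as a border point
            continue;
        }
        // i is a core point: start a new cluster and grow it.
        labels[i] = cluster;
        const queue = seeds.filter((j) => j !== i);
        while (queue.length > 0) {
            const j = queue.shift() as number;
            if (labels[j] === NOISE) labels[j] = cluster; // border point, not expanded
            if (labels[j] !== UNVISITED) continue;
            labels[j] = cluster;
            const reachable = neighborsOf(j);
            if (reachable.length >= minSamples) queue.push(...reachable); // j is also core
        }
        cluster++;
    }
    return labels;
}

// Two tight groups and one isolated point.
const points = [[0, 0], [0, 0.3], [0.3, 0], [5, 5], [5, 5.3], [5.3, 5], [10, 0]];
const sketchLabels = dbscanSketch(points, 0.5, 3);
console.log(sketchLabels);
```

With eps = 0.5 and minSamples = 3 the sketch labels the two groups 0 and 1 and marks the isolated point as -1. Raising minSamples to 4 would turn every point into noise, which is why these two parameters need tuning together.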

Clusters.HDBScan

constructor(
    min_cluster_size: number = 5,
    min_samples: number | null = null,
    cluster_selection_epsilon: number = 0.5,
    metric: Distance.IDistanceType = 'euclidiean'
)

fitPredict(samplesX: number[][]): number[] returns cluster labels. Noise points are marked as -1.

This is a simplified implementation that internally calls DBSCAN using cluster_selection_epsilon as the eps parameter.

const hdb = new HDBScan(5, null, 0.6);
const labels = hdb.fitPredict(X);

Clusters.MeanShift

constructor(
    bandwidth: number = 1,
    max_iter: number = 300,
    distanceType: Distance.IDistanceType = 'euclidiean'
)

Methods:

  • fitPredict(samplesX: number[][]): number[]
  • getCentroids(): number[][]
const ms = new MeanShift(2);
const labels = ms.fitPredict(X);
const centers = ms.getCentroids();
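
As a rough picture of what bandwidth and max_iter control, the following standalone sketch implements mean shift with a flat kernel: each point is repeatedly moved to the mean of the samples within bandwidth of it, and the converged positions are then merged into centroids. It is not the library's implementation; meanShiftSketch, the convergence tolerance, and the merge rule (modes closer than bandwidth are treated as one) are all assumptions.

```typescript
// Standalone sketch of flat-kernel mean shift; NOT the @kanaries/ml implementation.
function pointDistance(a: number[], b: number[]): number {
    return Math.sqrt(a.reduce((s, v, i) => s + (v - b[i]) ** 2, 0));
}

function meanShiftSketch(
    X: number[][],
    bandwidth: number,
    maxIter = 300
): { labels: number[]; centroids: number[][] } {
    // Shift every sample uphill until it stops moving.
    const modes = X.map((start) => {
        let p = [...start];
        for (let iter = 0; iter < maxIter; iter++) {
            const inWindow = X.filter((x) => pointDistance(x, p) <= bandwidth);
            if (inWindow.length === 0) break; // defensive guard
            const mean = p.map(
                (_, d) => inWindow.reduce((s, x) => s + x[d], 0) / inWindow.length
            );
            const moved = pointDistance(mean, p);
            p = mean;
            if (moved < 1e-6) break; // assumed tolerance
        }
        return p;
    });

    // Merge modes that ended up within one bandwidth of each other.
    const centroids: number[][] = [];
    const labels = modes.map((m) => {
        let k = centroids.findIndex((c) => pointDistance(c, m) < bandwidth);
        if (k === -1) {
            centroids.push(m);
            k = centroids.length - 1;
        }
        return k;
    });
    return { labels, centroids };
}

const blobs = [[0, 0], [0.2, 0], [0, 0.2], [5, 5], [5.2, 5], [5, 5.2]];
const shifted = meanShiftSketch(blobs, 1);
console.log(shifted.labels, shifted.centroids);
```

Note that the cluster count is never passed in: the two centroids emerge from the data and the bandwidth alone. A much larger bandwidth (for example 10) would merge both groups into a single cluster.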

Clusters.OPTICS

interface OPTICSOptions {
    min_samples?: number;
    max_eps?: number;
    metric?: Distance.IDistanceType;
    p?: number;
    eps?: number;
}
constructor(options: OPTICSOptions = {})

fitPredict(samplesX: number[][]): number[] returns cluster labels. Noise points are marked as -1.

const optics = new OPTICS({ eps: 0.5, min_samples: 5 });
const labels = optics.fitPredict(X);

Clusters.kmeansPlusPlus

kmeansPlusPlus(
    X: number[][],
    n_clusters: number,
    sampleWeight?: number[],
    randomState: () => number = Math.random
): { centers: number[][]; indices: number[] }

This utility initializes cluster centers using the k-means++ strategy.

const { centers } = kmeansPlusPlus(X, 3);
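
For readers unfamiliar with the k-means++ strategy, the sketch below shows the usual D² sampling idea: the first center is drawn uniformly, and each later center is drawn with probability proportional to its squared distance from the nearest center chosen so far. It is a simplified stand-in, not the library's code: kmeansPlusPlusSketch is a hypothetical name, sample weights are omitted, and the injectable random function merely mirrors the randomState parameter in the signature above.

```typescript
// Standalone sketch of k-means++ seeding (D^2 sampling); NOT the @kanaries/ml implementation.
function kmeansPlusPlusSketch(
    X: number[][],
    nClusters: number,
    random: () => number = Math.random
): { centers: number[][]; indices: number[] } {
    const sq = (a: number[], b: number[]): number =>
        a.reduce((s, v, i) => s + (v - b[i]) ** 2, 0);

    // First center: uniform draw.
    const indices: number[] = [Math.floor(random() * X.length)];
    // closest[i] = squared distance from X[i] to its nearest chosen center.
    const closest = X.map((x) => sq(x, X[indices[0]]));

    while (indices.length < nClusters) {
        const total = closest.reduce((s, d) => s + d, 0);
        let r = random() * total;
        let pick = closest.length - 1; // fallback for floating-point edge cases
        for (let i = 0; i < closest.length; i++) {
            r -= closest[i];
            if (r < 0) {
                pick = i;
                break;
            }
        }
        indices.push(pick);
        // Refresh nearest-center distances with the new center.
        closest.forEach((d, i) => {
            closest[i] = Math.min(d, sq(X[i], X[pick]));
        });
    }
    return { centers: indices.map((i) => [...X[i]]), indices };
}

// A fixed "random" sequence makes the run reproducible.
const sequence = [0.0, 0.5, 0.9];
let draw = 0;
const fixedRandom = (): number => sequence[draw++ % sequence.length];

const data = [[0, 0], [0, 1], [10, 10], [10, 11], [20, 0]];
const seeded = kmeansPlusPlusSketch(data, 3, fixedRandom);
console.log(seeded.indices, seeded.centers);
```

Because already-chosen points have distance zero, they carry no probability mass, so the seeds tend to spread across distinct regions of the data. Passing a deterministic function in place of Math.random, as done here, is a simple way to get repeatable initializations in tests.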

How to use the Clusters module in real projects

The Clusters module groups unlabeled data so you can segment users, sessions, or events without requiring ground-truth labels.

Selection checklist

  1. Use KMeans when you expect compact cluster centers and can predefine cluster count.
  2. Use density-based methods (DBSCAN, OPTICS, HDBSCAN) when clusters have irregular shapes or the data contains noisy outliers.
  3. Validate with silhouette scores or downstream business metrics such as campaign lift by segment.
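
For the silhouette check in the last item, a minimal standalone sketch is shown below. It assumes Euclidean distance and that noise points (label -1) have already been filtered out; silhouetteScoreSketch is a hypothetical helper, not a function documented on this page.

```typescript
// Standalone sketch of the mean silhouette coefficient.
// Assumptions: Euclidean distance; noise points (-1) removed beforehand.
function silhouetteScoreSketch(X: number[][], labels: number[]): number {
    const dist = (a: number[], b: number[]): number =>
        Math.sqrt(a.reduce((s, v, i) => s + (v - b[i]) ** 2, 0));
    const clusterIds = Array.from(new Set(labels));

    const scores = X.map((x, i) => {
        // Mean distance from sample i to the members of cluster k (excluding i itself).
        const meanDistTo = (k: number): number => {
            const members = X.filter((_, j) => labels[j] === k && j !== i);
            return members.reduce((s, m) => s + dist(x, m), 0) / members.length;
        };
        const ownSize = labels.filter((l) => l === labels[i]).length;
        if (ownSize === 1) return 0; // usual convention for singleton clusters

        const a = meanDistTo(labels[i]); // cohesion
        const b = Math.min(
            ...clusterIds.filter((k) => k !== labels[i]).map((k) => meanDistTo(k))
        ); // separation from the nearest other cluster
        const denom = Math.max(a, b);
        return denom === 0 ? 0 : (b - a) / denom;
    });
    return scores.reduce((s, v) => s + v, 0) / scores.length;
}

const grid = [[0, 0], [0, 1], [10, 0], [10, 1]];
const goodScore = silhouetteScoreSketch(grid, [0, 0, 1, 1]); // groups nearby points
const badScore = silhouetteScoreSketch(grid, [0, 1, 0, 1]); // splits nearby points
console.log(goodScore, badScore);
```

Scores near 1 indicate compact, well-separated clusters, and negative scores suggest samples sit closer to another cluster than to their own. The score needs at least two clusters, and it tends to favor convex clusters, so a density-based result on irregular shapes can score lower than it deserves.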

Common implementation workflow

  1. Start from a simple baseline in this module and evaluate on a holdout split.
  2. Compare at least one alternative algorithm from this module before locking production defaults.
  3. Pair model quality metrics with runtime constraints (latency, memory, bundle size).

FAQ

What problem does Clusters solve in JavaScript machine learning projects?

Clusters groups unlabeled data, such as users, sessions, or events, without requiring ground-truth labels. It runs in browser and Node.js environments and exposes a familiar scikit-learn-style API.

When should I choose one Clusters algorithm over another?

Choose the algorithm that best matches your data shape and runtime constraints: KMeans for compact clusters with a known cluster count, and density-based methods for irregular shapes or noisy data. Benchmark against at least one alternative in the same module before finalizing defaults.

Can I run Clusters in both browser and Node.js with @kanaries/ml?

Yes. @kanaries/ml is designed for JavaScript and TypeScript runtimes across browser applications, server-side Node.js services, and edge-friendly workflows.