Module

pycaret.plots.clustering

Plotly-native clustering diagnostics.

Functions:

  • elbow_curve — sum-of-squared-distances vs. k for KMeans-family models.
  • silhouette_curve — silhouette score vs. k.
  • silhouette_plot — per-sample silhouette for the chosen model.
  • cluster_distribution — bar of cluster sizes.
  • embedding_2d — t-SNE / UMAP / PCA scatter colored by cluster.

On screen, the "Clustering" tab in the Run-detail view shows the elbow and silhouette curves side by side for tuning k, followed by embedding_2d for the chosen model.

Functions

elbow_curve(estimator: Any, X: Any, k_range: range | list[int] | None = None, title: str | None = 'Elbow curve') -> go.Figure

Sum-of-squared-distances vs. k. The "elbow" is where adding clusters stops meaningfully reducing inertia — pick k there.

Works on any clusterer that accepts n_clusters and exposes inertia_ after fit (KMeans-family).
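
The computation behind the curve can be sketched as follows. This is a minimal illustration using scikit-learn directly, not the module's actual implementation; the helper name inertia_by_k and the synthetic three-blob data are assumptions made for the example.

```python
import numpy as np
from sklearn.base import clone
from sklearn.cluster import KMeans


def inertia_by_k(estimator, X, k_range):
    """Refit a fresh clone of `estimator` for each k and collect inertia_.

    Hypothetical helper: it only requires that the estimator accepts
    n_clusters and exposes inertia_ after fit.
    """
    inertias = []
    for k in k_range:
        model = clone(estimator).set_params(n_clusters=k)
        model.fit(X)
        inertias.append(float(model.inertia_))
    return inertias


# Synthetic data: three well-separated 2-D blobs of 50 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

ks = list(range(1, 7))
inertias = inertia_by_k(KMeans(n_init=10, random_state=0), X, ks)

# Inertia drops sharply up to the true number of blobs, then flattens.
for k, sse in zip(ks, inertias):
    print(k, round(sse, 1))
```

Cloning the estimator for each k keeps the caller's object unfitted and unchanged, which matters if the same estimator is reused for other plots.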

silhouette_curve(estimator: Any, X: Any, k_range: range | list[int] | None = None, sample: int | None = 1000, random_state: int = 0, title: str | None = 'Silhouette score vs. k') -> go.Figure

Mean silhouette score vs. k. The score ranges from -1 to 1; higher is better.
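
As a rough sketch of what such a curve computes, the snippet below scores each k with scikit-learn's silhouette_score, passing a subsample size in the spirit of the sample parameter. The helper silhouette_by_k and the test data are hypothetical, and the way the real function subsamples may differ.

```python
import numpy as np
from sklearn.base import clone
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def silhouette_by_k(estimator, X, k_range, sample=1000, random_state=0):
    """Mean silhouette for each k; hypothetical helper for illustration."""
    scores = []
    for k in k_range:
        labels = clone(estimator).set_params(n_clusters=k).fit_predict(X)
        # Only subsample when the data is larger than the requested sample.
        size = None if sample is None or sample >= len(X) else sample
        score = silhouette_score(
            X, labels, sample_size=size, random_state=random_state
        )
        scores.append(float(score))
    return scores


# Synthetic data: three well-separated 2-D blobs of 50 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

# The silhouette is undefined for a single cluster, so start at k=2.
ks = list(range(2, 7))
scores = silhouette_by_k(KMeans(n_init=10, random_state=0), X, ks)
best_k = ks[int(np.argmax(scores))]
print(best_k)
```

Unlike the elbow curve, this one has a clear maximum to read off, which is why the two are useful side by side.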

silhouette_plot(estimator: Any, X: Any, sample: int | None = 1000, random_state: int = 0, title: str | None = 'Per-cluster silhouette') -> go.Figure

Per-sample silhouette grouped by cluster label.

A good clustering shows similar widths across clusters and a high average silhouette. Samples with a negative silhouette are, on average, closer to a neighboring cluster than to their own and are likely assigned to the wrong cluster.
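
The per-sample values behind such a plot can be obtained with scikit-learn's silhouette_samples. The sketch below is an illustration under assumed synthetic data, not the module's code; it groups the values by cluster label and counts the negative ones.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_samples

# Synthetic data: three well-separated 2-D blobs of 50 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
values = silhouette_samples(X, labels)  # one silhouette value per sample

# Group by cluster: sorted values form one "blade" of the plot per cluster.
per_cluster = {}
for label in np.unique(labels):
    cluster_values = np.sort(values[labels == label])
    per_cluster[int(label)] = cluster_values
    n_negative = int((cluster_values < 0).sum())
    print(int(label), len(cluster_values), n_negative)
```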

cluster_distribution(estimator: Any, X: Any | None = None, title: str | None = 'Cluster sizes') -> go.Figure

Bar chart of how many samples fall into each cluster.
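
The counts behind the bars reduce to a tally of cluster labels. The sketch below assumes the labels are available as an integer array (for example from a fitted estimator's labels_ attribute, which is an assumption about how the function obtains them); cluster_sizes is a hypothetical helper.

```python
import numpy as np


def cluster_sizes(labels):
    """Map each cluster label to its sample count (hypothetical helper)."""
    unique, counts = np.unique(np.asarray(labels), return_counts=True)
    return {int(u): int(c) for u, c in zip(unique, counts)}


# -1 stands for a "noise" label, as produced by density-based clusterers.
labels = [0, 0, 1, 2, 2, 2, -1]
sizes = cluster_sizes(labels)
print(sizes)  # {-1: 1, 0: 2, 1: 1, 2: 3}
```

np.unique is used rather than np.bincount because bincount rejects negative labels.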

embedding_2d(estimator: Any, X: Any, method: str = 'pca', sample: int | None = 1500, random_state: int = 0, title: str | None = None) -> go.Figure

2-D embedding of X colored by the clusterer's labels.

Methods:

  • pca — sklearn PCA (fast, deterministic, no extra dep).
  • tsne — sklearn TSNE (slower, captures non-linear structure).
  • umap — only if umap-learn is installed.
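
One way to dispatch between these methods, with UMAP as an optional dependency, is sketched below. The helpers project_2d and subsample are hypothetical and only illustrate the pattern; the module's own handling of method and sample may differ.

```python
import numpy as np
from sklearn.decomposition import PCA


def subsample(X, labels, sample=1500, random_state=0):
    """Randomly keep at most `sample` rows, with their labels."""
    if sample is None or sample >= len(X):
        return X, labels
    rng = np.random.default_rng(random_state)
    idx = rng.choice(len(X), size=sample, replace=False)
    return X[idx], labels[idx]


def project_2d(X, method="pca", random_state=0):
    """Project X to two dimensions with the requested method."""
    if method == "pca":
        return PCA(n_components=2, random_state=random_state).fit_transform(X)
    if method == "tsne":
        # Imported lazily; TSNE needs more samples than its perplexity.
        from sklearn.manifold import TSNE
        return TSNE(n_components=2, random_state=random_state).fit_transform(X)
    if method == "umap":
        try:
            import umap  # provided by the optional umap-learn package
        except ImportError as exc:
            raise ImportError("method='umap' requires umap-learn") from exc
        reducer = umap.UMAP(n_components=2, random_state=random_state)
        return reducer.fit_transform(X)
    raise ValueError(f"unknown method: {method!r}")


# Synthetic 5-D data with placeholder labels, reduced to 200 rows.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
labels = rng.integers(0, 3, size=500)

X_small, labels_small = subsample(X, labels, sample=200)
coords = project_2d(X_small, method="pca")
print(coords.shape)  # (200, 2)
```

Importing umap inside the branch means the other two methods work even when umap-learn is absent, and the error message appears only when UMAP is actually requested.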