Add clustering evaluation with color toggle, ARI/NMI, and cluster analysis #23
NetZissou wants to merge 1 commit into fix/precalculated-tooltip-null
…r analysis

- Preserve KMeans labels in "Use column values" mode (no longer overwritten)
- Color toggle: switch between KMeans clusters and ground truth column in plot
- ARI/NMI metrics computed automatically for evaluation mode
- Cluster analysis tree: full-width console output with per-cluster purity, entropy, and ranked breakdown by ground truth category
- Cardinality cap at 16: blocks clustering with error when column has too many unique values to avoid misleading duplicate colors
- Remove "Label by column" mode (redundant with color toggle)
- Clean up unused dim-reduction-only methods from ClusteringService

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
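For readers unfamiliar with the per-cluster statistics named in the commit message, here is a minimal sketch of how purity, entropy, and a ranked ground-truth breakdown could be computed. The function name `cluster_breakdown` and the report layout are hypothetical; the PR's actual implementation is not shown on this page and may differ.

```python
import math
from collections import Counter, defaultdict


def cluster_breakdown(cluster_labels, truth_labels):
    """Per-cluster size, purity, entropy (in bits), and ranked category counts.

    Hypothetical helper for illustration; not taken from the PR.
    """
    if len(cluster_labels) != len(truth_labels):
        raise ValueError("label sequences must have the same length")

    # Count ground-truth categories inside each cluster.
    members = defaultdict(Counter)
    for cluster, truth in zip(cluster_labels, truth_labels):
        members[cluster][truth] += 1

    report = {}
    for cluster, counts in members.items():
        size = sum(counts.values())
        # Purity: share of the cluster taken by its most common category.
        purity = max(counts.values()) / size
        # Shannon entropy of the category distribution; 0.0 for a pure cluster.
        entropy = sum((n / size) * math.log2(size / n) for n in counts.values())
        report[cluster] = {
            "size": size,
            "purity": purity,
            "entropy": entropy,
            "ranked": counts.most_common(),  # most frequent category first
        }
    return report


# Toy data: cluster 0 is mostly "cat", cluster 1 is purely "dog".
clusters = [0, 0, 0, 0, 1, 1]
species = ["cat", "cat", "cat", "dog", "dog", "dog"]
report = cluster_breakdown(clusters, species)
```

A pure cluster has purity 1.0 and entropy 0.0; lower purity and higher entropy indicate that a cluster mixes several ground-truth categories.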
It'd be great if we could color by other column values too -- maybe a dropdown? Let's use a color scheme with 20 colors and just show a warning when colors are duplicated, instead of blocking clustering based on the number of distinct values.
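A rough sketch of the suggestion above, assuming a 20-color palette that wraps around and emits a warning rather than an error. The names `build_palette` and `assign_colors` are hypothetical, and the evenly spaced hues are only a placeholder for whatever 20-color scheme the project picks.

```python
import colorsys

PALETTE_SIZE = 20  # palette size proposed in the comment above


def build_palette(n=PALETTE_SIZE):
    """Placeholder palette: n evenly spaced hues as hex strings (an assumption)."""
    colors = []
    for i in range(n):
        r, g, b = colorsys.hsv_to_rgb(i / n, 0.65, 0.9)
        colors.append(
            "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))
        )
    return colors


def assign_colors(values, palette):
    """Map each distinct value to a color, cycling when the palette runs out.

    Returns (mapping, warning); warning is None when no color is reused.
    """
    distinct = sorted(set(values), key=str)
    mapping = {value: palette[i % len(palette)] for i, value in enumerate(distinct)}
    warning = None
    if len(distinct) > len(palette):
        warning = (
            f"{len(distinct)} distinct values but only {len(palette)} colors; "
            "some categories share a color."
        )
    return mapping, warning


palette = build_palette()
few_mapping, few_warning = assign_colors(["a", "b", "c"], palette)
many_mapping, many_warning = assign_colors(list(range(25)), palette)
```

With three distinct values there is no warning; with 25 distinct values every value still gets a color, but five colors are reused and the caller receives a message it can surface in the UI.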
Closing this PR in favor of a redesigned approach discussed in a follow-up meeting.

What's changing: The current design couples dim reduction and KMeans into a single "Run Clustering" button with mode switches. The redesign separates them into independent operations:
This better reflects the actual data flow (both operations independently consume the same full-size embeddings) and makes the UI more intuitive.

The evaluation features from this PR (ARI/NMI, cluster analysis tree, color comparison) will carry forward into the new implementation -- just triggered by the general color-by selector rather than a dedicated mode.

PR #22 (bug fixes) remains valid and should merge first.
Summary
Adds evaluation capabilities to the "Use column values" clustering mode, allowing users to assess how well KMeans clusters align with known categorical labels.
run_dim_reduction/run_dim_reduction_safe from ClusteringService
Depends on #22
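To make the ARI/NMI evaluation described in the summary concrete, here is a dependency-free sketch of both metrics computed from cluster labels and ground-truth labels. This is illustrative only: the page does not show how the PR computes them, and in practice scikit-learn's `adjusted_rand_score` and `normalized_mutual_info_score` would be the usual choice. The function names below are hypothetical, and the NMI uses the arithmetic mean of the two entropies as its normalizer, which is an assumption.

```python
import math
from collections import Counter


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index from pair counts over the contingency table."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)
    cells = Counter(zip(labels_a, labels_b))
    rows = Counter(labels_a)
    cols = Counter(labels_b)
    sum_cells = sum(math.comb(c, 2) for c in cells.values())
    sum_rows = sum(math.comb(c, 2) for c in rows.values())
    sum_cols = sum(math.comb(c, 2) for c in cols.values())
    total = math.comb(n, 2)
    if total == 0:
        return 1.0  # fewer than two samples: treat as trivially identical
    expected = sum_rows * sum_cols / total
    maximum = (sum_rows + sum_cols) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings are one cluster
    return (sum_cells - expected) / (maximum - expected)


def normalized_mutual_info(labels_a, labels_b):
    """Mutual information divided by the arithmetic mean of the entropies."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)
    cells = Counter(zip(labels_a, labels_b))
    rows = Counter(labels_a)
    cols = Counter(labels_b)
    h_a = sum((c / n) * math.log(n / c) for c in rows.values())
    h_b = sum((c / n) * math.log(n / c) for c in cols.values())
    if h_a == 0 and h_b == 0:
        return 1.0  # both labelings are a single cluster
    mi = sum(
        (c / n) * math.log(n * c / (rows[a] * cols[b]))
        for (a, b), c in cells.items()
    )
    return mi / ((h_a + h_b) / 2)


kmeans_labels = [0, 0, 1, 1]
ground_truth = ["x", "x", "y", "y"]
ari = adjusted_rand_index(kmeans_labels, ground_truth)
nmi = normalized_mutual_info(kmeans_labels, ground_truth)
```

Both metrics ignore the label names themselves, so a clustering that matches the ground truth up to renaming scores 1. ARI is about 0 for random assignments and can go negative, while NMI is 0 when the two labelings share no information.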
Generated with Claude Code