
Conversation

@Nischal1729
Collaborator

@Nischal1729 Nischal1729 commented Sep 15, 2025

fix: for balanced kmeans use grid.x for adjust_centers to avoid grid.y overflow for large n_clusters

Problem: adjust_centers is launched with grid.y = ceil(n_clusters/4), which exceeds CUDA's 65,535 limit on the grid Y dimension once n_clusters > 262,140 (4 × 65,535), e.g., 263k and 1M.
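The threshold quoted above can be checked with a few lines of host-side arithmetic. The following is a minimal sketch, not code from the PR: the names `kMaxGridX`, `kMaxGridY`, `kClustersPerBlock`, `blocks_needed`, `fits_in_grid_y` and `fits_in_grid_x` are hypothetical, and the factor of 4 is taken from the `ceil(n_clusters/4)` expression in the description. The grid limits are the ones CUDA documents for compute capability 3.0 and later.

```cpp
#include <cstdint>

// CUDA per-dimension grid limits (compute capability 3.0 and later).
constexpr std::int64_t kMaxGridX = 2147483647;  // 2^31 - 1
constexpr std::int64_t kMaxGridY = 65535;

// Assumption from the PR text: each block handles 4 clusters.
constexpr std::int64_t kClustersPerBlock = 4;

// Number of blocks needed to cover n_clusters, i.e. ceil(n_clusters / 4).
constexpr std::int64_t blocks_needed(std::int64_t n_clusters) {
  return (n_clusters + kClustersPerBlock - 1) / kClustersPerBlock;
}

// True if the block count is a valid grid.y extent.
constexpr bool fits_in_grid_y(std::int64_t n_clusters) {
  return blocks_needed(n_clusters) <= kMaxGridY;
}

// True if the block count is a valid grid.x extent.
constexpr bool fits_in_grid_x(std::int64_t n_clusters) {
  return blocks_needed(n_clusters) <= kMaxGridX;
}

// 262,140 is the largest n_clusters whose block count still fits in grid.y.
static_assert(fits_in_grid_y(262140), "262,140 clusters need 65,535 blocks");
static_assert(!fits_in_grid_y(262141), "one more cluster needs 65,536 blocks");

// The sizes mentioned in the PR: 262k fits, 263k and 1M do not.
static_assert(fits_in_grid_y(262000), "262k fits in grid.y");
static_assert(!fits_in_grid_y(263000), "263k overflows grid.y");
static_assert(!fits_in_grid_y(1000000), "1M overflows grid.y");
static_assert(fits_in_grid_x(1000000), "1M fits comfortably in grid.x");
```

The checks run at compile time, so building the file is enough to confirm the 262,140 boundary.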

Fix: enumerate blocks along grid.x (limit 2^31 − 1) and compute l from blockIdx.x instead of blockIdx.y. No algorithmic or performance changes; the fix only prevents the invalid launch configuration.
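The index remapping can be modelled on the host as follows. This is a hedged sketch under assumptions: the actual kernel is not shown in this conversation, so the `Dim3` struct (a stand-in for CUDA's built-in `threadIdx`/`blockIdx`), the constant `kBlockDimY`, the functions `cluster_index_old`, `cluster_index_new` and `in_range`, and the idea that `threadIdx.y` selects one of the 4 clusters in a block are all hypothetical illustrations of the described change.

```cpp
#include <cstdint>

// Hypothetical stand-in for CUDA's built-in threadIdx / blockIdx values.
struct Dim3 {
  std::uint32_t x, y, z;
};

// Assumption from "ceil(n_clusters/4)": each block covers 4 clusters.
constexpr std::int64_t kBlockDimY = 4;

// Before the fix (assumed form): the block number is read from blockIdx.y,
// so the launch needs grid.y = ceil(n_clusters / 4), capped at 65,535.
constexpr std::int64_t cluster_index_old(Dim3 thread_idx, Dim3 block_idx) {
  return thread_idx.y + kBlockDimY * static_cast<std::int64_t>(block_idx.y);
}

// After the fix (assumed form): the same block number is read from
// blockIdx.x, whose limit is 2^31 - 1.
constexpr std::int64_t cluster_index_new(Dim3 thread_idx, Dim3 block_idx) {
  return thread_idx.y + kBlockDimY * static_cast<std::int64_t>(block_idx.x);
}

// Because the block count is rounded up, a kernel of this shape would still
// need to skip indices past the last cluster.
constexpr bool in_range(std::int64_t l, std::int64_t n_clusters) {
  return l < n_clusters;
}

// Block 7 maps to the same cluster index under either layout.
static_assert(cluster_index_old(Dim3{0, 1, 0}, Dim3{0, 7, 0}) ==
                  cluster_index_new(Dim3{0, 1, 0}, Dim3{7, 0, 0}),
              "the remapping does not change which cluster a thread handles");

// With 1M clusters the last block is number 249,999, valid only along grid.x.
static_assert(cluster_index_new(Dim3{0, 3, 0}, Dim3{249999, 0, 0}) == 999999,
              "the last thread row of the last block handles cluster 999,999");
```

Only the grid dimension carrying the block number changes in this model; the cluster index each thread computes is identical, which is consistent with the description's claim that the algorithm is unaffected.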

Repro: n_clusters = 262k works and 263k fails before the fix; both work after it. Training with 1M centroids now proceeds past the balancing step.

Impact: no regression for configurations that already worked; removes the hard cap of 262,140 on n_clusters.

@Nischal1729 Nischal1729 added and removed the bug ("Something isn't working") label Sep 15, 2025
@abhinavdangeti abhinavdangeti changed the title fix: for balanced kmeans use grid.x for adjust_centers to avoid grid.… fix: for balanced kmeans use grid.x for adjust_centers to avoid grid.y overflow Sep 16, 2025
@abhinavdangeti
Member

@Nischal1729 It seems the original repository would benefit from this fix. I would recommend raising the same PR against rapidsai/cuvs so the authors can review it as well.

2 participants