[DOC] Improve documentation of DBSCAN memory use #28493

Merged: 1 commit, Feb 21, 2024

kno10
Contributor

@kno10 kno10 commented Feb 21, 2024

Original DBSCAN only queries one point at a time.
It is a scikit-learn limitation that the bulk query may use quadratic memory.
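To illustrate the difference, here is a minimal sketch of a DBSCAN that issues one radius query at a time, so that only a single neighborhood is held in memory alongside the labels and the seed list. It is not the scikit-learn implementation; the function name `dbscan_one_query_at_a_time` and the `UNVISITED` marker are made up for this example, and a `KDTree` is assumed as the index.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.neighbors import KDTree


def dbscan_one_query_at_a_time(X, eps, min_samples):
    """Hypothetical sketch: DBSCAN with one radius query per point."""
    UNVISITED, NOISE = -2, -1
    tree = KDTree(X)
    labels = np.full(len(X), UNVISITED)
    cluster = 0
    for i in range(len(X)):
        if labels[i] != UNVISITED:
            continue
        # A single query; the result is discarded before the next one.
        neighbors = tree.query_radius(X[i:i + 1], r=eps)[0]
        if len(neighbors) < min_samples:
            labels[i] = NOISE  # may later turn out to be a border point
            continue
        labels[i] = cluster
        seeds = [i]  # i is queried once more below, kept for simplicity
        while seeds:
            j = seeds.pop()
            nb = tree.query_radius(X[j:j + 1], r=eps)[0]
            if len(nb) < min_samples:
                continue  # border point: belongs to the cluster, no expansion
            for k in nb:
                if labels[k] == UNVISITED:
                    labels[k] = cluster
                    seeds.append(k)  # each point is pushed at most once
                elif labels[k] == NOISE:
                    labels[k] = cluster  # noise reached by a core point
        cluster += 1
    return labels


# Compare against scikit-learn's bulk-query implementation on toy data.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=0)
labels_single = dbscan_one_query_at_a_time(X, eps=0.5, min_samples=5)
labels_sklearn = DBSCAN(eps=0.5, min_samples=5).fit(X).labels_
same_labels = bool(np.array_equal(labels_single, labels_sklearn))
```

Because every point enters the seed list at most once, the extra memory beyond the index is linear in the number of points. The price is a Python-level loop, which is much slower than the vectorized bulk query.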

A more accurate description of the memory use already appears further down, in the Notes section of the docstring:

This implementation bulk-computes all neighborhood queries, which increases
the memory complexity to O(n.d) where d is the average number of neighbors,
while original DBSCAN had memory complexity O(n). It may attract a higher
memory complexity when querying these nearest neighborhoods, depending
on the ``algorithm``.
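The same Notes section suggests one way around the query overhead: precompute the sparse neighborhoods with `NearestNeighbors.radius_neighbors_graph` using `mode="distance"`, then pass the result to `DBSCAN` with `metric="precomputed"`. A small sketch of that approach follows; the toy data and the `eps` and `min_samples` values are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.neighbors import NearestNeighbors

# Toy data; the parameters below are illustrative, not recommendations.
X, _ = make_blobs(n_samples=500, centers=3, cluster_std=0.6, random_state=0)
eps, min_samples = 0.5, 5

# Sparse graph holding only the distances to neighbors within eps.
nn = NearestNeighbors(radius=eps).fit(X)
graph = nn.radius_neighbors_graph(X, mode="distance")

# DBSCAN reads the neighborhoods from the precomputed sparse graph.
labels_sparse = DBSCAN(
    eps=eps, min_samples=min_samples, metric="precomputed"
).fit(graph).labels_

# Reference run that lets DBSCAN perform the bulk query itself.
labels_direct = DBSCAN(eps=eps, min_samples=min_samples).fit(X).labels_
same_labels = bool(np.array_equal(labels_sparse, labels_direct))
```

The graph still stores every neighborhood, so this does not go below the O(n.d) noted above. What it offers is control over how the neighborhoods are computed: for larger data the graph could be built a block of rows at a time and stacked, rather than in one call as shown here.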

Funnily enough, the incorrect "DBSCAN needs quadratic memory" claim was only introduced later, in #26783.


✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: f836eac.

Member

@jeremiedbb jeremiedbb left a comment

LGTM, thanks @kno10

@jeremiedbb jeremiedbb merged commit e318019 into scikit-learn:main Feb 21, 2024