DOC Update _hdbscan/_linkage.pyx with new inline comments #25656


Closed
wants to merge 1 commit into from
15 changes: 13 additions & 2 deletions sklearn/cluster/_hdbscan/_linkage.pyx
@@ -42,7 +42,7 @@ cpdef cnp.ndarray[MST_edge_t, ndim=1, mode='c'] mst_from_mutual_reachability(
-------
mst : ndarray of shape (n_samples - 1,), dtype=MST_edge_dtype
The MST representation of the mutual-reahability graph. The MST is
- represented as a collecteion of edges.
+ represented as a collection of edges.
"""
cdef:
# Note: we utilize ndarray's over memory-views to make use of numpy
@@ -59,16 +59,28 @@ cpdef cnp.ndarray[MST_edge_t, ndim=1, mode='c'] mst_from_mutual_reachability(
mst = np.empty(n_samples - 1, dtype=MST_edge_dtype)
current_labels = np.arange(n_samples, dtype=np.int64)
current_node = 0
# Contains the minimum reachability of points to the built tree. This is
# iteratively updated with each node we add.
min_reachability = np.full(n_samples, fill_value=np.infty, dtype=np.float64)
for i in range(0, n_samples - 1):
# Sub-select nodes not-yet in the tree
Member

Can this be communicated with a better variable name for label_filter?

label_filter = current_labels != current_node
current_labels = current_labels[label_filter]

# Compute the nodes' current min-reachability scores
left = min_reachability[label_filter]
# Compute the nodes' mutual-reachability to current node
right = mutual_reachability[current_node][current_labels]
Comment on lines +70 to 73
Member

left and right are not computing anything. Is there a better name for left and right such that the comment is not required?

# Update min-reachability, given the new mutual-reachability of all
# nodes from the current node.
Comment on lines +74 to +75
Member

Also, would a better name for left and right remove the need for this comment?

min_reachability = np.minimum(left, right)

# Find node with minimum-reachabiltiy
# Note we perform index-remapping via `current_labels` since it is a
# sub-selection and hence not 1-1
new_node_index = np.argmin(min_reachability)
new_node = current_labels[new_node_index]

mst[i].current_node = current_node
mst[i].next_node = new_node
mst[i].distance = min_reachability[new_node_index]
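
For readers following the review thread, here is a minimal pure-NumPy sketch of the loop in this hunk, with more descriptive names substituted for `label_filter`, `left`, and `right`. The names `not_in_tree`, `candidate_nodes`, `prev_min_reachability`, and `reachability_from_current` are hypothetical and are not proposed anywhere in the PR. The function name is also made up, edges are returned as plain tuples rather than `MST_edge_t` records, and the final `current_node = new_node` update is assumed because it falls outside the lines shown in the diff. The sketch uses `np.inf` instead of the `np.infty` alias.

```python
import numpy as np


def mst_from_mutual_reachability_sketch(mutual_reachability):
    """Illustrative Prim-style loop over a dense mutual-reachability matrix.

    Not the scikit-learn implementation: a readable re-statement of the hunk
    above with hypothetical variable names.
    """
    mutual_reachability = np.asarray(mutual_reachability, dtype=np.float64)
    n_samples = mutual_reachability.shape[0]

    edges = []  # (current_node, next_node, distance) tuples
    candidate_nodes = np.arange(n_samples, dtype=np.int64)  # `current_labels` in the diff
    current_node = 0
    # Smallest known reachability from each candidate node to the tree so far.
    min_reachability = np.full(n_samples, np.inf, dtype=np.float64)

    for _ in range(n_samples - 1):
        # Boolean mask that drops the node just added to the tree
        # (hypothetical replacement for `label_filter`).
        not_in_tree = candidate_nodes != current_node
        candidate_nodes = candidate_nodes[not_in_tree]

        # Best scores carried over from earlier iterations (`left` in the diff).
        prev_min_reachability = min_reachability[not_in_tree]
        # Mutual reachability from the newest tree node (`right` in the diff).
        reachability_from_current = mutual_reachability[current_node][candidate_nodes]

        min_reachability = np.minimum(prev_min_reachability, reachability_from_current)

        # `new_node_index` is a position in the shrunken arrays, so it is
        # mapped back to an original node id through `candidate_nodes`.
        new_node_index = np.argmin(min_reachability)
        new_node = candidate_nodes[new_node_index]

        # As in the diff, the most recently added node is recorded as the source.
        edges.append(
            (int(current_node), int(new_node), float(min_reachability[new_node_index]))
        )
        current_node = new_node  # assumed: this line is outside the shown hunk

    return edges
```

With names like these, the three inline comments questioned by the reviewer mostly restate the variable names, which is the effect the review comments appear to be asking for.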
@@ -228,7 +240,6 @@ cpdef cnp.ndarray[cnp.float64_t, ndim=2, mode='c'] make_single_linkage(const MST

current_node_cluster = U.fast_find(current_node)
next_node_cluster = U.fast_find(next_node)

# TODO: Update this to an array of structs (AoS).
# Should be done simultaneously in _tree.pyx to ensure compatability.
single_linkage[i][0] = <cnp.float64_t> current_node_cluster