Deprecate copy in Birch #29092

Closed
@jeremiedbb

Description

`Birch` doesn't perform in-place operations (at least not on the input array), so the `copy` parameter is useless and should be deprecated. It's even detrimental, because by default it makes an unnecessary copy of the input.
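For illustration, one common way to deprecate a parameter like this is a sentinel default plus a `FutureWarning` when the caller sets it explicitly. This is only a sketch of that idea, not the actual change: `ToyClusterer`, its `"deprecated"` sentinel and the warning text are hypothetical names made up for the example.

```python
import warnings


class ToyClusterer:
    """Hypothetical estimator, used only to illustrate deprecating a parameter."""

    def __init__(self, threshold=0.5, copy="deprecated"):
        self.threshold = threshold
        self.copy = copy  # sentinel default means "not set by the caller"

    def fit(self, X):
        # Warn only when the caller passed the parameter explicitly.
        if self.copy != "deprecated":
            warnings.warn(
                "`copy` has no effect and is deprecated; it will be removed "
                "in a future release.",
                FutureWarning,
            )
        # The input is used as-is: no copy is made either way.
        self.n_samples_seen_ = len(X)
        return self


# Record warnings for a default call and for an explicit `copy=False` call.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ToyClusterer().fit([[0.0], [1.0]])
    ToyClusterer(copy=False).fit([[0.0], [1.0]])

warning_categories = [w.category for w in caught]
```

Only the explicit `copy=False` call warns, so existing code that never touched the parameter is unaffected.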

The only place where an in-place operation happens is in the `update` method of `_CFSubcluster`:

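```python
def update(self, subcluster):
    # In-place accumulation of the other subcluster's statistics.
    self.n_samples_ += subcluster.n_samples_
    self.linear_sum_ += subcluster.linear_sum_
    self.squared_sum_ += subcluster.squared_sum_
    # These two are rebound to newly computed values, not modified in place.
    self.centroid_ = self.linear_sum_ / self.n_samples_
    self.sq_norm_ = np.dot(self.centroid_, self.centroid_)
```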

However, `update` is called in only two places. The first is in the `_split_node` function, but there we first create two new `_CFSubcluster` objects, so `update` performs its in-place operations on newly created data and the input data is not modified. The second is in the `insert_cf_subcluster` method of `_CFNode`, but it is only triggered if the subcluster has a child, which can only come from split subclusters (i.e. after `_split_node`), so again we're not modifying the input data.
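The first case can be sketched with a toy class. This is not scikit-learn's code: `ToySubcluster` is a hypothetical, simplified stand-in that reuses the `update` body quoted above, and it assumes that a subcluster built from a single sample keeps a reference to that input row, while an empty one holds only scalar zeros.

```python
import numpy as np


class ToySubcluster:
    """Hypothetical, simplified stand-in for _CFSubcluster (illustration only)."""

    def __init__(self, linear_sum=None):
        if linear_sum is None:
            # Empty subcluster: scalar zeros, no reference to any input array.
            self.n_samples_ = 0
            self.linear_sum_ = 0
            self.squared_sum_ = 0.0
        else:
            # Assumption: a one-sample subcluster keeps a reference to the row.
            self.n_samples_ = 1
            self.linear_sum_ = linear_sum
            self.squared_sum_ = float(np.dot(linear_sum, linear_sum))

    def update(self, subcluster):
        # Same operators as the update method quoted above.
        self.n_samples_ += subcluster.n_samples_
        self.linear_sum_ += subcluster.linear_sum_
        self.squared_sum_ += subcluster.squared_sum_
        self.centroid_ = self.linear_sum_ / self.n_samples_
        self.sq_norm_ = np.dot(self.centroid_, self.centroid_)


X = np.array([[1.0, 2.0], [3.0, 4.0]])
X_before = X.copy()

# One subcluster per input row; each refers to a row view of X.
samples = [ToySubcluster(linear_sum=row) for row in X]

# A freshly created, empty subcluster absorbs the samples. The first `+=`
# turns the scalar 0 into a new array; later ones modify only that new array.
fresh = ToySubcluster()
for sample in samples:
    fresh.update(sample)

input_untouched = np.array_equal(X, X_before)
shares_memory = np.shares_memory(fresh.linear_sum_, X)
```

In this sketch `input_untouched` is `True` and `shares_memory` is `False`: the in-place additions only ever touch memory owned by the newly created subcluster.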
