
Conversation

@batcity batcity commented Oct 17, 2025

  • Closes #11643
  • Tests added / passed

Tests for the bags module pass.

Tests for the utils module pass.

  • Passes pre-commit run --all-files

How did I verify that the fix works:

I recreated the test mentioned in the issue above, here's the code:

if __name__ == "__main__":
    import dask.bag as db
    import numpy as np
    from dask import delayed
    from scipy.sparse import csr_array


    def add(x, y):
        return x + y


    @delayed
    def create_sparse_array_delayed():
        return csr_array(np.random.random((10, 10)))


    @delayed
    def create_array_delayed():
        return np.random.random((10, 10))


    # Works with sparse arrays when the bag is created from a sequence.
    db.from_sequence(
        [csr_array(np.random.random((10, 10))), csr_array(np.random.random((10, 10)))]
    ).fold(add).compute()
    # Works with NumPy arrays created through delayed.
    db.from_delayed([create_array_delayed(), create_array_delayed()]).fold(add).compute()
    # Sparse arrays created through delayed: the case reported in the issue.
    print(
        db.from_delayed(
            [create_sparse_array_delayed(), create_sparse_array_delayed()]
        ).fold(add).compute()
    )

This now returns the result instead of triggering the bug.

Contributor

github-actions bot commented Oct 17, 2025

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

      9 files  ±  0        9 suites  ±0   3h 15m 55s ⏱️ + 4m 49s
 18 141 tests + 15   16 926 ✅ + 15   1 215 💤 ±0  0 ❌ ±0 
162 497 runs  +135  150 419 ✅ +135  12 078 💤 ±0  0 ❌ ±0 

Results for commit 24e0a92. ± Comparison against base commit b7ba831.

♻️ This comment has been updated with latest results.

jacobtomlinson previously approved these changes Oct 23, 2025
Member

@jacobtomlinson jacobtomlinson left a comment

Given that our CI environment has scipy installed, could you add some tests that use actual sparse arrays rather than mocking everything?

@jacobtomlinson jacobtomlinson dismissed their stale review October 23, 2025 11:26

Accidentally clicked approve

Member

@jacobtomlinson jacobtomlinson left a comment

This seems fine to me, but I'd appreciate an additional review from another maintainer before merging this.

cc @TomAugspurger @jrbourbeau @quasiben if you're around

Author

batcity commented Oct 23, 2025

The latest test failures are unrelated to this PR, by the way.

dask/utils.py Outdated
Comment on lines 2361 to 2365
    # Sparse-like objects
    if hasattr(obj, "nnz"):
        return obj.nnz == 0
    if hasattr(obj, "shape"):
        return 0 in obj.shape
Member

I dislike that these can fail with an exception. We don't know that just because an object has .nnz that it's comparable to an int, and we don't know that obj.shape is a sequence. I'd feel better if these also catch exceptions (probably broad ones) so that we get through to the fallback.
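
A minimal sketch of this concern, not the code in dask/utils.py. The function name `is_empty_guarded`, the classes `WeirdNnz` and `ScalarShape`, and the `len()` fallback are all hypothetical; they only illustrate how an object can expose `.nnz` or `.shape` without behaving like a sparse or array type, and how guarding each probe lets control reach a fallback:

```python
def is_empty_guarded(obj):
    """Hypothetical emptiness check whose duck-typed probes never raise."""
    # Sparse-like objects: .nnz may exist without being comparable to an int.
    if hasattr(obj, "nnz"):
        try:
            return bool(obj.nnz == 0)
        except Exception:
            pass
    # Array-like objects: .shape may exist without being a sequence.
    if hasattr(obj, "shape"):
        try:
            return 0 in obj.shape
        except Exception:
            pass
    # Assumed fallback (not taken from the PR): a plain length check.
    try:
        return len(obj) == 0
    except Exception:
        return False


class _NoCompare:
    def __eq__(self, other):
        raise TypeError("not comparable")


class WeirdNnz:
    """Has .nnz, but comparing it to an int raises."""
    nnz = _NoCompare()

    def __len__(self):
        return 0


class ScalarShape:
    """Has .shape, but it is an int rather than a tuple."""
    shape = 3

    def __len__(self):
        return 3


# Both objects fall through the guarded probes to the length check.
weird_result = is_empty_guarded(WeirdNnz())
scalar_result = is_empty_guarded(ScalarShape())
```

Without the `try` blocks, both hypothetical objects would raise `TypeError` before any fallback could run.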

Author

Makes sense. Fixed in commit b3d818d.

Member

@TomAugspurger TomAugspurger left a comment

I'm always looking for ways to reduce the amount of guessing / duck typing we do, but I'm not sure if there's a way around that here. Just one comment about a couple of checks in is_empty that might fail. Otherwise I think this is fine.

Comment on lines +2362 to +2366
    if hasattr(obj, "nnz"):
        try:
            return obj.nnz == 0
        except Exception:
            pass
Member

I wonder if contextlib.suppress would be a more readable solution for these? Or is that too magic? @TomAugspurger

Suggested change
-    if hasattr(obj, "nnz"):
-        try:
-            return obj.nnz == 0
-        except Exception:
-            pass
+    with contextlib.suppress(Exception):
+        return obj.nnz == 0

Author

@batcity batcity Oct 28, 2025

+1, this does seem cleaner, since we're only suppressing the exception rather than trying to recover from it.
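
For reference, a small sketch of how the `contextlib.suppress` form behaves. The helper name `nnz_is_zero_or_none` and the `None` fallback are hypothetical stand-ins, not the code merged in this PR. A `return` inside the `with` block exits normally when nothing raises; if the attribute lookup or the comparison raises, the exception is suppressed and execution continues after the block, so no separate `hasattr` check is needed:

```python
import contextlib


def nnz_is_zero_or_none(obj):
    # AttributeError (no .nnz) and comparison failures are both suppressed.
    with contextlib.suppress(Exception):
        return obj.nnz == 0
    # Reached only when the block above raised; stands in for later checks.
    return None


class HasNnz:
    def __init__(self, nnz):
        self.nnz = nnz


empty_result = nnz_is_zero_or_none(HasNnz(0))     # comparison succeeds
nonempty_result = nnz_is_zero_or_none(HasNnz(5))  # comparison succeeds
missing_result = nnz_is_zero_or_none(object())    # AttributeError suppressed
```

One trade-off of the broad `Exception` is that it also hides unrelated bugs raised inside the block, which is why the block is kept to a single expression here.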

3 participants