@ilumsden ilumsden commented Jun 6, 2025

Follow up to #140.

This PR makes the last few minor changes needed to support NumPy >= 2.0. It also updates requirements.txt and setup.py to remove the strict NumPy version limit of < 2.0.

@ilumsden ilumsden self-assigned this Jun 6, 2025
@ilumsden ilumsden added the area-ci, area-deployment, priority-normal, status-ready-for-review, type-internal-cleanup, and status-work-in-progress labels and removed the status-ready-for-review label Jun 6, 2025

ilumsden commented Jun 6, 2025

This seems to be working, with the exception of GraphFrame.to_hdf in Python 3.9. I'm not sure why that specific CI run is failing.

ilumsden commented Jun 6, 2025

I've figured out the issue here.

The issue starts with PyTables, the library Pandas uses under the hood for HDF support (and which we indirectly use for the GraphFrame.to_hdf and GraphFrame.from_hdf methods). PyTables did not add NumPy 2.0 support until version 3.10, which coincidentally only supports Python 3.10 or higher. So, when running with Python 3.9 (where the CI error occurs), we are forced to run with a version of PyTables that does not support NumPy 2.0, and, as a result, we end up with an error.

I think this raises a larger question about whether we want to continue supporting GraphFrame.to_hdf and GraphFrame.from_hdf at all. Are we even using them, especially with Thicket?

ilumsden commented Jun 6, 2025

After discussing with @michaelmckinsey1, we decided to do the following:

  1. Officially deprecate the HDF reader/writer (since no one uses it and it is not supported in Thicket)
  2. Catch errors from the reader/writer and, if needed, add context to warn that the issue is likely due to version mismatches between NumPy, Pandas, and/or PyTables
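
The two steps above could be sketched roughly as follows. This is an illustration only, not the code merged in this PR: the decorator name `deprecated_hdf_io`, the choice of `RuntimeError`, and the message wording are all assumptions.

```python
import functools
import warnings


def deprecated_hdf_io(func):
    """Hypothetical decorator for an HDF reader/writer method.

    Step 1: emit a DeprecationWarning on every call.
    Step 2: re-raise any failure with a hint about version mismatches.
    """

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{func.__name__} is deprecated and may be removed in a future release.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not this wrapper
        )
        try:
            return func(*args, **kwargs)
        except Exception as err:
            # Keep the original exception as __cause__ so no detail is lost.
            raise RuntimeError(
                f"{func.__name__} failed. This is likely due to a version "
                "mismatch between NumPy, Pandas, and/or PyTables."
            ) from err

    return wrapper
```

Chaining with `raise ... from err` keeps the underlying PyTables or Pandas traceback visible, so the added context supplements the original error rather than hiding it.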

@michaelmckinsey1 michaelmckinsey1 left a comment

LGTM

@slabasan slabasan merged commit 2c23ad2 into LLNL:develop Jun 6, 2025
5 checks passed

Labels

area-ci: Issues and PRs related to continuous integration processes used by Hatchet developers
area-deployment: Issues and PRs involving Hatchet's packaging and deployment
priority-normal: Normal priority issues and PRs
status-work-in-progress: PR is currently being worked on
type-internal-cleanup: PR or issues related to the structure of the codebase, directories and refactors

3 participants