Issue #1436 work around dask issue #1450
Conversation
@@ -91,11 +91,14 @@ def generate_index_array(self):
        active = self.dataset["active"]

        isactive = area.where(active).notnull()
        svat = xr.full_like(area, fill_value=0, dtype=np.int64).rename("svat")
I don't understand how this fixes the issue.
I looked at the GitHub issue you linked to. There the problematic statement seems to be a[mask] = a[mask]. In this case, is that svat.data[isactive.data] = np.arange(1, index.sum() + 1)? It seems the bug is acknowledged there. Wouldn't it be better to pin dask to an older version until this issue is resolved?
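To make the comparison concrete, here is a minimal numpy-only sketch of the two patterns being compared. The shapes and data are made up and the variable names are only borrowed from the diff; this is not the actual project code.

import numpy as np

# The pattern from the linked dask issue, as I understand it:
a = np.zeros((3, 3))
mask = np.zeros((3, 3), dtype=bool)
mask[0, :2] = True
a[mask] = a[mask]  # assign the masked selection back onto itself

# The pattern used in this change:
svat = np.zeros((3, 3), dtype=np.int64)
isactive = mask
svat[isactive] = np.arange(1, isactive.sum() + 1)  # number the selected cells 1..n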
Dask has slightly different behavior here than numpy. I think we are treading into the more obscure features of numpy, and I'm not certain whether they are in scope for Dask or not.
From what I read, the dask developer was surprised that a[mask] = b worked in the first place, which is what we do here. Furthermore, it seems 2025.1.0 also shows some weird behavior with masks. I therefore think it is safer to load these arrays into memory (turning them into regular numpy arrays) for now. That is less hassle than pinning to older versions, as that entails patching the conda-forge repodata-patches feedstock, etc.
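As a rough sketch of what loading into memory first could look like, reusing the variable names from the diff (the surrounding setup here is invented, not the project code):

import numpy as np
import xarray as xr

# Hypothetical dask-backed grids standing in for the real ones; only the .load() step matters.
area = xr.DataArray(np.ones((3, 3)), dims=("y", "x")).chunk({"y": 3, "x": 3})
active = xr.DataArray(np.ones((3, 3), dtype=bool), dims=("y", "x")).chunk({"y": 3, "x": 3})

isactive = area.where(active).notnull().load()  # now backed by a plain numpy array
svat = xr.full_like(area, fill_value=0, dtype=np.int64).load().rename("svat")

# Boolean-mask assignment is well defined on the underlying numpy arrays.
svat.data[isactive.data] = np.arange(1, int(isactive.sum()) + 1)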
Just looked further into this: I think our issue is unrelated to the Dask issue, so what I did never worked with dask in the first place.
import numpy as np

a = np.zeros((3, 3))
mask = np.array(
    [
        [True, True, False],
        [False, False, True],
        [False, True, False],
    ]
)
n_count = np.sum(mask)
a[mask] = np.arange(1, n_count + 1)
print(a)  # With numpy it works!

# %%
import dask.array

da = dask.array.from_array(a, chunks=(3, 3))
dmask = dask.array.from_array(mask, chunks=(3, 3))
da[dmask] = np.arange(1, n_count + 1)  # Error
This throws an error:
ValueError: Boolean index assignment in Dask expects equally shaped arrays.
Example: da1[da2] = da3 where da1.shape == (4,), da2.shape == (4,) and da3.shape == (4,).
Alternatively, you can use the extended API that supports indexing with tuples.
Example: da1[(da2,)] = da3.
Therefore loading these specific grids into memory is an acceptable, and the quickest, solution.
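For completeness, a small continuation of the snippet above showing the in-memory route, which is essentially what this PR does at the xarray level (sketch only):

# Materialise the dask array first, then assign with the boolean mask.
da_np = da.compute()                     # back to a plain numpy array
da_np[mask] = np.arange(1, n_count + 1)  # works, same as the pure-numpy case above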
Fixes #1436
Description
Work around the Dask issue by forcing a load of idomain and svat into memory. These grids will never be as large as the boundary condition grids with a time axis, so I think that is acceptable; at least for the LHM it is. I couldn't get it to work with a where: numpy only accepts a 1D array in a boolean-indexed assignment on a 3D array as long as the 1D array has the length of one of the dimensions, in which case it does what we want to do here. However, when this is not the case, which is 95% of the time, numpy throws an error. I therefore had to resort to loading the arrays into memory.
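One way to see that behaviour in plain numpy, with hypothetical shapes rather than the model's actual grids: a 1D value only broadcasts against the boolean-indexed selection when its length matches the trailing dimension, otherwise the assignment raises.

import numpy as np

a = np.zeros((2, 3, 4))
mask2d = np.ones((2, 3), dtype=bool)  # boolean index over the first two dimensions
a[mask2d] = np.arange(4)              # fine: length 4 broadcasts against the last dimension
a[mask2d] = np.arange(5)              # ValueError: cannot be broadcast to the indexing result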
Furthermore, I refactored test_coupler_mapping.py to reduce some code duplication and to add a test case with a dask array. For this I had to make the well package used consistent (from 2 wells to 3 wells, where the second well now lies on y=2.0 instead of y=1.0), so I updated some of the assertions.
Checklist
Issue #nr, e.g. Issue #737