Disappearing coordinate attributes in model dataset #499

@OnnoEbbens

Description

I have a dataset with a time coordinate that has the attributes time_units and start. When I add a DataArray or update the dataset, these attributes are removed from the dataset. This causes problems later on, because I need these attribute values when creating the 'tdis'.

TLDR: When performing operations on DataArrays there is a risk of losing the attributes. It would be nice (but also pretty annoying) if we try to implement all operations in nlmod in such a way that we do not lose the attributes. Or even better, wait for this issue: pydata/xarray#2245 to be solved.

I ran into this issue a couple of times before so I tried to make an overview of when this happens (and hopefully how to control it).

I used the 'basic_model' example from the documentation for this and show the effect of 3 operations:

  1. Adding the heads to the dataset makes me lose the 'name' and 'description' attributes.
  2. Using xr.where to create a DataArray without a time dimension does not change the attributes.
  3. Using xr.where to create a DataArray with a time dimension removes the attributes completely.

import nlmod
import xarray as xr

# ds is the model dataset from the 'basic_model' example
print(ds.time.attrs)
# 1: add the heads
ds["head"] = nlmod.gwf.output.get_heads_da(ds)
print(ds.time.attrs)
# 2: xr.where without a time dimension
ds["strt_mv"] = xr.where(ds["top"] < ds["starting_head"], 0, ds["top"] - ds["starting_head"])
print(ds.time.attrs)
# 3: xr.where with a time dimension
ds["head_mv"] = xr.where(ds["top"] < ds["head"], 0, ds["top"] - ds["head"])
print(ds.time.attrs)

yields:

>>> {'name': 'Time', 'description': 'End time of the stress period', 'time_units': 'DAYS', 'start': '2015-01-01 00:00:00'}
>>> {'time_units': 'DAYS', 'start': '2015-01-01 00:00:00'}
>>> {'time_units': 'DAYS', 'start': '2015-01-01 00:00:00'}
>>> {}
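
To build this kind of overview more systematically, a small helper can snapshot the coordinate attributes before an operation and report which keys disappeared afterwards. This is only a sketch: the names snapshot_coord_attrs, lost_coord_attrs and ds_demo are made up for illustration and are not part of nlmod or xarray, and the attribute loss is simulated with assign_coords on a synthetic dataset so the example does not depend on a model run.

```python
import numpy as np
import xarray as xr


def snapshot_coord_attrs(ds):
    """Return a copy of the attrs of every coordinate in ds (hypothetical helper)."""
    return {name: dict(coord.attrs) for name, coord in ds.coords.items()}


def lost_coord_attrs(before, after):
    """Map coordinate name to the sorted attribute keys that are in before but not in after."""
    lost = {}
    for name, attrs in before.items():
        missing = sorted(set(attrs) - set(after.get(name, {})))
        if missing:
            lost[name] = missing
    return lost


# Synthetic stand-in for the model dataset (assumption: not the real nlmod dataset)
time_attrs = {"time_units": "DAYS", "start": "2015-01-01 00:00:00"}
ds_demo = xr.Dataset(
    {"top": ("x", np.array([1.0, 2.0, 3.0]))},
    coords={"time": ("time", np.array([1.0, 2.0]), time_attrs)},
)

before = snapshot_coord_attrs(ds_demo)
# Simulate an operation that drops the 'start' attribute of the time coordinate
ds_after = ds_demo.assign_coords(time=("time", np.array([1.0, 2.0]), {"time_units": "DAYS"}))
after = snapshot_coord_attrs(ds_after)

print(lost_coord_attrs(before, after))
```

Calling snapshot_coord_attrs before and after each of the three steps above would show at which step an attribute is dropped.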

According to the rioxarray docs you can use xarray.set_options(keep_attrs=True) to preserve the attributes.

There have also been some discussions on the xarray GitHub about the keep_attrs=True setting:

It is interesting to see in this discussion why the CRS is added as a coordinate and not as an attribute.

For me setting keep_attrs=True did not work:

import nlmod
import xarray as xr
print(ds.time.attrs)
with xr.set_options(keep_attrs=True):
    ds["head"] = nlmod.gwf.output.get_heads_da(ds)
    print(ds.time.attrs)
    ds["strt_mv"] = xr.where(ds["top"] < ds["starting_head"], 0, ds["top"] - ds["starting_head"])
    print(ds.time.attrs)
    ds["head_mv"] = xr.where(ds["top"] < ds["head"], 0, ds["top"] - ds["head"])
    print(ds.time.attrs)

yields the same results:

>>> {'name': 'Time', 'description': 'End time of the stress period', 'time_units': 'DAYS', 'start': '2015-01-01 00:00:00'}
>>> {'time_units': 'DAYS', 'start': '2015-01-01 00:00:00'}
>>> {'time_units': 'DAYS', 'start': '2015-01-01 00:00:00'}
>>> {}

Finally I found a solution. With xr.set_options(keep_attrs=True), xr.where preserves the attributes of its second argument (0 in this case, which has no attributes). If I swap the order so that the DataArray is the second argument, I do get to keep the attributes.

import nlmod
import xarray as xr

print(ds.time.attrs)
with xr.set_options(keep_attrs=True):
    ds["head"] = nlmod.gwf.output.get_heads_da(ds)
    print(ds.time.attrs)
    ds["strt_mv"] = xr.where(ds["top"] < ds["starting_head"], 0, ds["top"] - ds["starting_head"])
    print(ds.time.attrs)
    ds["head_mv"] = xr.where(ds["top"] > ds["head"], ds["top"] - ds["head"], 0)
    print(ds.time.attrs)

yields:

>>> {'name': 'Time', 'description': 'End time of the stress period', 'time_units': 'DAYS', 'start': '2015-01-01 00:00:00'}
>>> {'time_units': 'DAYS', 'start': '2015-01-01 00:00:00'}
>>> {'time_units': 'DAYS', 'start': '2015-01-01 00:00:00'}
>>> {'time_units': 'DAYS', 'start': '2015-01-01 00:00:00'}
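
The effect of the argument order can be checked on a small synthetic example without a model. This is a sketch under assumptions: top, head and depth are made-up arrays, the 'units' attribute is set by hand, and keep_attrs is passed directly to xr.where, which needs a reasonably recent xarray. It only checks the attributes of the resulting DataArray itself; what happens to coordinate attributes has differed between xarray releases, so that is worth testing on your own version.

```python
import numpy as np
import xarray as xr

# Synthetic data (assumption: stand-ins for ds["top"] and ds["head"])
top = xr.DataArray(np.array([3.0, 1.0, 2.0]), dims="x", attrs={"units": "m"})
head = xr.DataArray(np.array([1.0, 2.0, np.nan]), dims="x")

# Set the attributes of the difference explicitly so the example does not
# depend on how arithmetic handles attrs in the installed xarray version
depth = (top - head).assign_attrs(units="m")

# Scalar 0 is the second argument (x), the DataArray is the third (y)
a = xr.where(top < head, 0, depth, keep_attrs=True)

# Swapped: the DataArray is now the second argument (x)
b = xr.where(top > head, depth, 0, keep_attrs=True)

print(b.attrs)
print(a.values, b.values)
```

One thing to keep in mind when swapping: the two forms are not identical where head is NaN. A comparison with NaN is False, so the first form returns top - head (NaN) there, while the swapped form returns 0.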

For this case I found a solution, but I can imagine there are many more cases where this can occur. It would be nice (but also pretty annoying) if we try to implement all operations in nlmod in such a way that we do not lose the attributes of the coordinates.
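
One possible way to do this without touching every operation is to save the coordinate attributes up front and put back whatever went missing afterwards. The sketch below is an assumption about how that could look, not existing nlmod functionality: preserve_coord_attrs and ds_demo are hypothetical names, and it only works when the dataset is modified in place (as with ds["name"] = ...).

```python
import contextlib

import numpy as np
import xarray as xr


@contextlib.contextmanager
def preserve_coord_attrs(ds):
    """Restore coordinate attrs that were lost while ds was modified in place.

    Hypothetical helper: attribute keys that still exist afterwards are left alone.
    """
    saved = {name: dict(coord.attrs) for name, coord in ds.coords.items()}
    try:
        yield ds
    finally:
        for name, attrs in saved.items():
            if name in ds.coords:
                for key, value in attrs.items():
                    # Only fill in keys that disappeared
                    ds[name].attrs.setdefault(key, value)


# Synthetic stand-in for the model dataset (assumption)
time_attrs = {"time_units": "DAYS", "start": "2015-01-01 00:00:00"}
ds_demo = xr.Dataset(
    {
        "top": ("x", np.array([3.0, 1.0])),
        "head": (("time", "x"), np.array([[1.0, 2.0], [2.5, 0.5]])),
    },
    coords={"time": ("time", np.array([1.0, 2.0]), dict(time_attrs))},
)

with preserve_coord_attrs(ds_demo):
    ds_demo["head_mv"] = xr.where(
        ds_demo["top"] < ds_demo["head"], 0, ds_demo["top"] - ds_demo["head"]
    )

print(ds_demo["time"].attrs)
```

The advantage is that it does not depend on the argument order of xr.where or on the keep_attrs setting. The downside is that it hides the loss instead of preventing it, and it would not notice an attribute whose value was changed rather than removed.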

Maybe even better would be to preserve the attributes of the original dataset when adding a new DataArray. There is an open issue on the xarray github for this:
pydata/xarray#2245
