Conversation

@Mikejmnez (Collaborator) commented Jun 2, 2025

This PR addresses a performance issue when accessing OSCAR ocean velocity data via xarray + pydap. In that dataset the dimensions are named `latitude` and `longitude`, while the coordinate variables `lat` and `lon` hold the actual values (i.e., they are the variables one can request DAP responses for). As a result, xarray requests `lat` and `lon` rather than `latitude` and `longitude`, and the number of requests grows as 2*N, where N is the number of URLs.
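The naming mismatch can be reproduced with a small in-memory dataset (a hypothetical stand-in for OSCAR's layout, with made-up shapes and values): the dimensions are called `latitude`/`longitude`, but the coordinate variables that carry values are `lat`/`lon`.

```python
import numpy as np
import xarray as xr

# Dimensions are named "latitude"/"longitude", but the coordinate
# variables that actually hold values are "lat"/"lon" -- mirroring
# the naming split described above (sizes here are arbitrary).
ds = xr.Dataset(
    data_vars={"u": (("time", "latitude", "longitude"), np.zeros((1, 3, 4)))},
    coords={
        "lat": ("latitude", np.linspace(-60.0, 60.0, 3)),
        "lon": ("longitude", np.linspace(0.0, 270.0, 4)),
    },
)

# A client walking the coordinates issues requests for "lat"/"lon",
# not for the value-less "latitude"/"longitude" dimensions.
print(sorted(ds.dims))    # ['latitude', 'longitude', 'time']
print(sorted(ds.coords))  # ['lat', 'lon']
```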

This PR therefore enables caching of `lat`, `lon`, or any other similar Map when `set_maps` is set to `True` (`False` by default).
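The effect of the caching can be sketched with plain Python (a toy model, not pydap's actual implementation): without a cache, each of the N URLs triggers its own `lat` and `lon` downloads (2*N requests); with a cache shared across URLs and keyed by variable name, each Map is fetched only once.

```python
# Toy model of the set_maps=True optimization: share one download of
# each coordinate Map across N granule URLs instead of re-fetching.
request_count = 0

def fetch_map(url, name):
    """Pretend to issue one DAP request for a coordinate variable."""
    global request_count
    request_count += 1
    return [0.0, 1.0, 2.0]  # placeholder values

def open_all(urls, cache=None):
    datasets = []
    for url in urls:
        coords = {}
        for name in ("lat", "lon"):
            if cache is not None and name in cache:
                coords[name] = cache[name]  # reuse the cached Map
            else:
                coords[name] = fetch_map(url, name)
                if cache is not None:
                    cache[name] = coords[name]
        datasets.append(coords)
    return datasets

urls = [f"granule_{i}" for i in range(100)]

request_count = 0
open_all(urls)            # no cache: 2 * N requests
uncached = request_count

request_count = 0
open_all(urls, cache={})  # shared cache: 2 requests total
cached = request_count

print(uncached, cached)   # 200 2
```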

Example

```python
urls = [
    "dap4://opendap.earthdata.nasa.gov/collections/C2098858642-POCLOUD/granules/oscar_currents_final_20190816",
    "dap4://opendap.earthdata.nasa.gov/collections/C2098858642-POCLOUD/granules/oscar_currents_final_20190823",
    ...
]
```

```python
%%time
consolidate_metadata(urls, concat_dim="time", safe_mode=False, set_maps=True, session=my_session)
```

For 100 URLs, `consolidate_metadata` takes ~40 seconds. After that, the following step takes ~1 second:

[Screenshot (2025-06-02): the timed step completing in ~1 second]

Notes

  1. Note that the coordinates (i.e., Maps) and the dimensions do not match spatially.
  2. It is possible to bypass `consolidate_metadata` and create the xarray dataset directly; however, that is ~10x slower.
  3. More testing will be needed, but this works for my purpose.

@Mikejmnez Mikejmnez merged commit 0e182e6 into main Jun 2, 2025
9 checks passed
@Mikejmnez Mikejmnez deleted the named_dim_consolidate branch June 12, 2025 21:42


Development

Successfully merging this pull request may close these issues.

Enable consolidate_metadata check for named dimensions
