Add Additional Metadata and Data Model Conversions #201
@eengl @AdamSchnapp This is now ready for review.
Add support for additional coordinate and metadata information in xarray dataset. Add optional data_model argument to translate coordinates and attributes to defined data model specification.
I will likely move this file into tables/
# Data extracted from the markdown tables
# taken from CF conventions standard names table
# https://cfconventions.org/Data/cf-standard-names/current/build/cf-standard-name-table.html
data = [
I will likely reformat this into a dictionary where the key is the shortName and the value is a dict with keys:
cf_standard_name
cf_cell_method
So for example,
data = {
    'ABSV': {'cf_standard_name': 'atmosphere_absolute_vorticity', 'cf_cell_method': None},
}
Hmmm...wait...I see why you formatted as such...to get into a DataFrame.
That is correct. I don't think we need to run this through pandas. I think a dict lookup would suffice if that is preferred.
If anything, it would be consistent with the rest of the tables. Do you foresee the value expanding beyond just cf_standard_name and cf_cell_method?
No, I do not see that expanding. cf_standard_name is the only required value, while cf_cell_method becomes relevant for certain elements (e.g. max temperature). I do not believe any additional fields would be relevant.
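A minimal sketch of the dict lookup discussed in this thread, assuming a module-level table keyed by shortName. The ABSV entry is the one quoted above; the TMAX entry and the names CF_TABLE and lookup_cf are illustrative assumptions, not code from this PR.

```python
from typing import Dict, Optional, TypedDict


class CFEntry(TypedDict):
    cf_standard_name: str            # required for every element
    cf_cell_method: Optional[str]    # only set where it is relevant


# Keyed by GRIB2 shortName. ABSV is taken from the example in this thread;
# TMAX is a hypothetical entry showing where a cell method would matter.
CF_TABLE: Dict[str, CFEntry] = {
    "ABSV": {"cf_standard_name": "atmosphere_absolute_vorticity", "cf_cell_method": None},
    "TMAX": {"cf_standard_name": "air_temperature", "cf_cell_method": "maximum"},
}


def lookup_cf(short_name: str) -> Optional[CFEntry]:
    """Return the CF entry for a shortName, or None if it is not in the table."""
    return CF_TABLE.get(short_name)


entry = lookup_cf("ABSV")
if entry is not None:
    print(entry["cf_standard_name"])
```

A plain dict keeps the lookup free of a pandas dependency, at the cost of being formatted differently from tables that are loaded into a DataFrame.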
Merged commit f9af967 into NOAA-MDL:200-duplicate-elements-with-open_datatree
* Fix for duplicate variables in xarray DataTree
  Added logic to make sure that GRIB2 metadata is of type int, not Grib2Message, when adding these as separate columns to the pandas df for datatree processing. Added entries for 195, 196, 197 in level table for datatree. These are custom levels used by AWC (and thus NBM). [skip ci]
* Significant Update for xarray_backend.py
  This commit allows the backend to provide a common dimension name for GRIB2 messages where more than 1 attribute would provide the dimensionality. For example, probability messages (pdtn 5 or 9) have the following attrs used as dimensions: thresholdLowerLimit, thresholdUpperLimit. Both are valid for 1 message, so using each as its own dimension would add an extra dimension. So here we introduce a "threshold" dimension (and coordinate), and thresholdLowerLimit and thresholdUpperLimit serve as coordinates, dimensioned by "threshold". Also, this commit adds the ability to add another node for Xarray DataTrees by the typeOfProbability attribute, providing node name "prob_<typeOfProbability>".
* Update xarray_backend.py
  Fixed bug from previous commit where for mfdatasets the dimension ordering was wrong when creating the DataArray. The fix is reversing ordered_meta in make_variables().
* Update xarray_backend.py
  Cleaned up *Dim class naming for DimCube.
* Update xarray_backend.py
  For *Dim classes, using typing.Tuple[] instead of tuple[]. This maintains compatibility with Python 3.8.
* Update for xarray_backend.py
  This commit comments out the call to create_dataset_from_df() and instead calls try_process_by_variables(). This leads to variables for a level/pdtn tree getting their own var_<shortName> DataTree group. The next step here is to collect DataArrays where they have the same level value. For example, collect temps where 2m height above ground and winds where 10m height above ground. With these changes, all GRIB2 messages are resolved when reading a NBM Core F001 CONUS GRIB2 file. The multitude of diagnostic print statements are still present.
* additions to pr:200 duplicate elements with open datatree (#203)
  * this appears to be a working checkpoint
  * revisions and cleanup for adding extra coordinates for dimensions that are not indexes; level, threshold
  Co-authored-by: Adam.Schnapp <[email protected]>
* Update templates.py
  Added check in __get__ for valueOfFirstFixedSurface and valueOfSecondFixedSurface to check for scale_factor < 0, to return 0.0 [skip ci]
* 200 duplicate elements with open datatree (#204)
  * this appears to be a working checkpoint
  * revisions and cleanup for adding extra coordinates for dimensions that are not indexes; level, threshold
  * Update templates.py
    Added check in __get__ for valueOfFirstFixedSurface and valueOfSecondFixedSurface to check for scale_factor < 0, to return 0.0 [skip ci]
  * adjust a test for adjusted dimension/coordinate behavior and fix bug parsing grib index
  Co-authored-by: Adam.Schnapp <[email protected]>
  Co-authored-by: Eric Engle <[email protected]>
* Add Additional Metadata and Data Model Conversions (#201)
  * Add support for additional coordinate and metadata information in xarray dataset. Add optional data_model argument to translate coordinates and attributes to defined data model specification.
  * integrate aspects of the cf_metadata into the default data model
  * Move ptype threshold decoding to data model function
  * Fix string type conversion in metadata parsing
  * Add comment with reference to CF conventions standard names table
  * Fix error in ptype parsing and remove forced conversion to string for metadata values
  * Add else to rename all coordinate names
  * Change leadtime to lead_time per [email protected]
  Co-authored-by: Adam.Schnapp <[email protected]>
* Updating grib2io tables
  Added reworked CF tables. NCEP GRIB2 tables updated to v35 [skip ci]
* Updates to xarray backend.
  Moved table in vertical_coordinate_surfaces.py to inside xarray_backend.py. Code cleanup. [skip ci]
* Update for tables and table generation scripts. [skip ci]
* Update GRIB2 tables
  Updated make_grib2_tables/get-ncep-grib2-originating-centers.py to perform more explicit table scraping and encoding due to pandas read_html not properly encoding characters with accent marks. [skip ci]
* Update for xarray_backend.py
  Changes here for the Xarray DataTree. As of this commit, all GRIB2 messages for a Core NBM GRIB2 file are accounted for. [skip ci]
* Update xarray_backend.py
  Clean up of DataTree names. [skip ci]
* Update tests due to NCEP GRIB2 table updates. [skip ci]
* Update tests/test_xarray_datatree_backend.py
  Xarray DataTree .ds is never None; it returns a DatasetView of an empty Dataset when there is none.
* Clean up xarray_backend.py
  Remove test prints. [skip ci]
* More datatree updates [skip ci]
* Update xarray_backend.py [skip ci]
* Update tests for xarray DataTree backend
  Added file blend.t00z.core.f001.co_4x_reduce.grib2, which is a NBM Core forecast GRIB2 file for CONUS where the 2.5km grid has been reduced 4x to roughly 10km to make the filesize smaller, but all GRIB2 metadata are preserved. Added test to resolve all messages in the file.
* Update DataTree tests
  Commented out some checks due to changes to the DataTree structure. [skip ci]
* Update tests.
Co-authored-by: Eric Engle <[email protected]>
Co-authored-by: Adam Schnapp <[email protected]>
Co-authored-by: Adam.Schnapp <[email protected]>
Co-authored-by: TylerWixtrom-NOAA <[email protected]>
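The shared "threshold" dimension described in the commit message above can be pictured with a small standalone sketch. This is not the backend's code: the function name build_threshold_coords, the use of an integer index for the "threshold" coordinate, and the sample limit values are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Limits = Tuple[Optional[float], Optional[float]]  # (lower, upper) for one message


def build_threshold_coords(limits: List[Limits]) -> Tuple[Dict[str, list], List[int]]:
    """Collapse (lower, upper) pairs into a single shared 'threshold' dimension.

    Returns the coordinates, each dimensioned by 'threshold', and the position
    of every input message along that dimension.
    """
    unique: List[Limits] = []
    index_of: Dict[Limits, int] = {}
    positions: List[int] = []
    for pair in limits:
        if pair not in index_of:
            # First time this (lower, upper) pair is seen: give it a new slot.
            index_of[pair] = len(unique)
            unique.append(pair)
        positions.append(index_of[pair])
    coords = {
        "threshold": list(range(len(unique))),
        "thresholdLowerLimit": [lower for lower, _ in unique],
        "thresholdUpperLimit": [upper for _, upper in unique],
    }
    return coords, positions


# Three hypothetical probability messages; the first and third share limits.
coords, positions = build_threshold_coords([(0.254, None), (2.54, None), (0.254, None)])
print(coords)
print(positions)
```

Treating the pair as one unit means two messages share a slot only when both limits match, which avoids building a lower-by-upper grid that would be mostly empty.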
Adds an option in the xarray engine to parse metadata following CF conventions. This is still a work in progress and needs unit tests and testing against multiple datasets.
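As a rough illustration of what translating coordinate names to a data model could look like, here is a standalone sketch. The mapping contents, EXAMPLE_DATA_MODEL, and apply_data_model are hypothetical names invented for this example and do not reflect the actual data_model argument or its accepted values.

```python
from typing import Any, Dict, Optional

# Hypothetical data model: source coordinate name -> target name.
# Only the lead_time spelling is taken from this PR's commit list.
EXAMPLE_DATA_MODEL: Dict[str, str] = {
    "leadTime": "lead_time",
    "refDate": "reference_time",
}


def apply_data_model(
    coords: Dict[str, Any], data_model: Optional[Dict[str, str]] = None
) -> Dict[str, Any]:
    """Rename coordinates per the data model; names without a mapping are kept."""
    if data_model is None:
        # No data model requested: return an unchanged copy.
        return dict(coords)
    return {data_model.get(name, name): value for name, value in coords.items()}


print(apply_data_model({"leadTime": 1, "latitude": 40.0}, EXAMPLE_DATA_MODEL))
```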