
@srherbener
Contributor

@srherbener srherbener commented Jul 28, 2025

Description:

This PR contains a Python-based converter that transfers obs data from a JEDI IODA file to a DART obs_seq.out file. The converter works on a simple radiosonde test case (included) and is a good starting point for developing conversions for other obs types (aircraft, radiance, etc.).

Note that we haven't yet finalized what to do with the input IODA radiosonde test file (netCDF), so it is not included in this PR.

Fixes issue

Partially addresses jcsda-internal/ioda/issues/1499

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update

Documentation changes needed?

  • My change requires a change to the documentation.
    • I have updated the documentation accordingly.

Tests

I have run several manual checks and tests. For example, the converter produces an obs_seq.out file that the obs_sequence_tool can properly read. Here is the print output from the obs_sequence tool using the file produced by the converter in the included test:

--------------------------------------
 Starting ... at YYYY MM DD HH MM SS = 
                 2025  7 28 10 10 41
 Program obs_sequence_tool
 --------------------------------------

  set_nml_output Echo NML values to log file only
 
 Assimilate_these_obs_types:
    RADIOSONDE_TEMPERATURE
    RADIOSONDE_U_WIND_COMPONENT
    RADIOSONDE_V_WIND_COMPONENT
    RADIOSONDE_SPECIFIC_HUMIDITY
    GPSRO_REFRACTIVITY
    LAND_SFC_ALTIMETER
    MARINE_SFC_DEWPOINT
    AIRCRAFT_U_WIND_COMPONENT
    AIRCRAFT_V_WIND_COMPONENT
    AIRCRAFT_TEMPERATURE
    ACARS_U_WIND_COMPONENT
    ACARS_V_WIND_COMPONENT
    ACARS_TEMPERATURE
    SAT_U_WIND_COMPONENT
    SAT_V_WIND_COMPONENT
 Evaluate_these_obs_types:
    none
 Use the precomputed Prior Forward Operators for these obs types:
    none
 
  location_mod: using code with optimized cutoffs
  location_mod: Including vertical separation when computing distances:
  location_mod:        # pascals ~ 1 horiz radian:      100000.00000
  location_mod:         # meters ~ 1 horiz radian:       30000.00000
  location_mod:   # model levels ~ 1 horiz radian:          20.00000
  location_mod:  # scale heights ~ 1 horiz radian:           2.00000
 
  obs_sequence_tool  Starting to process input sequence file obs_seq.radiosonde.ioda2obsq.nonan.out
 
   Processing sequence file obs_seq.radiosonde.ioda2obsq.nonan.out
   Data Metadata: observation
     QC Metadata: Data QC
  First obs time: day=152409, sec=83951
   Gregorian day: 2018 Apr 14 23:19:11
   Last obs time: day=152410, sec=7125
   Gregorian day: 2018 Apr 15 01:58:45
   Number of obs processed  :                   169
   ---------------------------------------------------------
       RADIOSONDE_U_WIND_COMPONENT      57 obs
       RADIOSONDE_V_WIND_COMPONENT      57 obs
            RADIOSONDE_TEMPERATURE      30 obs
      RADIOSONDE_SPECIFIC_HUMIDITY      25 obs
 
  obs_sequence_tool  Starting to process output sequence file obs_seq.processed
 Total number of selected obs in all files :         169
 
   Processing sequence file obs_seq.processed
   Data Metadata: observation
     QC Metadata: Data QC
  First obs time: day=152409, sec=83951
   Gregorian day: 2018 Apr 14 23:19:11
   Last obs time: day=152410, sec=7125
   Gregorian day: 2018 Apr 15 01:58:45
   Number of obs processed  :                   169
   ---------------------------------------------------------
       RADIOSONDE_U_WIND_COMPONENT      57 obs
       RADIOSONDE_V_WIND_COMPONENT      57 obs
            RADIOSONDE_TEMPERATURE      30 obs
      RADIOSONDE_SPECIFIC_HUMIDITY      25 obs
 
   Output sequence file not created; print_only in namelist is .true.
  obs_sequence_tool Finished successfully.

 --------------------------------------
 Finished ... at YYYY MM DD HH MM SS = 
                 2025  7 28 10 10 41
 --------------------------------------

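Note that some data, such as the timestamp information, appears to have been transferred correctly: the test input IODA file contains radiosonde data for the time window Apr 14, 2018 21Z to Apr 15, 2018 03Z, and the first and last obs times reported by obs_sequence_tool (2018 Apr 14 23:19:11 and 2018 Apr 15 01:58:45) fall inside that window.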
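As a rough cross-check of the timestamps in the log above, here is a minimal sketch that converts the DART (day, sec) pairs to calendar dates and compares them against the IODA time window. It assumes DART's Gregorian calendar counts days from 1601-01-01 and that all times are UTC; the helper name `dart_time_to_datetime` is made up for this illustration and is not part of the converter.

```python
from datetime import datetime, timedelta, timezone

# Assumption: DART Gregorian days are counted from 1601-01-01 00:00:00 UTC.
DART_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def dart_time_to_datetime(day: int, sec: int) -> datetime:
    """Convert a DART (day, sec) pair to a timezone-aware datetime (hypothetical helper)."""
    return DART_EPOCH + timedelta(days=day, seconds=sec)


# Time window of the test IODA file, as stated in the PR description.
window_start = datetime(2018, 4, 14, 21, tzinfo=timezone.utc)
window_end = datetime(2018, 4, 15, 3, tzinfo=timezone.utc)

# First/last obs times copied from the obs_sequence_tool output.
first_obs = dart_time_to_datetime(152409, 83951)
last_obs = dart_time_to_datetime(152410, 7125)

print(first_obs.isoformat())  # 2018-04-14T23:19:11+00:00
print(last_obs.isoformat())   # 2018-04-15T01:58:45+00:00
print(window_start <= first_obs <= last_obs <= window_end)  # True

# Per-type counts from the log should add up to the reported total of 169.
counts = {
    "RADIOSONDE_U_WIND_COMPONENT": 57,
    "RADIOSONDE_V_WIND_COMPONENT": 57,
    "RADIOSONDE_TEMPERATURE": 30,
    "RADIOSONDE_SPECIFIC_HUMIDITY": 25,
}
print(sum(counts.values()))  # 169
```

Both converted times match the "Gregorian day" lines printed by obs_sequence_tool, which is consistent with the timestamps having been carried over correctly.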

More checking is needed but this is a good starting point for further converter development.

Checklist for merging

  • Updated changelog entry
  • Documentation updated
  • Update conf.py

Checklist for release

  • Merge into main
  • Create release from the main branch with appropriate tag
  • Delete feature-branch

Testing Datasets

  • Dataset needed for testing available upon request
  • Dataset download instructions included
  • No dataset needed

…on which reads a ioda file into a pandas dataframe.
…tructs a pyDARTdiags ObsSequence object from a pandas dataframe that is in the obs_seq layout.
…king, iodaDF to obsqDF conversion is in progress
…produce a file. Next step is to replace hard-coded sections with generic, configurable code.
… the dataframe filter that removes rows according to columns with missing data (nan's).
@fcvdb

fcvdb commented Jul 29, 2025

I built the converters with the following commands:

  1. Load the JEDI Intel environment with derecho_setup_intel.sh

  2. Clone DART:

git clone https://github.com/NCAR/DART DART

  3. Check out the feature/py-ioda-reader branch from the fork:

git remote add forked https://github.com/srherbener/DART.git
git fetch forked
git checkout feature/py-ioda-reader

  4. Create a virtual environment (per DART readthedocs):

cd DART
python3 -m venv py-dart
source py-dart/bin/activate
cd pytools
pip install -r pyDART.txt

  5. Go to the pytools/pyjedi/tests directory:

cd pytools/pyjedi/tests

  6. Convert the NCAR RDA prepbufr /glade/campaign/collections/rda/data/d337000/prepnr/2021/prepbufr.gdas.2021021600.nr file for 2021/02/16 00Z into IODA using the attached convert_bufr.sh script.

  7. Convert the IODA radiosonde files into an obs sequence file with the command:

ioda2obsq obsq.radiosonde.yaml ioda.radiosonde_obs_2021021600.nc4 obs_seq.radiosonde_obs_2021021600.out

convert_bufr.txt

@hkershaw-brown
Member

This is really nice Steve, I've been running it on various conventional obs:
ioda.aircraft_obs_2021021600.nc4, ioda.ascat_obs_2021021600.nc4, ioda.radiosonde_obs_2021021600.nc4, ioda.satwind_obs_2021021600.nc4

@srherbener @fcvdb
I've got a couple of questions on the MetaData. Looking at the ioda files, MetaData height and pressure are both available.

So in obsq.radiosonde.yaml you could swap height for pressure if you wanted the vertical location of the obs in pressure? E.g.

  vertical coordinate:
    name: MetaData/pressure
    units: "pressure (Pa)"

I can do this and get an obs sequence out, but am I interpreting the ioda file correctly? Some observations have a vertical coordinate in pressure, some in height (some maybe both?)

INFO: Converted 127651 observations -- this is with vertical coordinate pressure
INFO: Converted 35596 observations -- this is with vertical coordinate height

@srherbener
Contributor Author


I'm not sure why the prepBUFR to IODA converter is writing out both height and pressure. Perhaps this was done because both are measurements from an instrument (i.e., observed quantities). If it were me, I would place the variable intended to be the vertical coordinate in the MetaData group and the other one in the ObsValue group, to make clear which one is the vertical coordinate. Perhaps @fcvdb can shed more light on why both height and pressure appear in the MetaData group.

The new IODA to obs_seq converter removes rows from the internal pandas dataframe that have missing values (NaNs) for any of the location, height, timestamp, and observation values. I think this explains why the obs counts differ between using height and using pressure for the vertical coordinate: a row is dropped whenever the chosen vertical coordinate is missing, and more rows are missing height than pressure. My inclination would be to go with the vertical coordinate that yields the most converted obs (i.e., pressure in this case). @fcvdb would that make sense?
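To illustrate the effect, here is a small sketch of how a NaN filter of this kind yields different obs counts depending on which vertical coordinate is required. The dataframe, its column names, and the `count_convertible` helper are all made up for this example and are not the converter's actual code or layout; the only real API used is pandas `DataFrame.dropna(subset=...)`.

```python
import pandas as pd

nan = float("nan")

# Hypothetical stand-in for the converter's internal dataframe.
# Column names and values are illustrative only.
ioda_df = pd.DataFrame({
    "latitude": [40.0, 40.0, 41.5, 41.5],
    "longitude": [255.0, 255.0, 256.2, 256.2],
    "height": [1500.0, nan, nan, 9000.0],        # two rows missing height
    "pressure": [85000.0, 70000.0, 50000.0, nan],  # one row missing pressure
    "obs_value": [280.1, 272.4, 255.0, 230.7],
})


def count_convertible(df: pd.DataFrame, vertical_coord: str) -> int:
    """Count rows that survive dropping NaNs in the required columns (hypothetical helper)."""
    required = ["latitude", "longitude", vertical_coord, "obs_value"]
    return len(df.dropna(subset=required))


print(count_convertible(ioda_df, "height"))    # 2
print(count_convertible(ioda_df, "pressure"))  # 3
```

The same rows go in both times; only the column required to be non-missing changes, which is the pattern seen in the 127651 vs 35596 counts above.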

@srherbener
Contributor Author

@hkershaw-brown I have removed the partial Python test and updated the documentation. In pytools.rst I added a section for "Available Tool Packages" and included the pyjedi documentation. I was anticipating adding documentation for pyqceff too, but I was hoping you would fill that in. I'm okay if you don't like the organization I made on the pytools.rst page, and I'm open to another approach if that's what you want. Thanks!

@hkershaw-brown
Member

@hkershaw-brown this is perfect Steve. Sorry I have not gotten to releasing this yet; various CISL events have been taking up time this week.

@srherbener
Contributor Author

@hkershaw-brown no worries. Just wanted to let you know I finished the changes you requested. I'm fine with this PR waiting while you take care of some other things. No need to rush.

@hkershaw-brown hkershaw-brown added the release! bundle with next release label Aug 13, 2025
Member

@hkershaw-brown hkershaw-brown left a comment

Great stuff Steve!

@hkershaw-brown hkershaw-brown merged commit 6684fb9 into NCAR:main Aug 13, 2025
4 checks passed