
Conversation

@jtherrmann
Contributor

@jtherrmann jtherrmann commented Nov 3, 2025

Description of proposed changes

Issue with discussion of these changes: #1426

  • Support the new naming convention for HyP3's INSAR_ISCE_MULTI_BURST products
  • Update the add_hyp3_metadata function in src/mintpy/prep_hyp3.py to simplify the code and match the three HyP3 InSAR job types based on the full filename pattern
  • Add pytest unit tests for add_hyp3_metadata
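For readers unfamiliar with the naming conventions, here is a minimal sketch of matching a product by its full filename. The pattern is the single-burst one quoted in the review below; the filename is a hypothetical example, and the actual helper in src/mintpy/prep_hyp3.py covers all three job types:

```python
import re

# Illustrative sketch: the INSAR_ISCE_BURST filename pattern only.
# The real _get_product_name_and_type also matches the multi-burst
# and GAMMA conventions.
BURST_PATTERN = re.compile(
    r'S1_\d{6}_IW[123](_\d{8}){2}_(VV|HH)_INT\d{2}_[0-9A-F]{4}'
)

# Hypothetical product filename following the burst convention.
fname = 'S1_136231_IW2_20200604_20200616_VV_INT80_8E47_unw_phase.tif'
match = BURST_PATTERN.match(fname)
assert match is not None
product_name = match.group(0)  # 'S1_136231_IW2_20200604_20200616_VV_INT80_8E47'
```

Matching on the full filename, rather than splitting on underscores, is what lets one helper distinguish the three job types unambiguously.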

Reminders

  • Fix Support new naming convention for Sentinel-1 burst-based interferograms from HyP3 #1426
  • Pass Pre-commit check (green)
  • Pass Codacy code review (green)
  • Pass Circle CI test (green)
  • Make sure that your code follows our style. Use the other functions/files as a basis.
  • If modifying functionality, describe changes to function behavior and arguments in a comment below the function declaration.
  • If adding new functionality, add a detailed description to the documentation and/or an example.

Summary by Sourcery

Support the new HyP3 INSAR_ISCE_MULTI_BURST naming convention by introducing a parsing helper and refactoring metadata extraction; add corresponding pytest coverage and CI integration.

New Features:

  • Add parsing and support for HyP3 INSAR_ISCE_MULTI_BURST job type

Enhancements:

  • Refactor add_hyp3_metadata to use a unified helper for parsing three HyP3 InSAR job types (_get_product_name_and_type)

CI:

  • Install pytest and include test run in CircleCI configuration

Tests:

  • Add pytest unit tests for add_hyp3_metadata and _get_product_name_and_type

@welcome

welcome bot commented Nov 3, 2025

💖 Thanks for opening this pull request! Please check out our contributing guidelines. 💖
Keep in mind that all new features should be documented. It helps to write comments next to the code, or below your functions, describing all arguments and return types before writing the code. This will help you think about your code design and usually results in better code.

@sourcery-ai
Contributor

sourcery-ai bot commented Nov 3, 2025

Reviewer's Guide

This PR extends support for HyP3's INSAR_ISCE_MULTI_BURST products by refactoring the metadata ingestion pipeline: a new regex-based parser identifies three job types, the add_hyp3_metadata function is simplified and reorganized to handle each type uniformly, and comprehensive pytest tests plus CI updates ensure correctness.

ER diagram for HyP3 product types and metadata files

erDiagram
    PRODUCT ||--o| METADATA_FILE : has
    PRODUCT {
        string product_name
        string job_type
    }
    METADATA_FILE {
        string filename
        dict hyp3_meta
    }
    PRODUCT ||--|{ INSAR_ISCE_BURST : type
    PRODUCT ||--|{ INSAR_ISCE_MULTI_BURST : type
    PRODUCT ||--|{ INSAR_GAMMA : type
    METADATA_FILE ||--o| META_DICT : populates
    META_DICT {
        string key
        string value
    }

Class diagram for updated HyP3 metadata handling

classDiagram
    class prep_hyp3 {
        +add_hyp3_metadata(fname, meta, is_ifg=True)
        +_get_product_name_and_type(filename)
    }
    prep_hyp3 <|-- add_hyp3_metadata
    prep_hyp3 <|-- _get_product_name_and_type

    class add_hyp3_metadata {
        +Handles three job types:
        +INSAR_ISCE_BURST
        +INSAR_ISCE_MULTI_BURST
        +INSAR_GAMMA
        +Parses metadata file
        +Populates meta dict
    }
    class _get_product_name_and_type {
        +Regex-based filename parsing
        +Returns (product_name, job_type)
    }

Flow diagram for HyP3 product type detection and metadata extraction

flowchart TD
    A["Input filename"] --> B["_get_product_name_and_type(filename)"]
    B --> C{"Job type"}
    C -->|INSAR_ISCE_BURST| D["Parse burst product metadata"]
    C -->|INSAR_ISCE_MULTI_BURST| E["Parse multi-burst product metadata"]
    C -->|INSAR_GAMMA| F["Parse gamma product metadata"]
    D --> G["Populate meta dict"]
    E --> G
    F --> G
    G --> H["Return updated meta dict"]

File-Level Changes

  • Add regex-based product name and job type parser (src/mintpy/prep_hyp3.py)
      – Introduced _get_product_name_and_type with three patterns
      – Matched INSAR_ISCE_BURST, INSAR_ISCE_MULTI_BURST, and INSAR_GAMMA
      – Raised a clear error on unmatched filenames
  • Refactor add_hyp3_metadata to use unified job_type logic (src/mintpy/prep_hyp3.py)
      – Replaced manual job_id parsing with product_name and job_type
      – Reorganized metadata extraction branches per job_type
      – Simplified file path resolution and removed redundant code
  • Add pytest unit tests for parser and metadata function (tests/test_prep_hyp3.py, tests/conftest.py, tests/data/*)
      – Created tests for _get_product_name_and_type edge cases
      – Validated add_hyp3_metadata outputs for three job types
      – Added test fixture and sample metadata files
  • Update CI and test requirements to include pytest (.circleci/config.yml, tests/requirements.txt)
      – Installed pytest in .circleci/config.yml
      – Updated tests/requirements.txt to add pytest
      – Added pytest invocation in the CI job

Assessment against linked issues

  • #1426: Update prep_hyp3.py to support the new naming convention for Sentinel-1 burst-based interferograms from HyP3, including the new INSAR_ISCE_MULTI_BURST job type.
  • #1426: Maintain backwards compatibility so that MintPy can process products generated with both the older and newer naming conventions.
  • #1426: Add or update tests to verify correct parsing and metadata extraction for all supported HyP3 product types and naming conventions.

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Contributor

@sourcery-ai sourcery-ai bot left a comment


Hey there - I've reviewed your changes - here's some feedback:

  • Extract the complex filename regexes in _get_product_name_and_type into named constants (with inline comments) to improve readability and ease future updates.
  • The beam_swath parsing logic for INSAR_ISCE_MULTI_BURST is somewhat brittle—add validation or explanatory comments to ensure it handles all expected naming variants and fails cleanly on unexpected patterns.
  • There are “to be added” placeholders for relative_orbit and first/last_frame in the burst and multi-burst branches—either implement those metadata fields now or raise explicit errors to avoid silent omissions.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Extract the complex filename regexes in _get_product_name_and_type into named constants (with inline comments) to improve readability and ease future updates.
- The beam_swath parsing logic for INSAR_ISCE_MULTI_BURST is somewhat brittle—add validation or explanatory comments to ensure it handles all expected naming variants and fails cleanly on unexpected patterns.
- There are “to be added” placeholders for relative_orbit and first/last_frame in the burst and multi-burst branches—either implement those metadata fields now or raise explicit errors to avoid silent omissions.

## Individual Comments

### Comment 1
<location> `src/mintpy/prep_hyp3.py:73` </location>
<code_context>
-    meta_file = os.path.join(os.path.dirname(fname), f'{job_id}.txt')
+    meta_file = os.path.join(os.path.dirname(fname), f'{product_name}.txt')
     hyp3_meta = {}
     with open(meta_file) as f:
         for line in f:
             key, value = line.strip().replace(' ','').split(':')[:2]
             hyp3_meta[key] = value
-    ref_granule = hyp3_meta['ReferenceGranule']
</code_context>

<issue_to_address>
**suggestion:** Splitting on ':' may fail if the value contains colons.

Use split(':', 1) to ensure only the first colon is used as the delimiter, preventing incorrect parsing when values contain colons.

```suggestion
            key, value = line.strip().replace(' ','').split(':', 1)
```
</issue_to_address>
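To see the failure mode concretely, here is a small demonstration (the metadata line below is a hypothetical example, not taken from an actual HyP3 product file):

```python
# A metadata line whose value itself contains colons, e.g. a UTC timestamp.
line = 'UTCtime: 12:34:56'

# Original approach: split on every colon and keep the first two tokens.
key, value = line.strip().replace(' ', '').split(':')[:2]
assert (key, value) == ('UTCtime', '12')  # value silently truncated

# Suggested fix: split on the first colon only.
key, value = line.strip().replace(' ', '').split(':', 1)
assert (key, value) == ('UTCtime', '12:34:56')  # full value preserved
```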

### Comment 2
<location> `src/mintpy/prep_hyp3.py:148-149` </location>
<code_context>
-            # relative_orbit [to be added]
-            # first/last_frame [to be added]
+        ref_granule = hyp3_meta['ReferenceGranule']
+        assert ref_granule.startswith('S1')

+        abs_orbit = int(hyp3_meta['ReferenceOrbitNumber'])
</code_context>

<issue_to_address>
**suggestion:** Using assert for runtime validation may not be ideal for production code.

Consider raising a ValueError with a clear message if ref_granule does not start with 'S1', to improve error handling and user feedback.

```suggestion
        ref_granule = hyp3_meta['ReferenceGranule']
        if not ref_granule.startswith('S1'):
            raise ValueError(f"ReferenceGranule '{ref_granule}' is not a valid Sentinel-1 granule (should start with 'S1').")
```
</issue_to_address>

### Comment 3
<location> `src/mintpy/prep_hyp3.py:145` </location>
<code_context>
-                raise ValueError('Un-recognized Sentinel-1 satellite from {ref_granule}!')
-
-            # first/last_frame [to be completed]
-            t0, t1 = ref_granule.split('_')[-5:-3]
-            meta['startUTC'] = dt.datetime.strptime(t0, '%Y%m%dT%H%M%S').strftime('%Y-%m-%d %H:%M:%S.%f')
-            meta['stopUTC']  = dt.datetime.strptime(t1, '%Y%m%dT%H%M%S').strftime('%Y-%m-%d %H:%M:%S.%f')
</code_context>

<issue_to_address>
**suggestion:** Assumes ref_granule always has at least 5 underscore-separated tokens.

Accessing indices without checking the token count may cause IndexError if ref_granule is malformed. Please validate the token length before splitting.
</issue_to_address>

### Comment 4
<location> `tests/test_prep_hyp3.py:33-42` </location>
<code_context>
+def test_add_hyp3_metadata_insar_isce_burst(test_data_dir):
</code_context>

<issue_to_address>
**suggestion (testing):** Test for INSAR_ISCE_BURST covers typical metadata extraction, but does not check for missing or malformed metadata files.

Add a test for missing or malformed metadata files to verify error handling in add_hyp3_metadata.
</issue_to_address>

### Comment 5
<location> `tests/test_prep_hyp3.py:83-80` </location>
<code_context>
+def test_add_hyp3_metadata_insar_isce_multi_burst(test_data_dir):
</code_context>

<issue_to_address>
**suggestion (testing):** Test for INSAR_ISCE_MULTI_BURST covers expected metadata, but does not test for edge cases in swath token parsing.

Consider adding a test case with malformed or missing swath tokens to ensure the function handles such scenarios gracefully and provides informative errors.

Suggested implementation:

```python
def test_add_hyp3_metadata_insar_isce_multi_burst(test_data_dir):
    assert add_hyp3_metadata(
        fname=str(test_data_dir / 'S1_044_000000s1n00-093117s2n01-093118s3n01_IW_20250718_20250730_VV_INT80_B4FA_unw_phase.tif'),
        meta={
            'WIDTH': 2314,
            'LENGTH': 718,
            'X_STEP': 80.0,
            'Y_STEP': -80.0,
            'X_FIRST': 660960.0,
            'Y_FIRST': 5950880.0,
        },
    )

def test_add_hyp3_metadata_insar_isce_multi_burst_malformed_swath(test_data_dir):
    # Malformed swath token (missing 's2n01')
    malformed_fname = str(test_data_dir / 'S1_044_000000s1n00-093117_IW_20250718_20250730_VV_INT80_B4FA_unw_phase.tif')
    meta = {
        'WIDTH': 2314,
        'LENGTH': 718,
        'X_STEP': 80.0,
        'Y_STEP': -80.0,
        'X_FIRST': 660960.0,
        'Y_FIRST': 5950880.0,
    }
    try:
        add_hyp3_metadata(fname=malformed_fname, meta=meta)
    except Exception as e:
        assert "swath" in str(e).lower() or "token" in str(e).lower()

def test_add_hyp3_metadata_insar_isce_multi_burst_missing_swath(test_data_dir):
    # Missing swath token entirely
    missing_swath_fname = str(test_data_dir / 'S1_044_000000_IW_20250718_20250730_VV_INT80_B4FA_unw_phase.tif')
    meta = {
        'WIDTH': 2314,
        'LENGTH': 718,
        'X_STEP': 80.0,
        'Y_STEP': -80.0,
        'X_FIRST': 660960.0,
        'Y_FIRST': 5950880.0,
    }
    try:
        add_hyp3_metadata(fname=missing_swath_fname, meta=meta)
    except Exception as e:
        assert "swath" in str(e).lower() or "token" in str(e).lower()

```

If `add_hyp3_metadata` does not currently raise informative errors for malformed or missing swath tokens, you will need to update its implementation to do so. Ensure that the error message includes "swath" or "token" so the test assertions will pass.
</issue_to_address>

### Comment 6
<location> `src/mintpy/prep_hyp3.py:19` </location>
<code_context>

 #########################################################################
+
+def _get_product_name_and_type(filename: str) -> tuple[str, str]:
+    if match := re.match(
+        r'S1_\d{6}_IW[123](_\d{8}){2}_(VV|HH)_INT\d{2}_[0-9A-F]{4}',
</code_context>

<issue_to_address>
**issue (complexity):** Consider refactoring filename parsing into a data-driven table of regex patterns and metadata extraction functions to simplify and unify the logic.

```markdown
Consider moving all your filename‐parsing logic into a small data‐driven table of compiled regex’s + metadata, then writing one loop that

  1. finds the right pattern  
  2. pulls out `date1`, `date2`, `swath`, etc via named groups  
  3. hands over the rest of the per–job_type bits (e.g. swath formatting, date‐formats)  

This collapses your three separate `if/elif/else` blocks into one simple loop and removes most of the ad-hoc `split('_')` calls.

For example, at top of module define:

```python
from datetime import datetime
import re

# each entry: (compiled_regex, job_type, date_fmt, swath_fn)
_PATTERNS = [
    (
        re.compile(r'''
            ^S1_\d{6}_IW
            (?P<swath>[123])
            _(?P<date1>\d{8})
            _(?P<date2>\d{8})
            _(?:VV|HH)_INT\d{2}_[0-9A-F]{4}
        ''', re.VERBOSE),
        'INSAR_ISCE_BURST',
        '%Y%m%d',
        lambda m: m.group('swath'),
    ),
    (
        re.compile(r'''
            ^S1_\d{3}_
            (?P<swaths>(?:\d{6}s1n\d{2}-\d{6}s2n\d{2}-\d{6}s3n\d{2}))
            _IW
            _(?P<date1>\d{8})
            _(?P<date2>\d{8})
            _(?:VV|HH)_INT\d{2}_[0-9A-F]{4}
        ''', re.VERBOSE),
        'INSAR_ISCE_MULTI_BURST',
        '%Y%m%d',
        # join the 7th char of each token, skipping placeholders
        lambda m: ''.join(tok[6] for tok in m.group('swaths').split('-') if not tok.startswith('000000s')),
    ),
    (
        re.compile(r'''
            ^S1[ABC]{2}_
            (?P<date1>\d{8}T\d{6})
            _(?P<date2>\d{8}T\d{6})
            _(?:VV|HH)[PRO]\d{3}_INT\d{2}_G_[uw][ec][123F]_[0-9A-F]{4}
        ''', re.VERBOSE),
        'INSAR_GAMMA',
        '%Y%m%dT%H%M%S',
        lambda m: '123',
    ),
]
```

Then your helper becomes:

```python
def _get_product_info(fname: str):
    base = os.path.basename(fname)
    for regex, job_type, date_fmt, swath_fn in _PATTERNS:
        m = regex.match(base)
        if m:
            return m, job_type, date_fmt, swath_fn
    raise ValueError(f'unable to parse {base}')

def add_hyp3_metadata(fname, meta, is_ifg=True):
    m, job_type, date_fmt, swath_fn = _get_product_info(fname)
    product_name = m.group(0)
    date1 = datetime.strptime(m.group('date1'), date_fmt)
    date2 = datetime.strptime(m.group('date2'), date_fmt)

    # ... read hyp3_meta as before ...

    # universal metadata
    meta.update({
        'PROCESSOR': 'hyp3',
        'CENTER_LINE_UTC': hyp3_meta['UTCtime'],
        # etc.
        'beam_swath': swath_fn(m),
        'unwrap_method': hyp3_meta['Unwrappingtype'],
    })

    if job_type == 'INSAR_GAMMA':
        # relative orbit via lookup
        sat = hyp3_meta['ReferenceGranule'][:3]
        offsets = {'S1A': 73, 'S1B': 202, 'S1C': 172}
        abs_orbit = int(hyp3_meta['ReferenceOrbitNumber'])
        meta['relative_orbit'] = ((abs_orbit - offsets[sat]) % 175) + 1
        t0, t1 = hyp3_meta['ReferenceGranule'].split('_')[-5:-3]
        meta['startUTC'] = datetime.strptime(t0, date_fmt).isoformat(sep=' ')
        meta['stopUTC']  = datetime.strptime(t1, date_fmt).isoformat(sep=' ')
    # for INSAR_ISCE_* bursts, any extra fields can be added similarly

    if is_ifg:
        meta['DATE12'] = f'{date1:%y%m%d}-{date2:%y%m%d}'
        meta['P_BASELINE_TOP_HDR'] = meta['P_BASELINE_BOTTOM_HDR'] = hyp3_meta['Baseline']

    return meta
```

Benefits:
- One loop instead of three regex‐`if`s.
- All formats live in `_PATTERNS` (easy to extend).
- Named groups remove ad-hoc `split('_')` logic.
- Relative‐orbit offsets in one dict.
- Overall much flatter, shorter, easier to maintain.
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@jtherrmann
Contributor Author

Codacy is flagging the addition of pytest to tests/requirements.txt as "Statement seems to have no effect". I'm not sure why this error would be applied to a requirements file and not sure how to dismiss it.

@Alex-Lewandowski
Contributor

Codacy is flagging the addition of pytest to tests/requirements.txt as "Statement seems to have no effect". I'm not sure why this error would be applied to a requirements file and not sure how to dismiss it.

It seems like Codacy is treating requirements.txt like a Python file and running Pylint on it. Codacy has an Ignored Files List. I'll try ignoring requirements.txt and rerun the check.

@Alex-Lewandowski
Contributor

Alex-Lewandowski commented Nov 5, 2025

I'll try ignoring requirements.txt and rerun the check.

Actually, it looks like an admin will have to do that.

@yunjunz, would you be able to help us change this Codacy configuration?

@yunjunz
Member

yunjunz commented Nov 6, 2025

I'll try ignoring requirements.txt and rerun the check.

Actually, it looks like an admin will have to do that.

@yunjunz, would you be able to help us change this Codacy configuration?

Done.

@Alex-Lewandowski
Contributor

Done.

Thank you!

@asjohnston-asf
Contributor

Here are three test data sets I've used to validate these changes with each of the three now-supported naming conventions. They're all SBAS stacks of 58 nearest-neighbor interferograms over Mt Edgecumbe, AK from 2018-2019. Outputs are available until Nov 14 if anyone else wants to verify our work.

Contributor

@Alex-Lewandowski Alex-Lewandowski left a comment


The prep_hyp3 and unit test-related updates all look good to me!

I tried the three jobs @asjohnston-asf posted, and they all loaded and completed a run of the smallBaselineApp without any issues. They are all nearest-neighbor stacks, which shouldn't impact loading, but also doesn't provide the network redundancy expected for an SBAS time series. This causes the quick_overview step to skip producing numTriNonzeroIntAmbiguity.h5.

It probably wasn't necessary, but as an additional sanity check, I also ran a more redundant multi-burst SBAS stack. That also succeeded without issue and generated the numTriNonzeroIntAmbiguity.h5.

@Alex-Lewandowski
Contributor

Hi @yunjunz, These updates are well covered by new unit tests, all our GH actions are green, and we also performed some informal integration tests on the three supported HyP3 InSAR product types. I think everything is in order, so I will merge to main. Please reach out if you run into any issues releasing.

@Alex-Lewandowski Alex-Lewandowski merged commit d3a0fda into insarlab:main Nov 7, 2025
7 checks passed
@welcome

welcome bot commented Nov 7, 2025

🎉 🎉 🎉 Congrats on merging your first pull request! We here at behaviorbot are proud of you! 🎉 🎉 🎉
