Support HyP3's INSAR_ISCE_MULTI_BURST job type #1430
Conversation
- naming convention
- add pytest to circleci
- support three hyp3 job types for insar
💖 Thanks for opening this pull request! Please check out our contributing guidelines. 💖
Reviewer's Guide

This PR extends support for HyP3's INSAR_ISCE_MULTI_BURST products by refactoring the metadata ingestion pipeline: a new regex-based parser identifies three job types, the add_hyp3_metadata function is simplified and reorganized to handle each type uniformly, and comprehensive pytest tests plus CI updates ensure correctness.

ER diagram for HyP3 product types and metadata file

```mermaid
erDiagram
    PRODUCT ||--o| METADATA_FILE : has
    PRODUCT {
        string product_name
        string job_type
    }
    METADATA_FILE {
        string filename
        dict hyp3_meta
    }
    PRODUCT ||--|{ INSAR_ISCE_BURST : type
    PRODUCT ||--|{ INSAR_ISCE_MULTI_BURST : type
    PRODUCT ||--|{ INSAR_GAMMA : type
    METADATA_FILE ||--o| META_DICT : populates
    META_DICT {
        string key
        string value
    }
```
Class diagram for updated HyP3 metadata handling

```mermaid
classDiagram
    class prep_hyp3 {
        +add_hyp3_metadata(fname, meta, is_ifg=True)
        +_get_product_name_and_type(filename)
    }
    prep_hyp3 <|-- add_hyp3_metadata
    prep_hyp3 <|-- _get_product_name_and_type
    class add_hyp3_metadata {
        +Handles three job types:
        +INSAR_ISCE_BURST
        +INSAR_ISCE_MULTI_BURST
        +INSAR_GAMMA
        +Parses metadata file
        +Populates meta dict
    }
    class _get_product_name_and_type {
        +Regex-based filename parsing
        +Returns (product_name, job_type)
    }
```
Flow diagram for HyP3 product type detection and metadata extraction

```mermaid
flowchart TD
    A["Input filename"] --> B["_get_product_name_and_type(filename)"]
    B --> C{"Job type"}
    C -->|INSAR_ISCE_BURST| D["Parse burst product metadata"]
    C -->|INSAR_ISCE_MULTI_BURST| E["Parse multi-burst product metadata"]
    C -->|INSAR_GAMMA| F["Parse gamma product metadata"]
    D --> G["Populate meta dict"]
    E --> G
    F --> G
    G --> H["Return updated meta dict"]
```
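As a quick illustration of the detection step above, here is a minimal sketch. It assumes `_get_product_name_and_type()` accepts a product file name and returns `(product_name, job_type)` as shown in the class diagram; the single-burst name below is a made-up example shaped to match the burst regex quoted later in this review, while the multi-burst name comes from the new unit tests.

```python
from mintpy.prep_hyp3 import _get_product_name_and_type

examples = [
    # hypothetical INSAR_ISCE_BURST product name (constructed for illustration)
    'S1_136231_IW2_20200604_20200616_VV_INT80_8E8E_unw_phase.tif',
    # INSAR_ISCE_MULTI_BURST product name taken from tests/test_prep_hyp3.py
    'S1_044_000000s1n00-093117s2n01-093118s3n01_IW_20250718_20250730_VV_INT80_B4FA_unw_phase.tif',
]
for name in examples:
    product_name, job_type = _get_product_name_and_type(name)
    print(job_type, '->', product_name)
```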
Hey there - I've reviewed your changes - here's some feedback:
- Extract the complex filename regexes in _get_product_name_and_type into named constants (with inline comments) to improve readability and ease future updates.
- The beam_swath parsing logic for INSAR_ISCE_MULTI_BURST is somewhat brittle—add validation or explanatory comments to ensure it handles all expected naming variants and fails cleanly on unexpected patterns.
- There are “to be added” placeholders for relative_orbit and first/last_frame in the burst and multi-burst branches—either implement those metadata fields now or raise explicit errors to avoid silent omissions.
## Individual Comments
### Comment 1
<location> `src/mintpy/prep_hyp3.py:73` </location>
<code_context>
- meta_file = os.path.join(os.path.dirname(fname), f'{job_id}.txt')
+ meta_file = os.path.join(os.path.dirname(fname), f'{product_name}.txt')
hyp3_meta = {}
with open(meta_file) as f:
for line in f:
key, value = line.strip().replace(' ','').split(':')[:2]
hyp3_meta[key] = value
- ref_granule = hyp3_meta['ReferenceGranule']
</code_context>
<issue_to_address>
**suggestion:** Splitting on ':' may fail if the value contains colons.
Use split(':', 1) to ensure only the first colon is used as the delimiter, preventing incorrect parsing when values contain colons.
```suggestion
key, value = line.strip().replace(' ','').split(':', 1)
```
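For context, a value that itself contains colons (for example a UTC time) is silently truncated by the two-element slice, while `split(':', 1)` keeps it intact. The sample line below is illustrative, not copied from a real HyP3 metadata file:

```python
line = 'UTCtime: 22:33:44.123456'
compact = line.strip().replace(' ', '')

key, value = compact.split(':')[:2]   # ('UTCtime', '22')              -> value truncated
key, value = compact.split(':', 1)    # ('UTCtime', '22:33:44.123456') -> full value kept
```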
</issue_to_address>
### Comment 2
<location> `src/mintpy/prep_hyp3.py:148-149` </location>
<code_context>
- # relative_orbit [to be added]
- # first/last_frame [to be added]
+ ref_granule = hyp3_meta['ReferenceGranule']
+ assert ref_granule.startswith('S1')
+ abs_orbit = int(hyp3_meta['ReferenceOrbitNumber'])
</code_context>
<issue_to_address>
**suggestion:** Using assert for runtime validation may not be ideal for production code.
Consider raising a ValueError with a clear message if ref_granule does not start with 'S1', to improve error handling and user feedback.
```suggestion
ref_granule = hyp3_meta['ReferenceGranule']
if not ref_granule.startswith('S1'):
    raise ValueError(f"ReferenceGranule '{ref_granule}' is not a valid Sentinel-1 granule (should start with 'S1').")
```
</issue_to_address>
### Comment 3
<location> `src/mintpy/prep_hyp3.py:145` </location>
<code_context>
- raise ValueError('Un-recognized Sentinel-1 satellite from {ref_granule}!')
-
- # first/last_frame [to be completed]
- t0, t1 = ref_granule.split('_')[-5:-3]
- meta['startUTC'] = dt.datetime.strptime(t0, '%Y%m%dT%H%M%S').strftime('%Y-%m-%d %H:%M:%S.%f')
- meta['stopUTC'] = dt.datetime.strptime(t1, '%Y%m%dT%H%M%S').strftime('%Y-%m-%d %H:%M:%S.%f')
</code_context>
<issue_to_address>
**suggestion:** Assumes ref_granule always has at least 5 underscore-separated tokens.
Accessing indices without checking the token count may cause IndexError if ref_granule is malformed. Please validate the token length before splitting.
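One possible shape for that validation, written as a self-contained sketch; the helper name is hypothetical and the strptime lines simply mirror the ones quoted above:

```python
import datetime as dt

def _parse_granule_times(ref_granule: str, meta: dict) -> dict:
    tokens = ref_granule.split('_')
    if len(tokens) < 5:
        # fail loudly instead of hitting an unpacking error on a malformed granule name
        raise ValueError(f'Unexpected ReferenceGranule format: {ref_granule}')
    t0, t1 = tokens[-5:-3]
    meta['startUTC'] = dt.datetime.strptime(t0, '%Y%m%dT%H%M%S').strftime('%Y-%m-%d %H:%M:%S.%f')
    meta['stopUTC'] = dt.datetime.strptime(t1, '%Y%m%dT%H%M%S').strftime('%Y-%m-%d %H:%M:%S.%f')
    return meta
```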
</issue_to_address>
### Comment 4
<location> `tests/test_prep_hyp3.py:33-42` </location>
<code_context>
+def test_add_hyp3_metadata_insar_isce_burst(test_data_dir):
</code_context>
<issue_to_address>
**suggestion (testing):** Test for INSAR_ISCE_BURST covers typical metadata extraction, but does not check for missing or malformed metadata files.
Add a test for missing or malformed metadata files to verify error handling in add_hyp3_metadata.
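A possible sketch of such a test; the file name is a made-up burst-style product name, and the expected exception assumes the current implementation simply surfaces the failed `open()` of the missing `.txt` file:

```python
import pytest
from mintpy.prep_hyp3 import add_hyp3_metadata

def test_add_hyp3_metadata_missing_meta_file(tmp_path):
    # product raster exists, but the matching <product_name>.txt metadata file does not
    fname = tmp_path / 'S1_136231_IW2_20200604_20200616_VV_INT80_8E8E_unw_phase.tif'
    fname.touch()
    with pytest.raises(FileNotFoundError):
        add_hyp3_metadata(fname=str(fname), meta={})
```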
</issue_to_address>
### Comment 5
<location> `tests/test_prep_hyp3.py:83-80` </location>
<code_context>
+def test_add_hyp3_metadata_insar_isce_multi_burst(test_data_dir):
</code_context>
<issue_to_address>
**suggestion (testing):** Test for INSAR_ISCE_MULTI_BURST covers expected metadata, but does not test for edge cases in swath token parsing.
Consider adding a test case with malformed or missing swath tokens to ensure the function handles such scenarios gracefully and provides informative errors.
Suggested implementation:
```python
def test_add_hyp3_metadata_insar_isce_multi_burst(test_data_dir):
    assert add_hyp3_metadata(
        fname=str(test_data_dir / 'S1_044_000000s1n00-093117s2n01-093118s3n01_IW_20250718_20250730_VV_INT80_B4FA_unw_phase.tif'),
        meta={
            'WIDTH': 2314,
            'LENGTH': 718,
            'X_STEP': 80.0,
            'Y_STEP': -80.0,
            'X_FIRST': 660960.0,
            'Y_FIRST': 5950880.0,
        },
    )


def test_add_hyp3_metadata_insar_isce_multi_burst_malformed_swath(test_data_dir):
    # Malformed swath token (missing 's2n01')
    malformed_fname = str(test_data_dir / 'S1_044_000000s1n00-093117_IW_20250718_20250730_VV_INT80_B4FA_unw_phase.tif')
    meta = {
        'WIDTH': 2314,
        'LENGTH': 718,
        'X_STEP': 80.0,
        'Y_STEP': -80.0,
        'X_FIRST': 660960.0,
        'Y_FIRST': 5950880.0,
    }
    try:
        add_hyp3_metadata(fname=malformed_fname, meta=meta)
    except Exception as e:
        assert "swath" in str(e).lower() or "token" in str(e).lower()


def test_add_hyp3_metadata_insar_isce_multi_burst_missing_swath(test_data_dir):
    # Missing swath token entirely
    missing_swath_fname = str(test_data_dir / 'S1_044_000000_IW_20250718_20250730_VV_INT80_B4FA_unw_phase.tif')
    meta = {
        'WIDTH': 2314,
        'LENGTH': 718,
        'X_STEP': 80.0,
        'Y_STEP': -80.0,
        'X_FIRST': 660960.0,
        'Y_FIRST': 5950880.0,
    }
    try:
        add_hyp3_metadata(fname=missing_swath_fname, meta=meta)
    except Exception as e:
        assert "swath" in str(e).lower() or "token" in str(e).lower()
```
If `add_hyp3_metadata` does not currently raise informative errors for malformed or missing swath tokens, you will need to update its implementation to do so. Ensure that the error message includes "swath" or "token" so the test assertions will pass.
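A sketch of one way the multi-burst branch could report such errors; the helper name and the token-layout checks are illustrative, based on the swath field format visible in the test file names above, and the message deliberately contains the word "swath" so the suggested assertions would pass:

```python
def _parse_swath_tokens(swath_field: str) -> str:
    """Hypothetical helper: swath_field looks like '000000s1n00-093117s2n01-093118s3n01'."""
    tokens = swath_field.split('-')
    if len(tokens) != 3 or any(len(tok) != 11 or tok[6] != 's' for tok in tokens):
        raise ValueError(f'Unexpected swath token layout: {swath_field}')
    # keep the swath digit of each non-placeholder token, e.g. '23'
    return ''.join(tok[7] for tok in tokens if not tok.startswith('000000s'))
```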
</issue_to_address>
### Comment 6
<location> `src/mintpy/prep_hyp3.py:19` </location>
<code_context>
#########################################################################
+
+def _get_product_name_and_type(filename: str) -> tuple[str, str]:
+ if match := re.match(
+ r'S1_\d{6}_IW[123](_\d{8}){2}_(VV|HH)_INT\d{2}_[0-9A-F]{4}',
</code_context>
<issue_to_address>
**issue (complexity):** Consider refactoring filename parsing into a data-driven table of regex patterns and metadata extraction functions to simplify and unify the logic.
Consider moving all your filename-parsing logic into a small data-driven table of compiled regexes plus metadata, then writing one loop that
1. finds the right pattern,
2. pulls out `date1`, `date2`, `swath`, etc. via named groups, and
3. hands over the rest of the per-job_type bits (e.g. swath formatting, date formats).

This collapses your three separate `if/elif/else` blocks into one simple loop and removes most of the ad-hoc `split('_')` calls.
For example, at top of module define:
```python
from datetime import datetime
import re

# each entry: (compiled_regex, job_type, date_fmt, swath_fn)
_PATTERNS = [
    (
        re.compile(r'''
            ^S1_\d{6}_IW
            (?P<swath>[123])
            _(?P<date1>\d{8})
            _(?P<date2>\d{8})
            _(?:VV|HH)_INT\d{2}_[0-9A-F]{4}
        ''', re.VERBOSE),
        'INSAR_ISCE_BURST',
        '%Y%m%d',
        lambda m: m.group('swath'),
    ),
    (
        re.compile(r'''
            ^S1_\d{3}_
            (?P<swaths>(?:\d{6}s1n\d{2}-\d{6}s2n\d{2}-\d{6}s3n\d{2}))
            _IW
            _(?P<date1>\d{8})
            _(?P<date2>\d{8})
            _(?:VV|HH)_INT\d{2}_[0-9A-F]{4}
        ''', re.VERBOSE),
        'INSAR_ISCE_MULTI_BURST',
        '%Y%m%d',
        # join the swath digit (8th character) of each token, skipping the all-zero placeholders
        lambda m: ''.join(tok[7] for tok in m.group('swaths').split('-') if not tok.startswith('000000s')),
    ),
    (
        re.compile(r'''
            ^S1[ABC]{2}_
            (?P<date1>\d{8}T\d{6})
            _(?P<date2>\d{8}T\d{6})
            _(?:VV|HH)[PRO]\d{3}_INT\d{2}_G_[uw][ec][123F]_[0-9A-F]{4}
        ''', re.VERBOSE),
        'INSAR_GAMMA',
        '%Y%m%dT%H%M%S',
        lambda m: '123',
    ),
]
```
Then your helper becomes:
```python
import os  # needed for os.path.basename below

def _get_product_info(fname: str):
    base = os.path.basename(fname)
    for regex, job_type, date_fmt, swath_fn in _PATTERNS:
        m = regex.match(base)
        if m:
            return m, job_type, date_fmt, swath_fn
    raise ValueError(f'unable to parse {base}')


def add_hyp3_metadata(fname, meta, is_ifg=True):
    m, job_type, date_fmt, swath_fn = _get_product_info(fname)
    product_name = m.group(0)
    date1 = datetime.strptime(m.group('date1'), date_fmt)
    date2 = datetime.strptime(m.group('date2'), date_fmt)

    # ... read hyp3_meta as before ...

    # universal metadata
    meta.update({
        'PROCESSOR': 'hyp3',
        'CENTER_LINE_UTC': hyp3_meta['UTCtime'],
        # etc.
        'beam_swath': swath_fn(m),
        'unwrap_method': hyp3_meta['Unwrappingtype'],
    })

    if job_type == 'INSAR_GAMMA':
        # relative orbit via lookup
        sat = hyp3_meta['ReferenceGranule'][:3]
        offsets = {'S1A': 73, 'S1B': 202, 'S1C': 172}
        abs_orbit = int(hyp3_meta['ReferenceOrbitNumber'])
        meta['relative_orbit'] = ((abs_orbit - offsets[sat]) % 175) + 1
        t0, t1 = hyp3_meta['ReferenceGranule'].split('_')[-5:-3]
        meta['startUTC'] = datetime.strptime(t0, date_fmt).isoformat(sep=' ')
        meta['stopUTC'] = datetime.strptime(t1, date_fmt).isoformat(sep=' ')

    # for INSAR_ISCE_* bursts, any extra fields can be added similarly
    if is_ifg:
        meta['DATE12'] = f'{date1:%y%m%d}-{date2:%y%m%d}'
        meta['P_BASELINE_TOP_HDR'] = meta['P_BASELINE_BOTTOM_HDR'] = hyp3_meta['Baseline']

    return meta
```
Benefits:
- One loop instead of three regex‐`if`s.
- All formats live in `_PATTERNS` (easy to extend).
- Named groups remove ad-hoc `split('_')` logic.
- Relative‐orbit offsets in one dict.
- Overall much flatter, shorter, easier to maintain.
</issue_to_address>
Codacy is flagging the addition of
Actually, it looks like an admin will have to do that. @yunjunz, would you be able to help us change this Codacy configuration?
Done. |
Thank you!

Here are three test data sets I've used to validate these changes with each of the three now-supported naming conventions. They're all SBAS stacks of 58 nearest-neighbor interferograms over Mt Edgecumbe, AK from 2018-2019. Outputs are available until Nov 14 if anyone else wants to verify our work.
Alex-Lewandowski left a comment:
The prep_hyp3 and unit test-related updates all look good to me!
I tried the three jobs @asjohnston-asf posted, and they all loaded and completed a run of the smallBaselineApp without any issues. They are all nearest-neighbor stacks, which shouldn't impact loading, but they don't provide the network redundancy expected for an SBAS time series. This causes the quick_overview step to skip producing numTriNonzeroIntAmbiguity.h5.
It probably wasn't necessary, but as an additional sanity check, I also ran a more redundant multi-burst SBAS stack. That also succeeded without issue and generated the numTriNonzeroIntAmbiguity.h5.
Hi @yunjunz,

These updates are well covered by new unit tests, all our GH actions are green, and we also performed some informal integration tests on the three supported HyP3 InSAR product types. I think everything is in order, so I will merge to main. Please reach out if you run into any issues releasing.
🎉 🎉 🎉 Congrats on merging your first pull request! We here at behaviorbot are proud of you! 🎉 🎉 🎉



Description of proposed changes
Issue with discussion of these changes: #1426
- Refactor the `add_hyp3_metadata` function in `src/mintpy/prep_hyp3.py` to simplify the code and match the three HyP3 InSAR job types based on the full filename pattern (see the call sketch below)
- Add unit tests for `add_hyp3_metadata`
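A minimal call sketch, with the file name and raster attributes copied from the new unit tests; running it for real requires the matching `<product_name>.txt` metadata file next to the raster:

```python
from mintpy.prep_hyp3 import add_hyp3_metadata

# raster attributes (WIDTH, LENGTH, X/Y_STEP, X/Y_FIRST, ...) are read beforehand,
# as in tests/test_prep_hyp3.py; only a subset is shown here
meta = {'WIDTH': 2314, 'LENGTH': 718, 'X_STEP': 80.0, 'Y_STEP': -80.0}
meta = add_hyp3_metadata(
    fname='S1_044_000000s1n00-093117s2n01-093118s3n01_IW_20250718_20250730_VV_INT80_B4FA_unw_phase.tif',
    meta=meta,
)
```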
Summary by Sourcery
Support the new HyP3 INSAR_ISCE_MULTI_BURST naming convention by introducing a parsing helper and refactoring metadata extraction, and add corresponding pytest coverage and CI integration.