Add module directive to lineage TaskRun record #7203
The task cache hash keys on environment module directives, but the lineage TaskRun record had no field for them, so comparing two task runs could not surface a module change that invalidated the cache. Record the module directive alongside the other environment fields (conda, spack, container, architecture) so the lineage record reflects this cache-hash input.

Signed-off-by: Jonathan Manning <[email protected]>
Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
Uses the same name as Nextflow modules in #7160
Co-authored-by: Jorge Ejarque <[email protected]>
Signed-off-by: Jonathan Manning <[email protected]>
Let's keep
Is this PR not a duplicate of #7160?
No, this refers to environment modules; #7160 is about adding the Nextflow module id to the lineage record.
Umm, not sure it should be included. Lineage is not a replacement for canonical task info.
@pditommaso actually we did say that the TaskRun record should contain all of the task hash components so that By that logic we should also include environment modules... even though I would hope they were extinct by now...
Closes part of #7202.
The task cache hash keys on environment `module` directives (`TaskHasher` adds `task.config.getModule()` to the hash), but the lineage `TaskRun` record had no field for them. As a result, comparing two task runs with `nextflow lineage diff` could not surface a `module` change that invalidated the cache, even though the record already captures the other environment inputs (`conda`, `spack`, `container`, `architecture`).

This adds a `module` field to the `TaskRun` v1beta1 model and populates it in `LinObserver`, alongside the existing environment fields.

Example
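A minimal sketch of the comparison, written in Python rather than taken from Nextflow's own code. `TaskRunSketch`, `LEGACY_FIELDS`, and `diff_records` are hypothetical names, the record is reduced to a handful of fields, and storing `module` as a tuple of strings is an assumption. It only illustrates why a field-by-field diff cannot report a change in a value the record never stored.

```python
from dataclasses import dataclass, fields
from typing import Optional, Tuple


@dataclass(frozen=True)
class TaskRunSketch:
    """Hypothetical, heavily reduced stand-in for a lineage TaskRun record."""
    name: str
    conda: Optional[str] = None
    spack: Optional[str] = None
    container: Optional[str] = None
    architecture: Optional[str] = None
    # Assumption: environment modules stored as a tuple of module names.
    module: Tuple[str, ...] = ()


ALL_FIELDS = tuple(f.name for f in fields(TaskRunSketch))
# Fields the sketch treats as present before `module` was added.
LEGACY_FIELDS = tuple(n for n in ALL_FIELDS if n != "module")


def diff_records(a, b, names):
    """Return {field: (old, new)} for each listed field whose value differs."""
    return {
        n: (getattr(a, n), getattr(b, n))
        for n in names
        if getattr(a, n) != getattr(b, n)
    }


before = TaskRunSketch(name="align", module=("samtools/1.9",))
after = TaskRunSketch(name="align", module=("samtools/1.17",))

# Without the field, the module bump is invisible to the comparison.
legacy_diff = diff_records(before, after, LEGACY_FIELDS)
# With the field, the same comparison reports the change.
full_diff = diff_records(before, after, ALL_FIELDS)

print(legacy_diff)
print(full_diff)
```

The sketch leaves out the parent run id and every other field of the real record, so its "legacy" diff comes out empty rather than showing the run id change.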
Changing `samtools/1.9` to `samtools/1.17` and re-running with `-resume` correctly re-runs the task. Before this change, `lineage diff` of the two task runs showed only the parent run id changing; now it shows the `module` field changing.

Scope
This covers the `module` directive only. The `eval` output commands and the stub-run flag (the other cache-hash inputs missing from the record, per #7202) are left out: representing `eval` commands needs a design decision, and the stub flag is partially covered already via `codeChecksum` switching to the stub source. Those remain tracked in #7202.

This makes the record consistent with that cache-hash input; it does not make `lineage diff` a full substitute for `-dump-hashes`, since the record stores resolved values rather than the ordered hash-key list and `DataPath` checksums always use the default hash mode.

Tests
- `LinEncoderTest` now round-trips a populated `module` field.
- `LinObserverTest`, `LinCommandImplTest` updated for the new field.
- `LinTypeAdapterFactoryTest` confirms records without a `module` field still decode (backward compatible).
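The two properties these tests check, a round-trip of a populated field and decoding of older records that lack it, can be sketched as follows. This is a Python stand-in under stated assumptions, not the Nextflow encoder: `TaskRecord`, `encode`, and `decode` are hypothetical names, JSON is assumed as the serialized form, and the `module` field is assumed to be an optional list of strings.

```python
import json
from dataclasses import asdict, dataclass, fields
from typing import List, Optional


@dataclass
class TaskRecord:
    """Hypothetical reduced record; not the real TaskRun model."""
    name: str
    container: Optional[str] = None
    # Assumption: optional, so records written before the field existed still load.
    module: Optional[List[str]] = None


def encode(record):
    """Serialize the record to a JSON string."""
    return json.dumps(asdict(record))


def decode(text):
    """Build a record from JSON, defaulting any field the JSON omits."""
    data = json.loads(text)
    known = {f.name for f in fields(TaskRecord)}
    return TaskRecord(**{k: v for k, v in data.items() if k in known})


# Round-trip: a populated module field survives encode followed by decode.
populated = TaskRecord(name="align", module=["samtools/1.17"])
round_tripped = decode(encode(populated))

# Backward compatibility: a record serialized without the field still decodes.
legacy_json = '{"name": "align", "container": "ubuntu:22.04"}'
legacy = decode(legacy_json)
```

Giving the new field a default is what lets older records decode unchanged in this sketch; a required field would reject them.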