Fix lineage records for stub-run and eval outputs #7211

Merged: bentsherman merged 6 commits into master from fix-lineage-stubrun-eval on Jun 16, 2026
jorgee (Contributor) commented Jun 9, 2026

Problem

Two gaps break lineage tracking, whose goal is to capture everything that distinguishes one task execution from another (ideally everything that feeds the task hash) so differences can be diffed from the lineage records.

  1. Stub-run produces no task lineage records. Running with -stub-run writes only the WorkflowRun record; every TaskRun, TaskOutput and FileOutput record is silently lost.

  2. Output eval commands are absent from the lineage record. Output eval(...) commands contribute to the task hash, but the command strings are not stored in the TaskRun lineage record. Two executions that differ only in an eval command get different hashes but indistinguishable lineage records.

Root cause

Stub-run

LinObserver.onTaskComplete does fire for stub tasks, but storeTaskRun throws and the exception is swallowed at debug level by Session.notifyEvent:

groovy.lang.MissingPropertyException: No such property: source for class: nextflow.script.dsl.ProcessDslV1
    at nextflow.processor.TaskRun.getStubSource(TaskRun.groovy:1023)
    at nextflow.lineage.LinObserver.storeTaskRun(LinObserver.groovy:258)

storeTaskRun computes the code checksum from task.stubSource when session.stubRun is set. Before this change, getStubSource() read config?.getStubBlock()?.source. getStubBlock() returns a TaskClosure (a Groovy Closure), and accessing .source as a property routes through Closure.getProperty, which resolves the name against the closure's delegate (the ProcessDslV1 script object), and that object has no source property. resolveStub is unaffected because it calls block.getSource() as a method. Because WorkflowRun is written in onFlowBegin (before any task completes), it survives while everything written from onTaskComplete is lost.
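
The property-versus-method difference described above can be reproduced outside Nextflow. The following is a minimal sketch, not the actual Nextflow code: `SourceClosure` and `NoSourceOwner` are hypothetical stand-ins for `TaskClosure` and the `ProcessDslV1` script object, and the sketch assumes the closure keeps Groovy's default owner-first resolve strategy.

```groovy
// Hypothetical stand-in for TaskClosure: a Closure subclass that carries its own source text.
class SourceClosure extends Closure {
    private final String source

    SourceClosure(Object owner, String source) {
        super(owner)            // the delegate defaults to the owner
        this.source = source
    }

    String getSource() { source }

    Object doCall() { null }
}

// Hypothetical stand-in for the script object: it has no `source` property.
class NoSourceOwner { }

def block = new SourceClosure(new NoSourceOwner(), 'echo stub')

// Method call: dispatched to the closure object itself.
assert block.getSource() == 'echo stub'

// Property access: routed through Closure.getProperty, which looks the name up
// on the owner/delegate rather than on the closure, so the lookup fails.
def failed = false
try {
    block.source
} catch (MissingPropertyException e) {
    failed = true
}
assert failed
```

In this sketch the getter exists on the closure class, yet the property form never reaches it. That matches the failure in the stack trace and explains why switching to an explicit `getSource()` call is sufficient.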

Eval outputs

A normal run already records the eval result value in TaskOutput ({"type":"eval","name":"nxf_out_eval_1","value":"hello"}), but the eval command (echo hello) — the part that feeds the task hash via TaskHasher.computeEvalOutputCommands — is not stored anywhere in the TaskRun record. Moreover the hash keys on both the eval output name and its command, so the name must be recorded too: otherwise two executions that differ only by eval output name get different hashes but indistinguishable lineage records.

Solution

  • TaskRun.getStubSource(): call getStubBlock()?.getSource() (method) instead of ?.source (property).
  • Lineage TaskRun model: add a Map<String,String> eval field (output name → command, in output-declaration order) populated in LinObserver.storeTaskRun from task.getOutputEvals(); null when the task has no eval outputs. Both the name and the command feed the task hash (TaskHasher.computeEvalOutputCommands), so both are recorded. Declaration order is preserved, which is deterministic across executions of the same script, so eval records remain directly comparable.
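
To illustrate why the field is a map rather than a list of commands, here is a small sketch. It assumes the declared eval outputs are available as an ordered name-to-command map; `evalField` is a hypothetical helper written for this example, not a function from the Nextflow code base.

```groovy
// Hypothetical helper: compute the value stored in the record's `eval` field.
// Returns null when there are no eval outputs, otherwise an insertion-ordered copy.
Map<String,String> evalField(Map<String,String> outputEvals) {
    outputEvals ? new LinkedHashMap<String,String>(outputEvals) : null
}

// Two executions that run the same command under different eval output names.
def runA = evalField([nxf_out_eval_1: 'echo hello'])
def runB = evalField([nxf_out_eval_2: 'echo hello'])

// Recording only the commands cannot tell the two executions apart ...
assert runA.values().toList() == runB.values().toList()
// ... while the name -> command map can.
assert runA != runB

// No eval outputs: the field stays null.
assert evalField([:]) == null
assert evalField(null) == null

// Declaration order is preserved in the recorded map.
def ordered = evalField([nxf_out_eval_1: 'echo a', nxf_out_eval_2: 'echo b'])
assert ordered.keySet().toList() == ['nxf_out_eval_1', 'nxf_out_eval_2']
```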

Backward compatibility

The lineage models are (de)serialized by Gson via name-based field reflection, not by positional constructor. Records written before this change have no eval field and deserialize with eval == null; new records add it. The field is appended last, matching the convention for additive fields (e.g. labels in TaskOutput/FileOutput). v1beta1 is a beta model; an additive optional field does not warrant a version bump.
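
The name-based behaviour can be checked with plain Gson. This is a sketch under stated assumptions: `TaskRunRecord` is a hypothetical, trimmed-down stand-in for the lineage TaskRun model, the JSON strings are made up for the example, and the real encoder may configure Gson differently from the default instance used here.

```groovy
@Grab('com.google.code.gson:gson:2.10.1')
import com.google.gson.Gson

// Hypothetical, trimmed-down stand-in for the lineage TaskRun model.
class TaskRunRecord {
    String sessionId
    String script
    Map<String,String> eval
}

def gson = new Gson()

// A record written before the change: there is no `eval` key at all.
def oldJson = '{"sessionId":"s1","script":"echo hi"}'
def oldRecord = gson.fromJson(oldJson, TaskRunRecord)
assert oldRecord.script == 'echo hi'
assert oldRecord.eval == null

// A record written after the change: fields are matched by name, so the
// position of `eval` within the JSON object does not matter.
def newJson = '{"sessionId":"s1","eval":{"nxf_out_eval_1":"echo hello"},"script":"echo hi"}'
def newRecord = gson.fromJson(newJson, TaskRunRecord)
assert newRecord.script == 'echo hi'
assert newRecord.eval == ['nxf_out_eval_1': 'echo hello']
```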

Tests

  • TaskRunTest: getStubSource() regression (via a real TaskClosure whose owner has no source property — verified to fail on the old .source and pass on .getSource()) plus a null-stub-block case.
  • LinObserverTest: eval output names and commands are recorded as a map; null when there are no evals.
  • tests/lineage-stub-eval.nf + .checks: end-to-end under both -stub-run and normal mode, asserting the TaskRun record exists, carries "eval":{"nxf_out_eval_N":"echo hello"}, and records the correct source code (stub vs. main script).

🤖 Generated with Claude Code

Stub-run produced no task lineage records, and output eval commands were
missing from the records although they contribute to the task hash.

- stub-run: TaskRun.getStubSource() accessed `.source` as a property on a
  TaskClosure. Property access on a Closure resolves against its delegate
  (the ProcessDslV1 script object), which has no `source` property, throwing
  MissingPropertyException. The exception was swallowed at debug level by
  Session.notifyEvent, silently dropping every TaskRun/TaskOutput/FileOutput
  record (only the WorkflowRun, written before any task completes, survived).
  Call getSource() as a method instead, as resolveStub already does.

- eval outputs: add an `eval` field to the lineage TaskRun model holding the
  output eval commands, populated from task.getOutputEvals(). These commands
  feed the task hash but were absent from the lineage record, so executions
  differing only in an eval command produced indistinguishable records.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
Signed-off-by: jorgee <[email protected]>
@jorgee jorgee requested review from bentsherman and pditommaso and removed request for bentsherman June 9, 2026 14:48
The task hash keys on both the eval output name and its command
(TaskHasher.computeEvalOutputCommands), but the lineage TaskRun record
stored only the command values, dropping the names. Renaming an eval
output then changed the task hash while leaving the lineage record
identical.

Change the lineage TaskRun `eval` field from List<String> to
Map<String,String> (name -> command), preserving declaration order so
records are comparable across executions of the same script.

Signed-off-by: Jorge Ejarque <[email protected]>

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
Signed-off-by: jorgee <[email protected]>
netlify Bot commented Jun 9, 2026

Deploy Preview for nextflow-docs-staging canceled.

Latest commit: b363850
Latest deploy log: https://app.netlify.com/projects/nextflow-docs-staging/deploys/6a319c484af0510008547c8f

jorgee (Contributor, Author) commented Jun 9, 2026

This PR partially fixes #7202

bentsherman (Member) left a comment

Looks good overall, just a few minor change requests

Comment thread modules/nf-lineage/src/main/nextflow/lineage/model/v1beta1/TaskRun.groovy Outdated
Comment thread modules/nf-lineage/src/test/nextflow/lineage/LinObserverTest.groovy Outdated
Comment thread tests/checks/lineage-stub-eval.nf/.checks Outdated
jorgee and others added 2 commits June 16, 2026 19:10
- Move the lineage TaskRun `eval` field after `script` and update the
  positional constructor accordingly.
- Remove the dedicated null-eval test; the minimal-task test already
  covers eval == null.
- Drop the lineage-stub-eval e2e test; covered by TaskRunTest and
  LinObserverTest unit tests.

Signed-off-by: jorgee <[email protected]>
@jorgee jorgee requested a review from bentsherman June 16, 2026 17:16
@bentsherman bentsherman merged commit 52d62a7 into master Jun 16, 2026
25 checks passed
@bentsherman bentsherman deleted the fix-lineage-stubrun-eval branch June 16, 2026 19:40
Comment on lines +47 to +51
/**
* Output eval commands executed by the task run, mapped by output name.
* Both the name and the command feed the task hash, so both are recorded.
*/
Map<String,String> eval

Member

@jorgee Please make a PR to update the lineage schema as well
