Fix lineage records for stub-run and eval outputs #7211
Merged
Stub-run produced no task lineage records, and output eval commands were missing from the records although they contribute to the task hash.

- stub-run: TaskRun.getStubSource() accessed `.source` as a property on a TaskClosure. Property access on a Closure resolves against its delegate (the ProcessDslV1 script object), which has no `source` property, throwing MissingPropertyException. The exception was swallowed at debug level by Session.notifyEvent, silently dropping every TaskRun/TaskOutput/FileOutput record (only the WorkflowRun, written before any task completes, survived). Call getSource() as a method instead, as resolveStub already does.
- eval outputs: add an `eval` field to the lineage TaskRun model holding the output eval commands, populated from task.getOutputEvals(). These commands feed the task hash but were absent from the lineage record, so executions differing only in an eval command produced indistinguishable records.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
Signed-off-by: jorgee <[email protected]>
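The property-versus-method difference described in this commit can be reproduced outside Nextflow. The following is a minimal sketch, not the project's code: `SourceClosure` and `ScriptStandIn` are hypothetical stand-ins for `TaskClosure` and the script object, and the snippet assumes dynamic (not `@CompileStatic`) Groovy.

```groovy
// Hypothetical stand-in for TaskClosure: a Closure subclass that exposes
// its source text through a getter.
class SourceClosure extends Closure<Object> {
    private final String text

    SourceClosure(Object owner, String text) {
        super(owner)
        this.text = text
    }

    String getSource() { text }
}

// Hypothetical stand-in for the script object; it has no `source` property.
class ScriptStandIn {}

def block = new SourceClosure(new ScriptStandIn(), 'echo stub')

// Method call: dispatched to the closure object itself.
def viaMethod = block.getSource()

// Property access: Closure.getProperty resolves the name against the
// owner/delegate, which does not know `source`.
String missing = null
try {
    block.source
} catch (MissingPropertyException e) {
    missing = e.getProperty()
}

assert viaMethod == 'echo stub'
assert missing == 'source'
```

The explicit `getSource()` call bypasses the closure's property resolution entirely, which is why switching to the method form is enough to fix the lookup.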
The task hash keys on both the eval output name and its command (TaskHasher.computeEvalOutputCommands), but the lineage TaskRun record stored only the command values, dropping the names. Renaming an eval output then changed the task hash while leaving the lineage record identical.

Change the lineage TaskRun `eval` field from List<String> to Map<String,String> (name -> command), preserving declaration order so records are comparable across executions of the same script.

Signed-off-by: Jorge Ejarque <[email protected]>
Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
Signed-off-by: jorgee <[email protected]>
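A small sketch of why the list shape loses information while the map shape does not. The output names and commands below are illustrative values, not taken from a real run, and the snippet does not use any Nextflow API.

```groovy
// Hypothetical eval outputs of two executions that differ only in the output name.
def runA = [nxf_out_eval_1: 'echo hello']
def runB = [nxf_out_eval_2: 'echo hello']

// Old shape (List<String>): only the commands are kept, so the rename is invisible.
List<String> listA = runA.values().toList()
List<String> listB = runB.values().toList()
assert listA == listB

// New shape (Map<String,String>): name -> command, so the rename shows up.
assert runA != runB

// Groovy map literals are LinkedHashMap instances, so iteration follows
// insertion order, here standing in for declaration order.
def evals = [version: 'tool --version', host: 'hostname']
assert evals instanceof LinkedHashMap
assert evals.keySet().toList() == ['version', 'host']
```

Because insertion order is stable, two records produced from the same script list their eval entries in the same sequence and can be compared directly.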
This PR partially fixes #7202
bentsherman
requested changes
Jun 12, 2026
bentsherman
left a comment
Member
Looks good overall, just a few minor change requests
- Move the lineage TaskRun `eval` field after `script` and update the positional constructor accordingly.
- Remove the dedicated null-eval test; the minimal-task test already covers eval == null.
- Drop the lineage-stub-eval e2e test; covered by TaskRunTest and LinObserverTest unit tests.

Signed-off-by: jorgee <[email protected]>
Signed-off-by: Jorge Ejarque <[email protected]>
Signed-off-by: jorgee <[email protected]>
Signed-off-by: jorgee <[email protected]>
bentsherman
approved these changes
Jun 16, 2026
bentsherman
reviewed
Jun 16, 2026
Comment on lines +47 to +51
    /**
     * Output eval commands executed by the task run, mapped by output name.
     * Both the name and the command feed the task hash, so both are recorded.
     */
    Map<String,String> eval
Member
@jorgee Please make a PR to update the lineage schema as well
Problem
Two gaps break lineage tracking, whose goal is to capture everything that distinguishes one task execution from another (ideally everything that feeds the task hash) so differences can be diffed from the lineage records.
- Stub-run produces no task lineage records. Running with `-stub-run` writes only the `WorkflowRun` record; every `TaskRun`, `TaskOutput` and `FileOutput` record is silently lost.
- Output eval commands are absent from the lineage record. Output `eval(...)` commands contribute to the task hash, but the command strings are not stored in the `TaskRun` lineage record. Two executions that differ only in an eval command get different hashes but indistinguishable lineage records.

Root cause
Stub-run

`LinObserver.onTaskComplete` does fire for stub tasks, but `storeTaskRun` throws and the exception is swallowed at debug level by `Session.notifyEvent`:

- `storeTaskRun` computes the code checksum from `task.stubSource` when `session.stubRun` is set.
- `getStubSource()` accessed `config?.getStubBlock()?.source`. `getStubBlock()` returns a `TaskClosure` (a Groovy `Closure`); accessing `.source` as a property routes through `Closure.getProperty`, which resolves the name against the closure's delegate (the `ProcessDslV1` script object), which has no `source` property.
- `resolveStub` is unaffected because it calls `block.getSource()` as a method.
- Because `WorkflowRun` is written in `onFlowBegin` (before any task completes), it survives while everything from `onTaskComplete` is lost.

Eval outputs
A normal run already records the eval result value in `TaskOutput` (`{"type":"eval","name":"nxf_out_eval_1","value":"hello"}`), but the eval command (`echo hello`), the part that feeds the task hash via `TaskHasher.computeEvalOutputCommands`, is not stored anywhere in the `TaskRun` record. Moreover, the hash keys on both the eval output name and its command, so the name must be recorded too: otherwise two executions that differ only by eval output name get different hashes but indistinguishable lineage records.

Solution
- `TaskRun.getStubSource()`: call `getStubBlock()?.getSource()` (method) instead of `?.source` (property).
- `TaskRun` model: add a `Map<String,String> eval` field (output name → command, in output-declaration order) populated in `LinObserver.storeTaskRun` from `task.getOutputEvals()`; `null` when the task has no eval outputs. Both the name and the command feed the task hash (`TaskHasher.computeEvalOutputCommands`), so both are recorded. Declaration order is preserved, which is deterministic across executions of the same script, so eval records remain directly comparable.

Backward compatibility
The lineage models are (de)serialized by Gson via name-based field reflection, not by positional constructor. Records written before this change have no `eval` field and deserialize with `eval == null`; new records add it. The field is appended last, matching the convention for additive fields (e.g. `labels` in `TaskOutput`/`FileOutput`). `v1beta1` is a beta model; an additive optional field does not warrant a version bump.

Tests
- `TaskRunTest`: `getStubSource()` regression (via a real `TaskClosure` whose owner has no `source` property, verified to fail on the old `.source` and pass on `.getSource()`) plus a null-stub-block case.
- `LinObserverTest`: eval output names and commands are recorded as a map; `null` when there are no evals.
- `tests/lineage-stub-eval.nf` + `.checks`: end-to-end under both `-stub-run` and normal mode, asserting the `TaskRun` record exists, carries `"eval":{"nxf_out_eval_N":"echo hello"}`, and records the correct source code (stub vs. main script).

🤖 Generated with Claude Code
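The backward-compatibility argument in the description (an absent field simply deserializes to `null` when binding is by name) can be illustrated with a rough sketch. This is not the project's Gson setup: Groovy's bundled `JsonSlurper` plus the map constructor are used here as a stand-in for name-based binding, and `TaskRunRecord` is a hypothetical, trimmed-down model rather than the real lineage `TaskRun` class.

```groovy
import groovy.json.JsonSlurper

// Hypothetical, trimmed-down model; the real TaskRun has more fields.
class TaskRunRecord {
    String name
    String script
    Map<String, String> eval   // optional field added by this change
}

def slurper = new JsonSlurper()

// A record written before the change (no `eval` key) and one written after.
def oldJson = '{"name":"hello","script":"echo hi"}'
def newJson = '{"name":"hello","script":"echo hi","eval":{"nxf_out_eval_1":"echo hello"}}'

// Name-based binding: each JSON key is matched to the property of the same name.
def oldRecord = new TaskRunRecord(slurper.parseText(oldJson) as Map)
def newRecord = new TaskRunRecord(slurper.parseText(newJson) as Map)

assert oldRecord.eval == null
assert newRecord.eval == [nxf_out_eval_1: 'echo hello']
assert oldRecord.script == newRecord.script
```

Since nothing in the binding depends on field position, moving or adding an optional field does not affect how older records are read.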