fix(engine): prevent race condition that prevents triggerAndWait runs from resuming by atomically creating associated waitpoint records #2519
Walkthrough
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 0
🧹 Nitpick comments (5)
internal-packages/run-engine/src/engine/index.ts (1)
484-498: Disambiguate P2002 to only map idempotency-key violations
P2002 can arise from other unique constraints (e.g., id/friendlyId). Check the target before throwing RunDuplicateIdempotencyKeyError.
Apply this diff:
-    if (error.code === "P2002") {
+    if (error.code === "P2002") {
+      const target = (error.meta as any)?.target as string[] | undefined;
+      const isIdempotencyKeyViolation =
+        Array.isArray(target) && target.includes("idempotencyKey");
+      if (isIdempotencyKeyViolation) {
+        this.logger.debug("engine.trigger(): throwing RunDuplicateIdempotencyKeyError", {
+          code: error.code,
+          message: error.message,
+          meta: error.meta,
+          idempotencyKey,
+          environmentId: environment.id,
+        });
+
+        throw new RunDuplicateIdempotencyKeyError(
+          `Run with idempotency key ${idempotencyKey} already exists`
+        );
+      }
+      // fall through to rethrow for other unique constraints
+    }
apps/webapp/test/engine/triggerTask.test.ts (2)
41-41: Use subpath import for core utils in webapp
Per webapp guideline, avoid importing @trigger.dev/core root.
Apply this diff:
-import { promiseWithResolvers } from "@trigger.dev/core";
+import { promiseWithResolvers } from "@trigger.dev/core/utils";
115-136: Minor: acknowledge unused parameter to satisfy linters
You’re ignoring racepoint on purpose. Use an underscore binding to avoid “unused” warnings.
Apply this diff:
-  async waitForRacepoint({ id }: { racepoint: TriggerRacepoints; id: string }): Promise<void> {
+  async waitForRacepoint({
+    racepoint: _racepoint,
+    id,
+  }: {
+    racepoint: TriggerRacepoints;
+    id: string;
+  }): Promise<void> {
apps/webapp/app/runEngine/services/triggerTask.server.ts (2)
47-51: No-op racepoint system is fine; consider a shared singleton.
Avoid per-instance allocation and make intent explicit.
Apply this diff here, and then use it in the constructor (see separate comment):
 class NoopTriggerRacepointSystem implements TriggerRacepointSystem {
   async waitForRacepoint(options: { racepoint: TriggerRacepoints; id: string }): Promise<void> {
     return;
   }
 }
+
+const NOOP_TRIGGER_RACEPOINT_SYSTEM = new NoopTriggerRacepointSystem();
77-90: Defaulting to no-op preserves backward compatibility.
Constructor signature change is safe; DI remains optional.
Use the shared singleton for the default.
Minor allocation tidy-up.
Apply this diff:
-    this.triggerRacepointSystem = opts.triggerRacepointSystem ?? new NoopTriggerRacepointSystem();
+    this.triggerRacepointSystem = opts.triggerRacepointSystem ?? NOOP_TRIGGER_RACEPOINT_SYSTEM;
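The two nitpicks on triggerTask.server.ts amount to optional dependency injection with a shared no-op default. A minimal sketch of that pattern follows; ExampleTriggerService and the single "idempotencyKey" racepoint value are assumptions for illustration, not the project's actual definitions.

```typescript
// Hypothetical types mirroring the names in the review comments.
type TriggerRacepoints = "idempotencyKey";

type TriggerRacepointSystem = {
  waitForRacepoint(options: { racepoint: TriggerRacepoints; id: string }): Promise<void>;
};

// One shared, stateless no-op instance instead of a new allocation per service.
const NOOP_TRIGGER_RACEPOINT_SYSTEM: TriggerRacepointSystem = {
  async waitForRacepoint(): Promise<void> {
    // Intentionally empty: production code never pauses at a racepoint.
  },
};

// Hypothetical service; only the constructor default matters for this sketch.
class ExampleTriggerService {
  private readonly triggerRacepointSystem: TriggerRacepointSystem;

  constructor(opts: { triggerRacepointSystem?: TriggerRacepointSystem } = {}) {
    // Tests may inject a coordinating implementation; everyone else gets the no-op.
    this.triggerRacepointSystem = opts.triggerRacepointSystem ?? NOOP_TRIGGER_RACEPOINT_SYSTEM;
  }

  async trigger(idempotencyKey?: string): Promise<string> {
    if (idempotencyKey) {
      await this.triggerRacepointSystem.waitForRacepoint({
        racepoint: "idempotencyKey",
        id: idempotencyKey,
      });
    }
    return "triggered";
  }
}
```

Because the no-op holds no state, sharing one instance across services is safe, and existing callers that pass no options keep their current behavior.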
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
apps/webapp/app/runEngine/services/triggerTask.server.ts
(5 hunks)apps/webapp/app/runEngine/types.ts
(1 hunks)apps/webapp/test/engine/triggerTask.test.ts
(4 hunks)internal-packages/run-engine/src/engine/index.ts
(3 hunks)internal-packages/run-engine/src/engine/systems/waitpointSystem.ts
(1 hunks)internal-packages/run-engine/src/engine/tests/triggerAndWait.test.ts
(2 hunks)
🧰 Additional context used
📓 Path-based instructions (8)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{ts,tsx}: Always prefer using isomorphic code like fetch, ReadableStream, etc. instead of Node.js specific code
For TypeScript, we usually use types over interfaces
Avoid enums
No default exports, use function declarations
Files:
apps/webapp/app/runEngine/types.ts
internal-packages/run-engine/src/engine/index.ts
apps/webapp/test/engine/triggerTask.test.ts
internal-packages/run-engine/src/engine/systems/waitpointSystem.ts
internal-packages/run-engine/src/engine/tests/triggerAndWait.test.ts
apps/webapp/app/runEngine/services/triggerTask.server.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
We use zod a lot in packages/core and in the webapp
Files:
apps/webapp/app/runEngine/types.ts
apps/webapp/test/engine/triggerTask.test.ts
apps/webapp/app/runEngine/services/triggerTask.server.ts
apps/webapp/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
When importing from @trigger.dev/core in the webapp, never import the root package path; always use one of the documented subpath exports from @trigger.dev/core’s package.json
Files:
apps/webapp/app/runEngine/types.ts
apps/webapp/test/engine/triggerTask.test.ts
apps/webapp/app/runEngine/services/triggerTask.server.ts
apps/webapp/app/**/*.ts
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
Modules intended for test consumption under apps/webapp/app/**/*.ts must not read environment variables; accept configuration via options instead
Files:
apps/webapp/app/runEngine/types.ts
apps/webapp/app/runEngine/services/triggerTask.server.ts
**/*.test.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Our tests are all vitest
Files:
apps/webapp/test/engine/triggerTask.test.ts
internal-packages/run-engine/src/engine/tests/triggerAndWait.test.ts
{apps/webapp/**/__tests__/**/*.{ts,tsx},apps/webapp/**/*.{test,spec}.{ts,tsx}}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
{apps/webapp/**/__tests__/**/*.{ts,tsx},apps/webapp/**/*.{test,spec}.{ts,tsx}}: Do not import app/env.server.ts into tests, either directly or indirectly
Tests should only import classes/functions from files under apps/webapp/app/**/*.ts
Files:
apps/webapp/test/engine/triggerTask.test.ts
**/*.{test,spec}.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.{test,spec}.{ts,tsx,js,jsx}: Unit tests must use Vitest
Tests should avoid mocks or stubs and use helpers from @internal/testcontainers when Redis or Postgres are needed
Test files live beside the files under test and should use descriptive describe and it blocks
Files:
apps/webapp/test/engine/triggerTask.test.ts
internal-packages/run-engine/src/engine/tests/triggerAndWait.test.ts
{apps/webapp/app/**/*.server.{ts,tsx},apps/webapp/app/routes/**/*.ts}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
Access environment variables only via the env export from app/env.server.ts; do not reference process.env directly
Files:
apps/webapp/app/runEngine/services/triggerTask.server.ts
🧠 Learnings (12)
📚 Learning: 2025-08-18T10:07:17.368Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-08-18T10:07:17.368Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Define tasks using task({ id, run, ... }) with a unique id per project
Applied to files:
internal-packages/run-engine/src/engine/index.ts
apps/webapp/test/engine/triggerTask.test.ts
apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2025-08-18T10:07:17.368Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-08-18T10:07:17.368Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : For idempotent child-task invocations, create and pass idempotencyKey (and optional TTL) when calling trigger()/batchTrigger() from tasks
Applied to files:
apps/webapp/test/engine/triggerTask.test.ts
internal-packages/run-engine/src/engine/tests/triggerAndWait.test.ts
apps/webapp/app/runEngine/services/triggerTask.server.ts
📚 Learning: 2025-08-18T10:07:17.368Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-08-18T10:07:17.368Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use triggerAndWait() only from within a task context (not from generic app code) and handle result.ok or use unwrap() with error handling
Applied to files:
apps/webapp/test/engine/triggerTask.test.ts
📚 Learning: 2025-08-18T10:07:17.368Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-08-18T10:07:17.368Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : When triggering a task multiple times in a loop from inside another task, use batchTrigger()/batchTriggerAndWait() instead of per-item trigger() calls
Applied to files:
apps/webapp/test/engine/triggerTask.test.ts
📚 Learning: 2025-08-18T10:07:17.368Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-08-18T10:07:17.368Z
Learning: Applies to trigger.config.ts : Configure global task lifecycle hooks (onStart/onSuccess/onFailure) only within trigger.config.ts if needed, not within arbitrary files
Applied to files:
apps/webapp/test/engine/triggerTask.test.ts
📚 Learning: 2025-08-18T10:07:17.368Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-08-18T10:07:17.368Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Export every task (including subtasks) defined with task(), schedules.task(), or schemaTask()
Applied to files:
apps/webapp/test/engine/triggerTask.test.ts
📚 Learning: 2025-08-18T10:07:17.368Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-08-18T10:07:17.368Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Do not use client.defineJob or any deprecated v2 patterns (e.g., eventTrigger) when defining tasks
Applied to files:
apps/webapp/test/engine/triggerTask.test.ts
📚 Learning: 2025-08-29T15:49:22.406Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: AGENTS.md:0-0
Timestamp: 2025-08-29T15:49:22.406Z
Learning: Applies to **/*.{test,spec}.{ts,tsx,js,jsx} : Tests should avoid mocks or stubs and use helpers from internal/testcontainers when Redis or Postgres are needed
Applied to files:
apps/webapp/test/engine/triggerTask.test.ts
📚 Learning: 2025-08-29T10:06:49.293Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-08-29T10:06:49.293Z
Learning: Prefer Run Engine 2.0 via internal/run-engine; avoid extending legacy run engine code
Applied to files:
apps/webapp/test/engine/triggerTask.test.ts
internal-packages/run-engine/src/engine/tests/triggerAndWait.test.ts
📚 Learning: 2025-08-18T10:07:17.368Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-08-18T10:07:17.368Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Import Trigger.dev APIs from "trigger.dev/sdk/v3" when writing tasks or related utilities
Applied to files:
apps/webapp/test/engine/triggerTask.test.ts
📚 Learning: 2025-08-18T10:07:17.368Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-08-18T10:07:17.368Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use schemaTask({ schema, run, ... }) to validate payloads when input validation is required
Applied to files:
apps/webapp/test/engine/triggerTask.test.ts
📚 Learning: 2025-08-18T10:07:17.368Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-08-18T10:07:17.368Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use schedules.task(...) for scheduled (cron) tasks; do not implement schedules as plain task() with external cron logic
Applied to files:
apps/webapp/app/runEngine/services/triggerTask.server.ts
🧬 Code graph analysis (5)
internal-packages/run-engine/src/engine/index.ts (2)
packages/core/src/v3/schemas/common.ts (2)
TaskRun (209-234)
TaskRun (236-236)
packages/core/src/v3/isomorphic/friendlyId.ts (1)
RunId (93-93)
apps/webapp/test/engine/triggerTask.test.ts (3)
apps/webapp/app/runEngine/types.ts (2)
TriggerRacepointSystem (162-164)
TriggerRacepoints (160-160)
packages/core/src/utils.ts (1)
promiseWithResolvers (26-40)
internal-packages/testcontainers/src/utils.ts (1)
assertNonNullable (173-176)
internal-packages/run-engine/src/engine/systems/waitpointSystem.ts (1)
packages/core/src/v3/isomorphic/friendlyId.ts (1)
WaitpointId (95-95)
internal-packages/run-engine/src/engine/tests/triggerAndWait.test.ts (2)
internal-packages/testcontainers/src/index.ts (3)
containerTest (233-241)
prisma (91-112)
redisOptions (132-165)
internal-packages/run-engine/src/engine/tests/setup.ts (2)
setupAuthenticatedEnvironment (21-82)
setupBackgroundWorker (84-293)
apps/webapp/app/runEngine/services/triggerTask.server.ts (1)
apps/webapp/app/runEngine/types.ts (2)
TriggerRacepointSystem (162-164)
TriggerRacepoints (160-160)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (23)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: typecheck / typecheck
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (13)
apps/webapp/app/runEngine/types.ts (1)
159-164: Public racepoint API looks good; tiny typing nit
Consider using a type alias instead of an interface per our TS guideline, but the surface is fine as-is.
internal-packages/run-engine/src/engine/systems/waitpointSystem.ts (1)
741-757: Good: pure builder for associated RUN waitpoint
This cleanly decouples payload construction from persistence and is safe for nested creates. Omitting completedByTaskRunId is correct since Prisma sets it via the nested relation.
internal-packages/run-engine/src/engine/tests/triggerAndWait.test.ts (2)
7-7: Importing the specific error improves assertion clarity
Import looks right and keeps the test precise about the failure mode.
457-616: Solid coverage for duplicate idempotency key across parents
The test emulates the race and correctly expects RunDuplicateIdempotencyKeyError. Nice addition.
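Since RunDuplicateIdempotencyKeyError should only be raised for the idempotency-key constraint, the P2002 disambiguation raised in the nitpicks can be sketched as a small standalone predicate. This is a sketch under assumptions: UniqueViolationLike is a hypothetical structural stand-in for Prisma's known-request error, and handling a string-valued target is a defensive guess rather than confirmed behavior for this project's database.

```typescript
// Hypothetical minimal shape of a Prisma known-request error (only the fields used here).
type UniqueViolationLike = {
  code: string;
  meta?: { target?: unknown };
};

// Returns true only when the error is a unique-constraint violation (P2002)
// whose reported target mentions the given field.
function isUniqueViolationOn(error: UniqueViolationLike, field: string): boolean {
  if (error.code !== "P2002") return false;

  const target = error.meta?.target;
  // Assumption: the target is usually a list of field names, but may be a
  // single string such as a constraint name, so both shapes are handled.
  if (Array.isArray(target)) return target.includes(field);
  if (typeof target === "string") return target.includes(field);

  // Unknown or missing target: do not claim an idempotency-key violation.
  return false;
}
```

With a guard like this, a P2002 on another unique column (for example friendlyId) would be rethrown as-is instead of being reported as a duplicate idempotency key.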
internal-packages/run-engine/src/engine/index.ts (2)
384-472: Atomic nested associatedWaitpoint creation, nice
Creating the waitpoint inline with TaskRun ensures the parent-blocking waitpoint always exists without a race. One question: do we need to create associatedWaitpoint for every run, or only when resumeParentOnCompletion is true? If unconditional is intentional for future features, all good; otherwise we could gate it to reduce writes.
506-518: Correct: block parent using newly created waitpoint within same tx
Using taskRun.associatedWaitpoint.id and passing tx preserves atomicity and eliminates TOCTOU gaps.
apps/webapp/test/engine/triggerTask.test.ts (4)
19-19: Good: use testcontainers helpers
Using containerTest/assertNonNullable aligns with our testing guidance.
34-39: Types import OK
Importing TriggerRacepoints/TriggerRacepointSystem from ~/runEngine/types keeps tests aligned with the public surface.
42-42: Verify Node timers usage in test env
node:timers/promises is Node-specific. Confirm vitest runs in a Node environment (not jsdom/happy-dom) in CI for this suite.
342-563: Great end-to-end test for engine duplicate idempotency behavior
The racepoint-coordinated double trigger asserts both parents block on the same child. Nice use of the mock racepoint system.
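For readers unfamiliar with racepoint-coordinated tests, the idea is that a test double parks every request at a named point until the test releases them together. A minimal sketch, assuming a hypothetical MockRacepointSystem and a local createGate helper that plays the role promiseWithResolvers has in the real test:

```typescript
// A gate is a promise plus the function that resolves it.
type Gate = { promise: Promise<void>; open: () => void };

function createGate(): Gate {
  let open!: () => void;
  const promise = new Promise<void>((resolve) => {
    open = () => resolve();
  });
  return { promise, open };
}

// Hypothetical test double: callers wait at a racepoint until the test releases it.
class MockRacepointSystem {
  private readonly gates = new Map<string, Gate>();

  private gateFor(id: string): Gate {
    let gate = this.gates.get(id);
    if (!gate) {
      gate = createGate();
      this.gates.set(id, gate);
    }
    return gate;
  }

  // Called by the code under test; the racepoint name is ignored in this sketch.
  async waitForRacepoint({ id }: { racepoint: string; id: string }): Promise<void> {
    await this.gateFor(id).promise;
  }

  // Called by the test once all competing requests have reached the racepoint.
  release(id: string): void {
    this.gateFor(id).open();
  }
}
```

A test would start two trigger calls with the same idempotency key, let both reach waitForRacepoint, then call release so both proceed into the contended section at the same time.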
apps/webapp/app/runEngine/services/triggerTask.server.ts (3)
40-41: LGTM: new racepoint types imported correctly.
No issues with the typing or import path.
63-63: LGTM: DI for TriggerRacepointSystem.
Keeps the behavior pluggable and testable.
210-216: Re-check idempotency cache after racepoint wait to avoid queue-limit false negatives.
File: apps/webapp/app/runEngine/services/triggerTask.server.ts: re-query idempotency cache after wait and return cached result if present.
 if (idempotencyKey) {
-  await this.triggerRacepointSystem.waitForRacepoint({
-    racepoint: "idempotencyKey",
-    id: idempotencyKey,
-  });
+  await this.triggerRacepointSystem.waitForRacepoint({
+    racepoint: "idempotencyKey",
+    id: idempotencyKey,
+  });
+  // Re-check cache after coordination; if another request completed the run, return it now.
+  const postWait = await this.idempotencyKeyConcern.handleTriggerRequest(triggerRequest);
+  if (postWait.isCached) {
+    return postWait;
+  }
 }
Optional observability wrap:
-  await this.triggerRacepointSystem.waitForRacepoint({
+  await startSpan(this.tracer, "racepoint.wait", async (s) => {
+    s.setAttribute("racepoint", "idempotencyKey");
+    await this.triggerRacepointSystem.waitForRacepoint({
       racepoint: "idempotencyKey",
       id: idempotencyKey,
-  });
+    });
+  });
Confirm IdempotencyKeyConcern.handleTriggerRequest is safe to call twice for the same request (read-only or idempotent side effects). If not safe, add a read-only cache lookup instead.
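If handleTriggerRequest turns out not to be safe to call twice, the read-only alternative mentioned above could look roughly like this. It is only a sketch: triggerWithRecheck and its lookup, wait, and create callbacks are hypothetical names, not functions from the codebase.

```typescript
type CachedRun = { runId: string };

// Look up, wait at the racepoint, then look up again before creating anything.
async function triggerWithRecheck(
  key: string,
  lookup: (key: string) => Promise<CachedRun | undefined>, // read-only cache lookup (hypothetical)
  wait: (key: string) => Promise<void>, // stands in for waitForRacepoint
  create: (key: string) => Promise<CachedRun> // stands in for creating the run
): Promise<{ run: CachedRun; isCached: boolean }> {
  const before = await lookup(key);
  if (before) return { run: before, isCached: true };

  await wait(key);

  // Another request may have created the run while this one was waiting.
  const after = await lookup(key);
  if (after) return { run: after, isCached: true };

  return { run: await create(key), isCached: false };
}
```

The second lookup has no side effects, so it can run on every request without any idempotency concerns of its own.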
very nice
The tests in this PR don't actually reproduce and verify the race condition itself. They verify that when two runs are created with the same idempotency key, with one winning and the other throwing a Prisma P2002 error, the result is the correct behavior. I originally thought this was what was causing the issue, and we didn't have tests for that behavior, so I added them. But the tests passed, which led me to the real cause: a race between finding the task run by its idempotency key and the creation of that run's associated waitpoint. The fix was to create the associated waitpoint in a Prisma nested write, which runs in a transaction.
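To make the window concrete, here is a small in-memory simulation of the race described above. It does not use Prisma; InMemoryStore and its methods are hypothetical, and the synchronous double write only stands in for a nested write committed in one transaction.

```typescript
type Run = { id: string; idempotencyKey: string };

class InMemoryStore {
  private readonly runs = new Map<string, Run>(); // keyed by idempotencyKey
  private readonly waitpoints = new Map<string, string>(); // runId -> waitpointId

  // Two separate writes: a concurrent reader can see the run without its waitpoint.
  async createRunThenWaitpoint(run: Run, gap: () => Promise<void>): Promise<void> {
    this.runs.set(run.idempotencyKey, run);
    await gap(); // stands in for the time between two independent database writes
    this.waitpoints.set(run.id, `waitpoint_${run.id}`);
  }

  // Both records become visible together, like a nested create in one transaction.
  createRunWithWaitpoint(run: Run): void {
    this.runs.set(run.idempotencyKey, run);
    this.waitpoints.set(run.id, `waitpoint_${run.id}`);
  }

  hasRun(idempotencyKey: string): boolean {
    return this.runs.has(idempotencyKey);
  }

  // What a second parent does: find the run by key, then look for its waitpoint.
  findWaitpointForKey(idempotencyKey: string): string | undefined {
    const run = this.runs.get(idempotencyKey);
    return run ? this.waitpoints.get(run.id) : undefined;
  }
}
```

In the two-step variant, a second parent that looks up the key during the gap finds the run but no waitpoint to block on, which matches the reported symptom of a triggerAndWait run that never resumes.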