Remove inmemory cache from watermill event repository #818
Walkthrough
Event storage moved from an in-memory cache to SQL persistence: the Watermill event repository now accepts a DB handle and reads historical events from the database. The exported
Changes
Sequence Diagram(s)
sequenceDiagram
participant Pub as Publisher
participant Repo as WatermillRepo
participant DB as SQLDB
participant Sub as Subscriber
rect rgb(220,240,255)
Note over Pub,Repo: Previous (in-memory cache)
Pub->>Repo: Publish(event)
Repo->>Repo: store in eventCache
Sub->>Repo: Subscribe(topic)
Repo->>Repo: read from eventCache
Repo->>Sub: deliver cached events
end
rect rgb(240,255,240)
Note over Pub,Repo: New (DB-backed)
Pub->>Repo: Publish(event)
Repo->>DB: insert into watermill_<topic>
Sub->>Repo: Subscribe(topic)
Repo->>DB: getAllEvents(topic, id)
DB->>Repo: return payloads
Repo->>Repo: deserializeEvent(payload)
Repo->>Sub: deliver deserialized events
end
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 3
🧹 Nitpick comments (2)
internal/infrastructure/db/watermill/event_repo.go (2)
189-240: Optional: Consider reducing repetition in event deserialization.
The switch statement has significant code duplication across all event type cases. While the current explicit approach is type-safe and clear, you could reduce repetition using a map-based approach.
Here's an example refactor using a registry pattern:

    var eventTypeRegistry = map[domain.EventType]func() domain.Event{
        domain.EventTypeRoundStarted:             func() domain.Event { return &domain.RoundStarted{} },
        domain.EventTypeRoundFinalizationStarted: func() domain.Event { return &domain.RoundFinalizationStarted{} },
        domain.EventTypeRoundFinalized:           func() domain.Event { return &domain.RoundFinalized{} },
        domain.EventTypeRoundFailed:              func() domain.Event { return &domain.RoundFailed{} },
        domain.EventTypeBatchSwept:               func() domain.Event { return &domain.BatchSwept{} },
        domain.EventTypeIntentsRegistered:        func() domain.Event { return &domain.IntentsRegistered{} },
        domain.EventTypeOffchainTxRequested:      func() domain.Event { return &domain.OffchainTxRequested{} },
        domain.EventTypeOffchainTxAccepted:       func() domain.Event { return &domain.OffchainTxAccepted{} },
        domain.EventTypeOffchainTxFinalized:      func() domain.Event { return &domain.OffchainTxFinalized{} },
        domain.EventTypeOffchainTxFailed:         func() domain.Event { return &domain.OffchainTxFailed{} },
    }

    func deserializeEvent(buf []byte) (domain.Event, error) {
        var eventType struct {
            Type domain.EventType
        }
        if err := json.Unmarshal(buf, &eventType); err != nil {
            return nil, fmt.Errorf("failed to extract event type: %w", err)
        }
        factory, ok := eventTypeRegistry[eventType.Type]
        if !ok {
            return nil, fmt.Errorf("unknown event type: %s", eventType.Type)
        }
        event := factory()
        if err := json.Unmarshal(buf, event); err != nil {
            return nil, fmt.Errorf("failed to unmarshal %s: %w", eventType.Type, err)
        }
        return event, nil
    }

This reduces code duplication and makes it easier to add new event types, though it does introduce a bit of indirection.
116-119: SQL injection risk in dynamic table name is LOW in current codebase.
Verification confirms that the `topic` parameter is never user-controlled. All production calls to `Save()` use hardcoded domain constants (`RoundTopic="round"`, `OffchainTxTopic="offchain_tx"`). The call chain is: `Save(topic)` → `dispatch(topic)` → `getAllEvents(topic)`, with topic originating exclusively from internal constants.
However, the string interpolation pattern (line 117) does violate secure coding practices. While currently safe, consider applying defensive validation to prevent future vulnerabilities if this code path is ever modified to accept external input:

    // Add this validation helper
    func isValidTopicName(topic string) bool {
        matched, _ := regexp.MatchString(`^[a-zA-Z0-9_]+$`, topic)
        return matched
    }

    // Add this check in getAllEvents
    if !isValidTopicName(topic) {
        return nil, fmt.Errorf("invalid topic name: %s", topic)
    }
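As a rough, self-contained illustration of the validation idea, the sketch below compiles the pattern once at package level and refuses to build a table name for any topic that fails the check. The helper name `tableNameFor` is hypothetical and not part of the PR; the `watermill_<topic>` naming is taken from the sequence diagram in the walkthrough.

```go
package main

import (
	"fmt"
	"regexp"
)

// topicPattern is compiled once at package level rather than on every call.
var topicPattern = regexp.MustCompile(`^[a-zA-Z0-9_]+$`)

// tableNameFor is a hypothetical helper (not from the PR): it validates the
// topic first and only then builds the table name, assuming the
// watermill_<topic> naming convention.
func tableNameFor(topic string) (string, error) {
	if !topicPattern.MatchString(topic) {
		return "", fmt.Errorf("invalid topic name: %q", topic)
	}
	return "watermill_" + topic, nil
}

func main() {
	// The first two topics mirror the internal constants; the third is a
	// deliberately hostile value that must be rejected.
	for _, topic := range []string{"round", "offchain_tx", "round; DROP TABLE x"} {
		name, err := tableNameFor(topic)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("table:", name)
	}
}
```

Validating before interpolation keeps the query builder safe even if a future caller passes a topic that does not come from the internal constants.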
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- internal/core/domain/offchain_tx_event.go (0 hunks)
- internal/infrastructure/db/postgres/event_repo.go (1 hunks)
- internal/infrastructure/db/service_test.go (0 hunks)
- internal/infrastructure/db/watermill/cache.go (0 hunks)
- internal/infrastructure/db/watermill/event_repo.go (4 hunks)
💤 Files with no reviewable changes (3)
- internal/core/domain/offchain_tx_event.go
- internal/infrastructure/db/service_test.go
- internal/infrastructure/db/watermill/cache.go
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: louisinger
Repo: arkade-os/arkd PR: 693
File: internal/core/application/service.go:2379-2388
Timestamp: 2025-08-20T06:26:18.377Z
Learning: In the arkd codebase, transaction events should never be dropped when propagating to channels. The interface layer is designed to read events immediately, so buffering channels is preferred over non-blocking sends with drop semantics to maintain delivery guarantees.
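The difference between the two channel strategies named in this learning can be sketched as follows. This is a generic illustration, not code from arkd; `sendOrDrop` and `deliverAll` are hypothetical names.

```go
package main

import "fmt"

// sendOrDrop performs a non-blocking send and reports whether the event was
// delivered. This is the drop-on-full pattern the learning advises against.
func sendOrDrop(ch chan<- string, event string) bool {
	select {
	case ch <- event:
		return true
	default:
		return false // no receiver ready and no buffer space: event is lost
	}
}

// deliverAll uses a buffered channel with plain sends, so no event is
// dropped; a send would block (not drop) if the buffer were full.
func deliverAll(events []string) []string {
	ch := make(chan string, len(events)) // buffer sized for the burst
	for _, ev := range events {
		ch <- ev
	}
	close(ch)

	var got []string
	for ev := range ch {
		got = append(got, ev)
	}
	return got
}

func main() {
	unbuffered := make(chan string) // nobody is reading from this channel
	fmt.Println("non-blocking send delivered:", sendOrDrop(unbuffered, "tx-1"))
	fmt.Println("buffered delivery:", deliverAll([]string{"tx-1", "tx-2"}))
}
```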
🧬 Code graph analysis (2)
internal/infrastructure/db/postgres/event_repo.go (1)
internal/infrastructure/db/watermill/event_repo.go (1)
NewWatermillEventRepository (29-36)
internal/infrastructure/db/watermill/event_repo.go (2)
internal/core/domain/round_event.go (4)
RoundStarted (17-20), RoundFinalizationStarted (22-30), BatchSwept (50-57), IntentsRegistered (45-48)
internal/core/domain/offchain_tx_event.go (4)
OffchainTxRequested (13-18), OffchainTxAccepted (20-27), OffchainTxFinalized (29-33), OffchainTxFailed (35-39)
🔇 Additional comments (4)
internal/infrastructure/db/postgres/event_repo.go (1)
34-34: LGTM: Constructor call updated correctly.
The additional `db` parameter is passed to align with the new database-backed event storage approach.
internal/infrastructure/db/watermill/event_repo.go (3)
5-5: LGTM: Necessary imports and struct field added.
The new imports (`database/sql`, `fmt`, `logrus`) and `db` field support the migration from in-memory caching to database-backed event storage.
Also applies to: 7-7, 13-13, 23-23
29-36: LGTM: Constructor signature updated correctly.
The constructor now accepts the database handle and properly initializes all fields including the new `db` field.
117-117: The review comment is based on an incorrect assumption.
`OffchainTxAccepted` does have an `Id` field: it's inherited from the embedded `OffchainTxEvent` struct (line 21 of offchain_tx_event.go). Through Go's embedding mechanism, all fields from `OffchainTxEvent`, including `Id`, are automatically promoted to `OffchainTxAccepted`. When events are serialized via `json.Marshal()` (line 166 of event_repo.go), all exported fields from embedded structs are included in the JSON payload. Therefore, the query at line 117 will correctly filter events by `payload->>'Id'` for all event types, including `OffchainTxAccepted`.
Likely an incorrect or invalid review comment.
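The field-promotion behaviour described above can be checked with a small standalone sketch. The structs here are simplified stand-ins for the domain types (the real ones carry more fields), and both the `Extra` field and the `marshalledID` helper are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// OffchainTxEvent is a simplified stand-in for the embedded base event.
type OffchainTxEvent struct {
	Id   string
	Type int
}

// OffchainTxAccepted embeds OffchainTxEvent, so Id and Type are promoted.
// Extra is a hypothetical field standing in for the event-specific data.
type OffchainTxAccepted struct {
	OffchainTxEvent
	Extra string
}

// marshalledID marshals the event and reads the top-level "Id" key back,
// which is the key a payload->>'Id' filter would look at.
func marshalledID(event interface{}) (string, error) {
	buf, err := json.Marshal(event)
	if err != nil {
		return "", err
	}
	var probe struct{ Id string }
	if err := json.Unmarshal(buf, &probe); err != nil {
		return "", err
	}
	return probe.Id, nil
}

func main() {
	ev := OffchainTxAccepted{
		OffchainTxEvent: OffchainTxEvent{Id: "tx-1", Type: 1},
		Extra:           "example",
	}
	id, err := marshalledID(ev)
	if err != nil {
		panic(err)
	}
	fmt.Println("top-level Id in JSON:", id)
}
```

Because the embedded struct has no JSON tag, `encoding/json` flattens its exported fields into the outer object instead of nesting them under an `OffchainTxEvent` key.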
Actionable comments posted: 1
🧹 Nitpick comments (2)
internal/test/e2e/e2e_test.go (2)
1627-1629: Consider replacing fixed delays with polling for server state.
The addition of fixed 5-second delays (and similar delays throughout the file) to wait for server-side processing introduces timing dependencies that could make tests flaky in slower environments or under load.
Consider implementing a polling mechanism that checks for the expected server state with a timeout:

    // Example helper function to add to the test file
    func waitForCondition(t *testing.T, timeout time.Duration, condition func() bool, description string) {
        deadline := time.Now().Add(timeout)
        for time.Now().Before(deadline) {
            if condition() {
                return
            }
            time.Sleep(100 * time.Millisecond)
        }
        require.Fail(t, fmt.Sprintf("timeout waiting for: %s", description))
    }

Then replace fixed sleeps with:

    // Give time for the server to detect and process the fraud
    waitForCondition(t, 10*time.Second, func() bool {
        spentStatus, err := expl.GetTxOutspends(vtxo.Txid)
        if err != nil || len(spentStatus) <= int(vtxo.VOut) {
            return false
        }
        return spentStatus[vtxo.VOut].Spent
    }, "vtxo to be spent by forfeit tx")

1458-1467: The error capture pattern is correct but could benefit from documentation.
Throughout the file, `incomingErr` (and similar variables) are written in goroutines and read in the main test thread after `wg.Wait()`. While this is safe due to the happens-before relationship established by `WaitGroup`, the pattern relies on understanding these semantics.
Consider adding a comment where the pattern is first introduced to document the synchronization approach:

    // incomingErr is safely shared between goroutines due to WaitGroup synchronization.
    // The goroutine writes the error, wg.Done() creates a happens-before relationship,
    // and the main thread reads after wg.Wait(), ensuring no data race.
    var incomingErr error

This helps future maintainers understand the synchronization guarantees.
Also applies to: 1555-1565, 1679-1688
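A minimal standalone version of the error capture pattern discussed above might look like this. It is a generic sketch, not the test code itself; `runAndCapture` and `errBoom` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errBoom is a sentinel error used only for this illustration.
var errBoom = errors.New("boom")

// runAndCapture runs fn in a goroutine and returns its error. The write to
// capturedErr happens before wg.Done(), and wg.Done() happens before
// wg.Wait() returns, so reading capturedErr afterwards is race-free.
func runAndCapture(fn func() error) error {
	var wg sync.WaitGroup
	var capturedErr error

	wg.Add(1)
	go func() {
		defer wg.Done()
		capturedErr = fn() // written only by this goroutine
	}()

	wg.Wait()          // synchronizes with wg.Done()
	return capturedErr // read only after the goroutine has finished
}

func main() {
	fmt.Println("success case:", runAndCapture(func() error { return nil }))
	fmt.Println("failure case:", runAndCapture(func() error { return errBoom }))
}
```

Reading the variable before `wg.Wait()` returns would be a data race, which is why the ordering of the read matters more than the variable's type.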
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- internal/test/e2e/e2e_test.go (8 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-08-28T08:21:01.170Z
Learnt from: louisinger
Repo: arkade-os/arkd PR: 686
File: internal/core/application/fraud.go:47-61
Timestamp: 2025-08-28T08:21:01.170Z
Learning: In reactToFraud function in internal/core/application/fraud.go, the goroutine that waits for confirmation and schedules checkpoint sweep should use context.Background() instead of the request context, as this is intentional design to decouple the checkpoint sweep scheduling from the request lifetime.
Applied to files:
internal/test/e2e/e2e_test.go
📚 Learning: 2025-08-19T10:58:41.042Z
Learnt from: louisinger
Repo: arkade-os/arkd PR: 691
File: internal/core/application/service.go:557-562
Timestamp: 2025-08-19T10:58:41.042Z
Learning: In the arkd SubmitOffchainTx method, using the checkpoint PSBT input's tapscript (forfeit path) for the VtxoInput.Tapscript field is the correct behavior, not a bug as initially thought. The system correctly handles the relationship between checkpoint inputs and Ark transaction inputs.
Applied to files:
internal/test/e2e/e2e_test.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: integration tests
- GitHub Check: Build and Scan
- GitHub Check: unit tests
🔇 Additional comments (1)
internal/test/e2e/e2e_test.go (1)
102-126: The concern about loss of test coverage for `incomingFunds` validation is not substantiated. Investigation reveals:
- `incomingFunds` IS validated elsewhere: Multiple other tests in the same file validate the return value (lines 498, 515, 532, 549, 783, 2037, 2061 use `require.NotNil()` or `require.NotEmpty()`).
- No evidence of removed assertions: No assertions on `incomingFunds` exist in the current test file. The discarding pattern at lines 104, 108 may have been the original design for this test.
- Test scope is settlement flow, not vtxo validation: `TestBatchSession` (lines 102-126) tests batch settlement between parties. It validates error handling and commitment transaction equality, not the content of incoming vtxos. Other tests specifically designed for vtxo validation capture and assert on `incomingFunds`.
The refactored code does not reduce test coverage; it maintains appropriate separation of concerns across tests.
Likely an incorrect or invalid review comment.
ff3e7f4 to a33c4e8
Actionable comments posted: 2
♻️ Duplicate comments (3)
internal/infrastructure/db/watermill/event_repo.go (3)
88-93: Critical: Context propagation ignored in dispatch.
The `dispatch` method uses `context.Background()` (line 90) instead of accepting and propagating the caller's context. This prevents timeout/cancellation propagation and was previously flagged in review comments.
Update the signature to accept context:

    -func (e *eventRepository) dispatch(topic string, id string) error {
    +func (e *eventRepository) dispatch(ctx context.Context, topic string, id string) error {
         // get all events for the topic from the database
    -    events, err := e.getAllEvents(context.Background(), topic, id)
    +    events, err := e.getAllEvents(ctx, topic, id)
         if err != nil {
             return err
         }

And update the call site in `Save`:

    -    if err := e.dispatch(topic, id); err != nil {
    +    if err := e.dispatch(ctx, topic, id); err != nil {
             log.WithError(err).Error("failed to dispatch saved events")
         }
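To make the effect of the change concrete, here is a small sketch of what propagating the caller's context buys. It does not touch a database: `fetchEvents` is a hypothetical stand-in for `getAllEvents` that only checks whether the context is still live, and `demo` is an illustrative helper.

```go
package main

import (
	"context"
	"fmt"
)

// fetchEvents is a hypothetical stand-in for getAllEvents. A real
// implementation would pass ctx to the database query; here it only
// checks whether the context has already been cancelled.
func fetchEvents(ctx context.Context, topic string) ([]string, error) {
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	return []string{topic + ":event"}, nil
}

// dispatch forwards the caller's context instead of context.Background(),
// so cancellation and deadlines reach the lookup.
func dispatch(ctx context.Context, topic string) error {
	_, err := fetchEvents(ctx, topic)
	return err
}

// demo runs dispatch once with a live context and once with a cancelled one.
func demo() (liveErr, cancelledErr error) {
	liveErr = dispatch(context.Background(), "round")

	ctx, cancel := context.WithCancel(context.Background())
	cancel() // simulate the caller giving up before dispatch runs
	cancelledErr = dispatch(ctx, "round")
	return liveErr, cancelledErr
}

func main() {
	liveErr, cancelledErr := demo()
	fmt.Println("live context:", liveErr)
	fmt.Println("cancelled context:", cancelledErr)
}
```

With `context.Background()` hard-coded inside `dispatch`, the second call would proceed as if nothing had been cancelled.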
148-156: Major: Deserialization failures silently drop events.
When `deserializeEvent` fails (lines 150-154), the error is logged but the event is silently dropped from the result set. This violates event delivery guarantees and was previously flagged in review comments.
Consider failing fast to preserve data integrity:

     events := make([]domain.Event, 0, len(records))
     for _, record := range records {
         event, err := deserializeEvent(record)
         if err != nil {
    -        log.WithError(err).Warnf("failed to deserialize event: %s", string(record))
    -        continue
    +        return nil, fmt.Errorf("failed to deserialize event: %w", err)
         }
         events = append(events, event)
     }
183-246: Major: Weak error handling masks unmarshal failures.
The `deserializeEvent` function silently ignores JSON unmarshal errors. The pattern `if err := json.Unmarshal(buf, &event); err == nil { return event, nil }` means if unmarshal fails, execution continues to check other event types, eventually returning a generic "unknown event" error that doesn't distinguish between unrecognized types and deserialization failures. This was previously flagged in review comments.
Make errors explicit:

     switch eventType.Type {
     case domain.EventTypeRoundStarted:
         var event = domain.RoundStarted{}
    -    if err := json.Unmarshal(buf, &event); err == nil {
    -        return event, nil
    +    if err := json.Unmarshal(buf, &event); err != nil {
    +        return nil, fmt.Errorf("failed to unmarshal RoundStarted: %w", err)
         }
    +    return event, nil
     case domain.EventTypeRoundFinalizationStarted:
         var event = domain.RoundFinalizationStarted{}
    -    if err := json.Unmarshal(buf, &event); err == nil {
    -        return event, nil
    +    if err := json.Unmarshal(buf, &event); err != nil {
    +        return nil, fmt.Errorf("failed to unmarshal RoundFinalizationStarted: %w", err)
         }
    +    return event, nil
     // ... apply same pattern to all other cases ...
    +default:
    +    return nil, fmt.Errorf("unknown event type: %s", eventType.Type)
     }
    -
    -return nil, fmt.Errorf("unknown event")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- internal/infrastructure/db/service.go (1 hunks)
- internal/infrastructure/db/watermill/event_repo.go (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- internal/infrastructure/db/service.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-28T08:21:01.170Z
Learnt from: louisinger
Repo: arkade-os/arkd PR: 686
File: internal/core/application/fraud.go:47-61
Timestamp: 2025-08-28T08:21:01.170Z
Learning: In reactToFraud function in internal/core/application/fraud.go, the goroutine that waits for confirmation and schedules checkpoint sweep should use context.Background() instead of the request context, as this is intentional design to decouple the checkpoint sweep scheduling from the request lifetime.
Applied to files:
internal/infrastructure/db/watermill/event_repo.go
🧬 Code graph analysis (1)
internal/infrastructure/db/watermill/event_repo.go (2)
internal/core/domain/round_event.go (4)
RoundStarted (17-20), RoundFinalizationStarted (22-30), BatchSwept (50-57), IntentsRegistered (45-48)
internal/core/domain/offchain_tx_event.go (4)
OffchainTxRequested (13-18), OffchainTxAccepted (20-27), OffchainTxFinalized (29-33), OffchainTxFailed (35-39)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: unit tests
- GitHub Check: integration tests
- GitHub Check: Build and Scan
🔇 Additional comments (1)
internal/infrastructure/db/watermill/event_repo.go (1)
29-36: LGTM! Constructor properly wired for database integration.
The constructor correctly accepts and stores the database handle. The defensive nil check in `getAllEvents` (line 115) ensures safe handling if db is not initialized.
@altafan please review
Summary by CodeRabbit
Refactor
Bug Fixes
Tests