Phil/materialize stats txn #466
Conversation
Previously, materialization stats had been published using PublishCommitted, which allowed for the possibility of publishing stats multiple times for the same transaction. This commit fixes that, so that they are published as part of the transaction, just like they are for captures and derivations. With this, all task stats are now guaranteed to reflect exactly what was processed successfully.
jgraettinger
left a comment
LGTM
Reviewed 5 of 5 files at r1, 2 of 2 files at r2, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @psFried)
go.sum line 504 at r1 (raw file):
go.gazette.dev/core v0.89.1-0.20220212151322-e7c18c1b78cc h1:GyR68JB114+qvRCqqD4Isnm8wedXq3DTmYkdr9x+9g8=
go.gazette.dev/core v0.89.1-0.20220212151322-e7c18c1b78cc/go.mod h1:c/5g5752X3AH+g1ItiQaq9x+gmCFRmdTCFSkp5hypVQ=
go.gazette.dev/core v0.89.1-0.20220418210632-87abd3ec00c7 h1:R5KScja+3FLSeTYdOZXQF9nlAAr2lOi0esCf9g08TtI=
nit: go mod tidy
go/runtime/capture.go line 349 at r2 (raw file):
		}
	} else {
		c.Log(logrus.InfoLevel, nil, "capture transaction committing updating driver checkpoint only")
This should probably be omitted or made DEBUG. I don't think it should be INFO level, and there's already debug logging within the consumer package around transaction lifecycles (though, come to think of it, that's probably going to stderr and not the ops log...)
It's possible for a capture transaction to update the driver checkpoint without actually capturing any new documents. This seems to be fairly common in practice, and previously it resulted in a bunch of stats documents with zeros for all the capture bindings. While this is potentially useful to indicate that some transaction completed successfully, the tradeoff is that it requires the capture stats to include zero values for all bindings instead of just the ones that had documents in the current transaction. Otherwise, the `capture` property of the stats document would not be serialized, which would violate the expectation that `capture` is non-null if `kind` is `capture`. With this commit, capture stats are only published when at least one document has been added to a collection, and stats will only include bindings that actually participate in the transaction. When a connector commits a driver checkpoint only (without any documents), we just log that event so that we can still tell when transactions are happening.
133338a to
c44cf9c
Compare
psFried
left a comment
Reviewable status: 5 of 7 files reviewed, 1 unresolved discussion (waiting on @jgraettinger)
go/runtime/capture.go line 349 at r2 (raw file):
Previously, jgraettinger (Johnny Graettinger) wrote…
This should probably be omitted or made DEBUG. I don't think it should be INFO level, and there's already debug logging within the consumer package around transaction lifecycles (though, come to think of it, that's probably going to stderr and not the ops log...)
I was on the fence on debug vs info anyway, so I'll just drop it to debug. I just wanted to make sure we could put something in the logs to show that things worked ok even when the capture connector doesn't spit out any data. Speculating that we might commonly need to differentiate having no new data vs something being broken.
Description:
Rolls up a few fixes for task stats. See individual commit messages.
Also bumps the Gazette dependency to the latest master, which pulls in gazette/core#319 and gazette/core#320.
Workflow steps:
Users don't have to do anything different, but now their stats will be more correct and easier to work with.
Documentation links affected:
It's probably still a bit premature to document this.
Notes for reviewers: n/a