

@psFried psFried commented Apr 25, 2022

Description:

Rolls up a few fixes for task stats; see the individual commit messages for details.
Also bumps the Gazette dependency to the latest master, which pulls in gazette/core#319 and gazette/core#320.

Workflow steps:

Users don't have to do anything differently, but their stats will now be more accurate and easier to work with.

Documentation links affected:

It's probably still a bit premature to document this.

Notes for reviewers: n/a



Previously, materialization stats had been published using
PublishCommitted, which allowed for the possibility of publishing stats
multiple times for the same transaction. This commit fixes that, so that
they are published as part of the transaction, just like they are for
captures and derivations. With this, all task stats are now guaranteed
to reflect exactly what was processed successfully.
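
To illustrate the difference described above, here is a minimal toy model, not the actual Gazette or Flow code. It assumes the duplicates arise when a transaction is retried before it commits. The `journal` type, its methods, and `runTxn` are all hypothetical names invented for this sketch.

```go
package main

import "fmt"

// journal is a hypothetical stand-in for a stats journal.
// Readers only ever observe the committed slice.
type journal struct {
	committed []string
	pending   []string
}

// publishCommitted appends a document that is visible immediately,
// whether or not the surrounding transaction later commits.
func (j *journal) publishCommitted(doc string) {
	j.committed = append(j.committed, doc)
}

// publishUncommitted stages a document as part of the open transaction.
func (j *journal) publishUncommitted(doc string) {
	j.pending = append(j.pending, doc)
}

// commit makes all staged documents visible.
func (j *journal) commit() {
	j.committed = append(j.committed, j.pending...)
	j.pending = nil
}

// rollback discards all staged documents.
func (j *journal) rollback() {
	j.pending = nil
}

// runTxn simulates one logical transaction that fails `failures` times
// before finally committing. publish is called once per attempt.
func runTxn(j *journal, failures int, publish func(string)) {
	for attempt := 0; attempt <= failures; attempt++ {
		publish("stats for txn 1")
		if attempt < failures {
			j.rollback() // the attempt failed, so staged documents are dropped
			continue
		}
		j.commit()
	}
}

func main() {
	var eager, txn journal
	runTxn(&eager, 2, eager.publishCommitted)
	runTxn(&txn, 2, txn.publishUncommitted)
	// The eager journal sees one stats document per attempt,
	// while the transactional journal sees exactly one.
	fmt.Println(len(eager.committed), len(txn.committed)) // 3 1
}
```

In this sketch, stats staged with the transaction either become visible once or not at all, which is the property the commit message describes.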
@psFried psFried requested a review from jgraettinger April 25, 2022 15:36
@psFried psFried mentioned this pull request Apr 25, 2022

@jgraettinger jgraettinger left a comment


LGTM

Reviewed 5 of 5 files at r1, 2 of 2 files at r2, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @psFried)


go.sum line 504 at r1 (raw file):

go.gazette.dev/core v0.89.1-0.20220212151322-e7c18c1b78cc h1:GyR68JB114+qvRCqqD4Isnm8wedXq3DTmYkdr9x+9g8=
go.gazette.dev/core v0.89.1-0.20220212151322-e7c18c1b78cc/go.mod h1:c/5g5752X3AH+g1ItiQaq9x+gmCFRmdTCFSkp5hypVQ=
go.gazette.dev/core v0.89.1-0.20220418210632-87abd3ec00c7 h1:R5KScja+3FLSeTYdOZXQF9nlAAr2lOi0esCf9g08TtI=

nit: go mod tidy


go/runtime/capture.go line 349 at r2 (raw file):

		}
	} else {
		c.Log(logrus.InfoLevel, nil, "capture transaction committing updating driver checkpoint only")

This should probably be omitted or made DEBUG. I don't think it should be INFO level, and there's already debug logging within the consumer package around transaction lifecycles (though, come to think of it, that's probably going to stderr and not the ops log...)

It's possible for a capture transaction to update the driver checkpoint
without actually capturing any new documents. This seems to be fairly
common in practice, and previously it resulted in a bunch of stats
documents with zeros for all the capture bindings. While such documents
are potentially useful to indicate that a transaction completed
successfully, the tradeoff is that the capture stats must include zero
values for all bindings instead of just the ones that had documents in
the current transaction. Otherwise, the `capture` property of the stats
document would not be serialized, which would violate the expectation
that `capture` is non-null if `kind` is `capture`.

With this commit, capture stats are only published when at least one
document has been added to a collection, and stats will only include
bindings that actually participate in the transaction. When a connector
commits a driver checkpoint only (without any documents), then we just
log that event so that we can still tell when transactions are
happening.
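
The rule above can be sketched roughly as follows. This is an illustration under assumptions, not the code in go/runtime/capture.go: `bindingStats`, `captureStats`, and the collection names are hypothetical.

```go
package main

import "fmt"

// bindingStats is a hypothetical per-binding summary.
type bindingStats struct {
	Docs int
}

// captureStats returns stats for only those bindings that captured at
// least one document in the transaction. The boolean is false when
// nothing was captured, meaning no stats document should be published.
func captureStats(docsByBinding map[string]int) (map[string]bindingStats, bool) {
	out := make(map[string]bindingStats)
	for name, n := range docsByBinding {
		if n > 0 {
			out[name] = bindingStats{Docs: n}
		}
	}
	return out, len(out) > 0
}

func main() {
	// One binding captured documents, the other did not participate.
	stats, publish := captureStats(map[string]int{"acme/orders": 3, "acme/users": 0})
	fmt.Println(len(stats), publish) // 1 true

	// A checkpoint-only transaction: nothing to publish, so only log it.
	_, publish = captureStats(map[string]int{"acme/orders": 0})
	if !publish {
		fmt.Println("driver checkpoint only, logging instead of publishing stats")
	}
}
```

Because the returned map is never empty when the boolean is true, a published stats document in this sketch always has a populated `capture` section.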
@psFried psFried force-pushed the phil/materialize-stats-txn branch from 133338a to c44cf9c Compare April 28, 2022 18:30

@psFried psFried left a comment


Reviewable status: 5 of 7 files reviewed, 1 unresolved discussion (waiting on @jgraettinger)


go/runtime/capture.go line 349 at r2 (raw file):

Previously, jgraettinger (Johnny Graettinger) wrote…

This should probably be omitted or made DEBUG. I don't think it should be INFO level, and there's already debug logging within the consumer package around transaction lifecycles (though, come to think of it, that's probably going to stderr and not the ops log...)

I was on the fence on debug vs info anyway, so I'll just drop it to debug. I just wanted to make sure we could put something in the logs to show that things worked ok even when the capture connector doesn't spit out any data. I'm speculating that we might commonly need to differentiate between having no new data and something being broken.

@psFried psFried merged commit 358a640 into master Apr 28, 2022
@psFried psFried deleted the phil/materialize-stats-txn branch April 28, 2022 19:21
@oliviamiannone oliviamiannone added the docs complete / NA No (more) doc work related to this PR label May 2, 2022