NIFI-15576 ConsumeKinesis processor should log lag metric in milliseconds#10881

Open
lkuchars wants to merge 8 commits into apache:main from
lkuchars:lkucharski/NIFI-15576-report-lag-from-consume-kinesis

Conversation

@lkuchars

Summary

NIFI-15576

Tracking

Please complete the following tracking steps prior to pull request creation.

Issue Tracking

Pull Request Tracking

  • Pull Request title starts with Apache NiFi Jira issue number, such as NIFI-00000
  • Pull Request commit message starts with Apache NiFi Jira issue number, such as NIFI-00000
  • Pull request contains commits signed with a registered key indicating Verified status

Pull Request Formatting

  • Pull Request based on current revision of the main branch
  • Pull Request refers to a feature branch with one commit containing changes

Verification

Please indicate the verification steps performed prior to pull request creation.

Build

  • Build completed using ./mvnw clean install -P contrib-check
    • JDK 21
    • JDK 25

Licensing

  • New dependencies are compatible with the Apache License 2.0 according to the License Policy
  • New dependencies are documented in applicable LICENSE and NOTICE files

Documentation

  • Documentation formatting appears as expected in rendered files

@awelless awelless left a comment

Thanks for the changes.
I have 2 main concerns:

  1. Using a minimum for millisBehindLatest seems to be more misleading than using the latest value.
  2. The meaning of the gauge and caveats of using it should be explicitly mentioned in the processor documentation.
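
To illustrate the first concern, here is a minimal sketch (not the PR's code) that compares the two aggregation choices over a sequence of millisBehindLatest values reported for one shard. The class and method names are hypothetical and exist only for this illustration.

```java
import java.util.List;

// Illustrative only: LagAggregation and its methods are hypothetical names,
// not part of NiFi or the Kinesis Client Library.
// Both methods assume a non-empty list of values.
final class LagAggregation {

    // Reports the smallest lag seen across the processed batches.
    static long minimum(final List<Long> millisBehindLatestValues) {
        long min = Long.MAX_VALUE;
        for (final long value : millisBehindLatestValues) {
            min = Math.min(min, value);
        }
        return min;
    }

    // Reports the lag from the most recently processed batch.
    static long latest(final List<Long> millisBehindLatestValues) {
        return millisBehindLatestValues.get(millisBehindLatestValues.size() - 1);
    }
}
```

For a consumer that keeps falling behind, with values 0, 5000 and 60000, the minimum stays at 0 while the latest value is 60000, so the minimum can hide growing lag.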

@lkuchars lkuchars force-pushed the lkucharski/NIFI-15576-report-lag-from-consume-kinesis branch from 6e1c3ce to 40cb736 Compare February 16, 2026 15:00

@awelless awelless left a comment

Originally, only gauge was registered.
Do we need to also write MILLIS_BEHIND_LATEST FlowFile attribute?

Comment on lines +149 to +150
@WritesAttribute(attribute = MILLIS_BEHIND_LATEST,
description = "Milliseconds behind the latest record in the shard at the time records were consumed"),

Originally the issue was about writing millis behind latest in a gauge.
Do we need to write it to FlowFile attributes too?

If this attribute will be needed downstream, no problems.
But I wouldn't put it in FlowFile attributes if it's only caused by the gauge name format.

…d add flow file attribute aws.kinesis.millis.behind.latest equal to KCL millisBehindLatestValue
@lkuchars lkuchars force-pushed the lkucharski/NIFI-15576-report-lag-from-consume-kinesis branch from 3dc789b to 23118ab Compare February 17, 2026 09:18

@pvillard31 pvillard31 left a comment

Minor stylistic comments.

@exceptionfactory exceptionfactory left a comment

Thanks for working on this addition @lkuchars.

The general approach of recording lag is useful, but the naming of millisBehindLatest is not very intuitive. The PR title indicates lag, so something that includes lag seems better, and removing millis, although perhaps less clear, would make the name less verbose. The main issue is the object of latest being unclear. This seems analogous to the currentLag feature of Kafka in some ways, so what do you think about calling it current.lag instead?

@sfc-gh-lkucharski

Thanks for the review @exceptionfactory! Yep, current.lag sounds much better. Changes committed. PTAL

@exceptionfactory exceptionfactory left a comment

Thanks @lkuchars, this looks close to completion, I noted a couple minor comments.

  final List<KinesisClientRecord> records = createTestRecords(2);

- recordBuffer.addRecords(bufferId, records, checkpointer1);
+ recordBuffer.addRecords(bufferId, records, checkpointer1, 100L);

It looks like the repeated value of 100 can be declared once and reused across all methods

}
}

static String makeCurrentLagGaugeName(final String streamName, final String shardId) {

This method should be made private and the test class should be changed to have its own expected format instead.
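
One way to follow this suggestion, sketched under assumptions: the name-building method stays private, and the test spells out the expected name format itself and checks what was actually recorded. The format string mirrors the gauge name shown in this PR's documentation table. `LagGaugeRecorder`, `recordCurrentLag`, `recordedGauges` and the check class are hypothetical stand-ins, not the processor's real code or NiFi's metrics API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the production class; a plain map replaces the real gauge registry.
final class LagGaugeRecorder {
    private final Map<String, Long> gauges = new HashMap<>();

    void recordCurrentLag(final String streamName, final String shardId, final long lagMillis) {
        gauges.put(makeCurrentLagGaugeName(streamName, shardId), lagMillis);
    }

    Map<String, Long> recordedGauges() {
        return Map.copyOf(gauges);
    }

    // Private: callers and tests never depend on this helper directly.
    private static String makeCurrentLagGaugeName(final String streamName, final String shardId) {
        return "aws.kinesis.current.lag[stream.name=\"" + streamName + "\",shard.id=\"" + shardId + "\"]";
    }
}

// Hypothetical test-side check that owns its expected format instead of reusing the helper.
final class LagGaugeRecorderCheck {
    private static final String EXPECTED_NAME_FORMAT =
            "aws.kinesis.current.lag[stream.name=\"%s\",shard.id=\"%s\"]";

    static boolean gaugeRecordedUnderExpectedName(final String streamName, final String shardId) {
        final LagGaugeRecorder recorder = new LagGaugeRecorder();
        recorder.recordCurrentLag(streamName, shardId, 100L);
        final String expectedName = String.format(EXPECTED_NAME_FORMAT, streamName, shardId);
        return Long.valueOf(100L).equals(recorder.recordedGauges().get(expectedName));
    }
}
```

Because the test builds the expected name independently, an accidental change to the production format fails the test rather than silently changing both sides.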


| Name | Type | Description |
|----------------------------------------------------------------------|---------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `aws.kinesis.current.lag[stream.name="<stream>",shard.id="<shard>"]` | Gauge | The number of milliseconds the consumer is behind the tip of the shard, as reported by the Kinesis Client Library. There is one gauge per stream/shard combination. The gauge is updated each time a batch of records is successfully processed and the session is committed. A value of `0` means the consumer is caught up. |

The description mentions a value of 0, but does not mention the value might not be recorded if it is null. Should that be mentioned?

I've added a mention
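
The null case discussed in this thread can be sketched as follows. This assumes the lag value arrives as a nullable `Long`, which the reviewer's question implies but the thread does not show. `NullableLagGauge` and its methods are hypothetical, and a plain map stands in for the real gauge registry.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: skips the gauge update when no lag value is available.
final class NullableLagGauge {
    private final Map<String, Long> gauges = new HashMap<>();

    // Returns true when the gauge was updated, false when the value was null and skipped.
    boolean update(final String gaugeName, final Long millisBehindLatest) {
        if (millisBehindLatest == null) {
            return false; // keep the previously recorded value, if any
        }
        gauges.put(gaugeName, millisBehindLatest);
        return true;
    }

    // Returns the last recorded value, or null if nothing was ever recorded.
    Long lastValue(final String gaugeName) {
        return gauges.get(gaugeName);
    }
}
```

In this sketch a recorded `0` (caught up) stays distinguishable from "never recorded", which is the distinction the documentation note is meant to make explicit.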

6 participants