Tags: Q1Liu/kafka
[LI-HOTFIX] Add Zookeeper pagination support for /brokers/topics znode (linkedin#435)
TICKET = LIKAFKA-49497
LI_DESCRIPTION = Switch from the Apache to the LI Zookeeper dependency and add a GetAllChildrenPaginated option for the /brokers/topics znode, which supports 'list topics' responses greater than 1 MB. The feature is controlled by a new li.zookeeper.pagination.enable config (default = false), with the intention that it be enabled only for critical clusters, at least until it has proven itself in battle.
EXIT_CRITERIA = When this change is accepted upstream and pulled into this repo.
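The config-gated behavior above can be sketched as follows. This is a minimal illustration, not the actual LI code: the class and method names (`TopicLister`, `listTopics`) and the page-by-page loop standing in for repeated GetAllChildrenPaginated round trips are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

class TopicLister {
    // Simulates listing znode children either in one shot (may exceed the
    // ~1 MB response limit) or page by page when pagination is enabled.
    static List<String> listTopics(List<String> allTopics, boolean paginationEnabled, int pageSize) {
        if (!paginationEnabled) {
            return new ArrayList<>(allTopics); // single getChildren response
        }
        List<String> result = new ArrayList<>();
        for (int i = 0; i < allTopics.size(); i += pageSize) {
            // each "page" would be a separate GetAllChildrenPaginated call
            result.addAll(allTopics.subList(i, Math.min(i + pageSize, allTopics.size())));
        }
        return result;
    }
}
```

Either path yields the same topic list; only the response sizes differ, which is why the feature can default to off and be enabled per cluster.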
[LI-HOTFIX] Log lastCaughtUpTime on ISR shrinkage (linkedin#432)
TICKET = N/A
EXIT_CRITERIA = When upstream also logs similar info.
[LI-HOTFIX] Add metric for total connection count (linkedin#430)
TICKET = LIKAFKA-49259
LI_DESCRIPTION = This metric shows the total client connection count on a broker. It is useful for monitoring the connection count and measuring a broker's maximum connections.
EXIT_CRITERIA = When upstream implements similar sensors.
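A total-connection-count metric of this kind reduces to a counter updated on connection lifecycle events. The sketch below is illustrative only; the class and hook names are assumptions, not the PR's actual metric wiring.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ConnectionCountMetric {
    private final AtomicInteger total = new AtomicInteger();

    // Called when the broker accepts a new client connection.
    void onConnectionCreated() { total.incrementAndGet(); }

    // Called when a client connection is closed.
    void onConnectionClosed()  { total.decrementAndGet(); }

    // Current total connection count, suitable for exposing as a gauge.
    int value()                { return total.get(); }
}
```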
[LI-HOTFIX] Support full LeaderAndISR through LiCombinedControl requests (linkedin#427)
TICKET = LIKAFKA-49560
LI_DESCRIPTION = As described in the ticket, we found that LiCombinedControl requests can become disabled for newly added brokers. A newly added broker experiences the following behavior:
1. LiCombinedControl requests are enabled when the broker starts up and hosts 0 replicas.
2. LiCombinedControl requests are disabled once some replicas are assigned to the broker.
3. LiCombinedControl requests can only be re-enabled after the broker is restarted.
The root cause of step 2 is that once the LiCombinedControl request is enabled and a full LeaderAndISR request needs to be sent, the controller tries to merge the request but does not honor the full request type. As a result, brokers receive the LeaderAndISR as part of the LiCombinedControl and treat it as an incremental LeaderAndISR instead of a full request. This PR addresses the problem by fully supporting the LeaderAndISR request type within LiCombinedControl.
EXIT_CRITERIA = The same as the LiCombinedControl request.
[LI-HOTFIX] Broker to controller request should not use cached controller node (linkedin#425)
TICKET = LIKAFKA-49304
LI_DESCRIPTION = Per the Slack discussion https://linkedin-randd.slack.com/archives/C014EKBE170/p1669951054188429?thread_ts=1667860160.319959&cid=C014EKBE170, the current implementation of broker-to-controller requests results in Unauthorized errors if the cached controller node has been migrated to a different cluster and is still alive. The impact is that any broker-to-controller requests, including AlterISR requests, are blocked, resulting in permanently inconsistent ISR info. We should handle such migrated controllers gracefully and use the correct controller instead of the previously cached, obsolete one. This PR makes the following changes:
1. Replace the "li.alter.isr.enable" config with the "li.deny.alter.isr" config, since the former is no longer needed and the latter is used to construct an integration test that reproduces the problem above.
2. Change the logic in BrokerToControllerRequestThread so that we always query the latest controller node when a request is constructed. Doing this should not degrade performance, since the ControllerNodeProvider is either a MetadataCacheControllerNodeProvider or a RaftControllerNodeProvider, both of which retrieve the controller from the local cache.
EXIT_CRITERIA = When this change is accepted upstream and pulled into this repo.
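The essence of change 2 can be sketched as resolving the controller at request-construction time instead of holding a node cached earlier. This is a simplified illustration: `RequestTargetResolver` and its `Supplier`-based provider are stand-ins for the real BrokerToControllerRequestThread and ControllerNodeProvider, not their actual APIs.

```java
import java.util.function.Supplier;

class RequestTargetResolver {
    // Backed by a local metadata cache in the real implementation, so each
    // lookup is cheap and there is no performance penalty for re-resolving.
    private final Supplier<String> controllerNodeProvider;

    RequestTargetResolver(Supplier<String> provider) {
        this.controllerNodeProvider = provider;
    }

    // Resolve the controller fresh for every request, so a controller that
    // migrated away is never targeted with a stale cached address.
    String targetForNextRequest() {
        return controllerNodeProvider.get();
    }
}
```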
Add support for request TotalTimeMs latency histograms (linkedin#423)
TICKET = LIKAFKA-47556 Establish Kafka Server SLOs
LI_DESCRIPTION = This PR adds support for request TotalTimeMs latency histograms so that we can count the number of requests in different latency ranges. The bin boundaries are configurable.
EXIT_CRITERIA = N/A
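Counting requests into configurable latency bins can be sketched as below. The class name, the example boundaries, and the "one extra overflow bin" layout are assumptions for illustration, not the PR's actual configuration keys or metric names.

```java
import java.util.concurrent.atomic.AtomicLongArray;

class LatencyBinCounter {
    private final long[] boundariesMs;    // sorted upper bounds, e.g. {10, 50, 200}
    private final AtomicLongArray counts; // one extra bin for "above last boundary"

    LatencyBinCounter(long[] boundariesMs) {
        this.boundariesMs = boundariesMs.clone();
        this.counts = new AtomicLongArray(boundariesMs.length + 1);
    }

    // Record one request's TotalTimeMs into the first bin whose upper
    // bound it does not exceed; values past the last boundary overflow
    // into the final bin.
    void record(long totalTimeMs) {
        int i = 0;
        while (i < boundariesMs.length && totalTimeMs > boundariesMs[i]) i++;
        counts.incrementAndGet(i);
    }

    long count(int bin) { return counts.get(bin); }
}
```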
[LI-HOTFIX] Reject invalid replica assignment cancellations (linkedin#422)
TICKET = https://issues.apache.org/jira/browse/KAFKA-14424
LI_DESCRIPTION = When reassigning replicas, Kafka runs a sanity check to ensure all of the target replicas are alive before allowing the reassignment request to proceed. However, for a request that cancels an ongoing reassignment, there is no such check. As a result, if the original replicas are offline, the cancellation may leave partitions without any leaders. This problem has been observed multiple times in our clusters. This PR adds a sanity check to ensure all of the original replicas are online before approving the cancellation request.
EXIT_CRITERIA = When the issue is resolved in Apache Kafka and the fix is pulled in.
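The sanity check amounts to a liveness test over the original replica set. The sketch below is illustrative; `ReassignmentGuard` and `canCancel` are hypothetical names, not the actual controller code.

```java
import java.util.List;
import java.util.Set;

class ReassignmentGuard {
    // Approve a reassignment cancellation only if every broker hosting an
    // original replica is currently alive; otherwise cancelling could roll
    // the partition back to an assignment with no eligible leader.
    static boolean canCancel(List<Integer> originalReplicas, Set<Integer> liveBrokers) {
        return liveBrokers.containsAll(originalReplicas);
    }
}
```

This mirrors the check that already existed for the forward direction (target replicas must be alive before a reassignment starts).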
[LI-FIXUP] Populate the error fields of the LiCombinedControlResponse properly (linkedin#421)
TICKET = N/A
LI_DESCRIPTION = This reverts commit a2ac1c2 (linkedin#408). The original commit was incorrect and introduced a backward-incompatible schema change in LiCombinedControlResponse.json. This PR reverts the original schema change and addresses the original problem properly:
1. When the LiCombinedControl request version is below 1, the response populates the LeaderAndIsrPartitionErrors field with the error code. When the version is 1 or greater, it populates the LeaderAndIsrTopics field.
2. When the LiCombinedControl request version is below 1, the StopReplicaPartitionErrors field of the LiCombinedControlResponse should be populated according to the StopReplicaPartitionStates of the LiCombinedControlRequest. When the version is 1 or greater, the StopReplicaPartitionErrors field should be populated according to the StopReplicaTopicStates of the LiCombinedControlRequest.
EXIT_CRITERIA = The same as the LiCombinedControlRequest feature.
Exclude the fetch requests with large fetch.max.wait.ms in SizeBucketMetrics (linkedin#418)
TICKET = LIKAFKA-47556 Establish Kafka Server SLOs
LI_DESCRIPTION = This PR excludes from SizeBucketMetrics the fetch requests whose fetch.max.wait.ms is greater than the default setting. Otherwise the P999 metrics do not reflect broker performance correctly: P999 could simply equal maxWait whenever there is not enough data to immediately satisfy fetch.min.bytes and maxWait is set to a large value (e.g., 30 seconds).
EXIT_CRITERIA = N/A
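The exclusion rule is a single threshold comparison. In the sketch below, the 500 ms value is the Kafka consumer's default fetch.max.wait.ms; the class and method names are illustrative, not the PR's actual code.

```java
class SizeBucketFilter {
    // Kafka consumer default for fetch.max.wait.ms.
    static final long DEFAULT_FETCH_MAX_WAIT_MS = 500;

    // Record a fetch request in SizeBucketMetrics only when its maxWait is
    // at or below the default; a 30 s maxWait would otherwise dominate P999
    // and reflect the client's wait budget rather than broker speed.
    static boolean shouldRecord(long requestMaxWaitMs) {
        return requestMaxWaitMs <= DEFAULT_FETCH_MAX_WAIT_MS;
    }
}
```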