gauravkghildiyal · 2025-10-12T01:56:30Z

The rate limiter for publishing inventory state to the Kubernetes API server is now configurable. The default minimum interval between ResourceSlice updates has been reduced from 5 seconds to 2 seconds, with a burst of 5. As a result, when the inventory changes (and only then), i.e. on the netlink events subscribed to at

dranet/pkg/inventory/db.go

Line 116 in 51b8a6c

if err := netlink.LinkSubscribe(nlChannel, doneCh); err != nil {

the updates are propagated to the cluster more quickly, reducing pod scheduling delays caused by stale resource data.
An on-demand scan is now triggered if a requested device is not found in the local cache. This resolves allocation failures caused by race conditions where a device is released and immediately re-claimed (by a different pod). The scan instantly updates the driver's local state, ensuring newly freed devices are immediately available for allocation, independent of the rate-limited ResourceSlice update.

This commit introduces configurable rate limiting for the inventory discovery process. Previously, a fixed 5-second rate limit with a burst of 1 could delay processing of netlink updates, leading to failures during high pod churn scenarios. Command-line flags have been added to control the inventory discovery rate limit and burst size. The default values have been adjusted to be more responsive to rapid pod lifecycle events, ensuring that device state is updated promptly.
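To illustrate the effect of the new defaults, here is a minimal token-bucket sketch using only the standard library. It is not the code from this PR: the flag names `-min-interval` and `-burst`, the `tokenBucket` type, and the `simulate` helper are hypothetical and exist only for this example. It shows that a burst of 5 lets five back-to-back events through immediately, after which one more event is admitted per interval.

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

// tokenBucket is a hypothetical, minimal rate limiter: it admits up to
// `burst` events back to back, then refills one token per `interval`.
type tokenBucket struct {
	interval time.Duration
	burst    int
	tokens   float64
	last     time.Time
}

func newTokenBucket(interval time.Duration, burst int, now time.Time) *tokenBucket {
	return &tokenBucket{interval: interval, burst: burst, tokens: float64(burst), last: now}
}

// Allow reports whether an event may proceed at time now and, if so,
// consumes one token.
func (b *tokenBucket) Allow(now time.Time) bool {
	if elapsed := now.Sub(b.last); elapsed > 0 {
		// Refill proportionally to elapsed time, capped at the burst size.
		b.tokens += float64(elapsed) / float64(b.interval)
		if b.tokens > float64(b.burst) {
			b.tokens = float64(b.burst)
		}
		b.last = now
	}
	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false
}

// simulate fires `attempts` events at the same instant and returns how many
// were admitted, then reports whether one more event is admitted after a
// single interval has passed.
func simulate(interval time.Duration, burst, attempts int) (immediate int, afterInterval bool) {
	start := time.Unix(0, 0)
	b := newTokenBucket(interval, burst, start)
	for i := 0; i < attempts; i++ {
		if b.Allow(start) {
			immediate++
		}
	}
	return immediate, b.Allow(start.Add(interval))
}

func main() {
	// Hypothetical flag names; the defaults mirror the values stated in the PR.
	interval := flag.Duration("min-interval", 2*time.Second, "minimum interval between updates")
	burst := flag.Int("burst", 5, "number of updates allowed back to back")
	flag.Parse()

	immediate, after := simulate(*interval, *burst, 10)
	fmt.Printf("admitted %d of 10 simultaneous events\n", immediate)
	fmt.Printf("admitted after one interval: %v\n", after)
}
```

With the previous behaviour described above (5-second interval, burst of 1), the same simulation would admit only one of the ten simultaneous events, which is the delay under high pod churn that this change addresses.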

Previously, if a device was released by a pod and immediately claimed by another, the inventory might not have had a chance to update. Now, if a device is not found in the local store, a new scan is triggered to ensure that newly available devices are discovered before failing. This improves the reliability of device allocation during high pod churn.
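The lookup-then-rescan behaviour described above can be sketched as follows. This is an illustration under assumptions, not the implementation in `pkg/inventory/db.go`: the `DB` and `Device` types, the `scan` callback, and `GetDevice` are hypothetical names chosen for the example.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errDeviceNotFound = errors.New("device not found")

// Device is a hypothetical stand-in for the driver's device record.
type Device struct{ Name string }

// DB is a hypothetical local inventory cache.
type DB struct {
	mu    sync.Mutex
	store map[string]Device
	scan  func() []Device // hypothetical: lists devices currently on the node
}

// refresh replaces the cache contents with the result of a fresh scan.
func (db *DB) refresh() {
	devices := db.scan()
	db.mu.Lock()
	defer db.mu.Unlock()
	db.store = make(map[string]Device, len(devices))
	for _, d := range devices {
		db.store[d.Name] = d
	}
}

func (db *DB) lookup(name string) (Device, bool) {
	db.mu.Lock()
	defer db.mu.Unlock()
	d, ok := db.store[name]
	return d, ok
}

// GetDevice returns the cached device. On a cache miss it triggers one
// on-demand scan and retries before reporting an error.
func (db *DB) GetDevice(name string) (Device, error) {
	if d, ok := db.lookup(name); ok {
		return d, nil
	}
	db.refresh()
	if d, ok := db.lookup(name); ok {
		return d, nil
	}
	return Device{}, fmt.Errorf("%w: %s", errDeviceNotFound, name)
}

func main() {
	// eth1 was just released by another pod: present on the node,
	// but not yet in the local cache.
	onNode := []Device{{Name: "eth0"}, {Name: "eth1"}}
	db := &DB{
		store: map[string]Device{"eth0": {Name: "eth0"}},
		scan:  func() []Device { return onNode },
	}

	d, err := db.GetDevice("eth1")
	fmt.Println(d.Name, err)

	_, err = db.GetDevice("eth9")
	fmt.Println(err)
}
```

In this sketch the rescan happens only on a miss, so the common path stays a plain cache lookup, and the local cache is refreshed without waiting for the rate-limited ResourceSlice publication.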

pkg/inventory/db.go

gauravkghildiyal requested review from aojea and michaelasp October 12, 2025 01:56

gauravkghildiyal changed the title ~~feat: Add configurable discovery rate limiting~~ [Not ready for review] feat: Add configurable discovery rate limiting Oct 12, 2025

gauravkghildiyal marked this pull request as draft October 12, 2025 05:35

gauravkghildiyal removed request for aojea and michaelasp October 12, 2025 05:35

gauravkghildiyal force-pushed the device-db-burst branch from 77ca76d to c8edd5e Compare October 12, 2025 19:42

gauravkghildiyal changed the title ~~[Not ready for review] feat: Add configurable discovery rate limiting~~ feat: Improve device allocation reliability in high churn scenarios Oct 12, 2025

gauravkghildiyal marked this pull request as ready for review October 12, 2025 19:55

gauravkghildiyal requested review from aojea and michaelasp October 12, 2025 19:55

gauravkghildiyal added 2 commits October 12, 2025 19:56

gauravkghildiyal force-pushed the device-db-burst branch from c8edd5e to c1adaaf Compare October 12, 2025 19:57

michaelasp reviewed Oct 13, 2025

pkg/inventory/db.go (resolved)

michaelasp approved these changes Oct 13, 2025

gauravkghildiyal merged commit d014973 into google:main Oct 13, 2025
6 checks passed

feat: Improve device allocation reliability in high churn scenarios #253

gauravkghildiyal merged 2 commits into google:main from gauravkghildiyal:device-db-burst
