plugin/status: support graceful shutdown timeout #7575
Updated the status plugin `Stop` function to be aware of a possible context timeout and attempt one last status update before shutting down. Signed-off-by: sspaink <[email protected]>
Signed-off-by: sspaink <[email protected]>
just a question on the WIP
Signed-off-by: sspaink <[email protected]>
Just a few comments/questions.
for {
	select {
	case update := <-ts.updates:
	case update := <-updates:
Before we were blocking a bunch of server responses here, which then would hold up the client-side? But this would be fine because each status update would just block and get uploaded eventually?
Now the plugin will be more eager to drop status updates if the server-side is slow to respond?
Correct, the idea was to keep the behavior consistent with how the bundle status updates now work. This change isn't required to support timeouts, so I could revert it until someone asks for it, but it seems like it could cause problems?
I decided to revert the change that made `discoCh` and `decisionLogsCh` buffered channels. Since it isn't required to resolve the graceful shutdown issue, it's probably better to leave it until it can actually be proven to be a problem.
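For readers following the thread, the "drop an update rather than block" behavior discussed above can be sketched roughly as below. This is only an illustration under stated assumptions: `sendLatest` is a hypothetical helper, not a function from the OPA codebase, and it assumes a one-slot buffered channel with a single sender.

```go
package main

import "fmt"

// sendLatest is a hypothetical helper (not part of OPA). It never blocks the
// caller: if the one-slot buffer is already full, the stale value is
// discarded and replaced with the new one. It reports whether anything was
// dropped. The sketch assumes a single sender.
func sendLatest[T any](ch chan T, v T) (dropped bool) {
	// Fast path: there is room in the buffer.
	select {
	case ch <- v:
		return false
	default:
	}
	// Buffer is full: discard the stale value if it is still there.
	select {
	case <-ch:
		dropped = true
	default:
	}
	// Try again; if another sender raced us, drop the new value instead.
	select {
	case ch <- v:
	default:
		dropped = true
	}
	return dropped
}

func main() {
	updates := make(chan string, 1)
	fmt.Println(sendLatest(updates, "status-1")) // false: buffer was empty
	fmt.Println(sendLatest(updates, "status-2")) // true: status-1 was dropped
	fmt.Println(<-updates)                       // status-2
}
```

The trade-off raised in the review is visible here: the producer is never held up by a slow consumer, but intermediate updates can be lost.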
v1/plugins/status/plugin.go
Outdated
func (p *Plugin) readBundleStatus() bool {
	var changed bool
	if len(p.pluginStatusCh) != 0 {
		p.lastPluginStatuses = <-p.pluginStatusCh
Depending on the status of `loop()`, could this (and the reads below) turn into a blocking read if `loop()` manages to squeeze in a read between `if len(p.pluginStatusCh) != 0` and here? Or are we confident this won't happen?
Should these be individual `select` blocks? E.g.
select {
case status := <-p.pluginStatusCh:
	p.lastPluginStatuses = status
	changed = true
default:
}
select {
case status := <-p.bulkBundleCh:
	p.lastBundleStatuses = status
	changed = true
default:
}
etc ...
Good idea! Updated to use `select` instead to avoid any possibility of blocking.
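As a standalone illustration of why `select` with `default` is safer than checking `len()` first, here is a minimal sketch. `tryRecv` is a hypothetical helper introduced for this example and is not part of the plugin; the channel name is borrowed from the snippet above only for readability.

```go
package main

import "fmt"

// tryRecv is a hypothetical helper: it performs a single non-blocking
// receive. Unlike "if len(ch) != 0 { v = <-ch }", the check and the read are
// one atomic step, so another goroutine cannot empty the channel in between
// and leave the caller blocked.
func tryRecv[T any](ch chan T) (v T, ok bool) {
	select {
	case v, ok = <-ch:
		return v, ok
	default:
		return v, false
	}
}

func main() {
	pluginStatusCh := make(chan string, 1)

	// Nothing buffered: returns immediately instead of blocking.
	if _, ok := tryRecv(pluginStatusCh); !ok {
		fmt.Println("no pending status")
	}

	pluginStatusCh <- "ready"
	if status, ok := tryRecv(pluginStatusCh); ok {
		fmt.Println("read status:", status)
	}
}
```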
	if !result.Equal(exp) {
		t.Fatalf("Expected: %v but got: %v", exp, result)
	}
}
Would it be possible to add a test for when the plugin needs to abort/drop an update, e.g. if the server doesn't respond in a timely manner?
I added a test, `TestSlowServer`, which starts a server that blocks, leaving time to update the bundle status twice, and then checks that the first status is dropped.
Signed-off-by: sspaink <[email protected]>
Signed-off-by: Sebastian Spaink <[email protected]>
Signed-off-by: sspaink <[email protected]>
Thanks! 👍
Why the changes in this PR are needed?
resolve: #6676
What are the changes in this PR?
Updated the status plugin `Stop` function to be aware of a possible context timeout and attempt one last status update before shutting down. This is modeled after the way the decision logs plugin shuts down. Also added a change similar to #7522, which prevents status updates from blocking: I updated the remaining channels to also be non-blocking. This was an oversight in the previous change, but it works well here because the length of the channels can be checked for the final status update.
Hoping that the added `Status Plugin stopped with statuses possibly not sent.` log message will help users find out if the `--shutdown-wait-period` needs to be extended.