newlog: get rid of Fatal's in vector.go #5292

Merged
rene merged 1 commit into lf-edge:master from europaul:remove-vector-fatals on Oct 9, 2025

@europaul europaul commented Oct 7, 2025

Contributor

Description

This addresses a promise from #5008 (comment).

Instead of calling Fatal when we fail to create the socket listener, we retry indefinitely with a backoff. This is important because if newlogd exits, the watchdog will reboot the whole system, which is not what we want for a transient failure like a missing directory.
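
The retry behavior described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `listenWithRetry` and `runDemo` are hypothetical names, the backoff is a fixed sleep, and the demo simulates the "missing directory" case by creating the directory only after a few attempts have failed.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"path/filepath"
	"time"
)

// listenWithRetry is a hypothetical helper (not the PR's createVectorSockets).
// It keeps trying to create a unix socket listener at sockPath and sleeps for
// backoff between attempts instead of terminating the process.
func listenWithRetry(sockPath string, backoff time.Duration) *net.UnixListener {
	for {
		l, err := net.ListenUnix("unix", &net.UnixAddr{Name: sockPath, Net: "unix"})
		if err == nil {
			return l
		}
		fmt.Fprintf(os.Stderr, "listen on %s failed: %v; retrying in %v\n", sockPath, err, backoff)
		time.Sleep(backoff)
	}
}

// runDemo starts the retry loop while the socket directory is still missing,
// creates the directory afterwards, and waits for the listener to appear.
func runDemo() error {
	base, err := os.MkdirTemp("", "retry-demo")
	if err != nil {
		return err
	}
	defer os.RemoveAll(base)

	dir := filepath.Join(base, "run") // does not exist yet
	sockPath := filepath.Join(dir, "demo.sock")

	ready := make(chan *net.UnixListener, 1)
	go func() { ready <- listenWithRetry(sockPath, 20*time.Millisecond) }()

	time.Sleep(50 * time.Millisecond) // let a few attempts fail first
	if err := os.Mkdir(dir, 0o755); err != nil {
		return err
	}

	select {
	case l := <-ready:
		defer l.Close()
		fmt.Println("listening on", l.Addr())
		return nil
	case <-time.After(5 * time.Second):
		return fmt.Errorf("listener was not created in time")
	}
}

func main() {
	if err := runDemo(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

The trade-off in this sketch is that a permanent failure is retried forever as well, so each failed attempt is logged to keep the problem visible.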

PR dependencies

None

How to test and validate this PR

No validation needed, the change is covered by unit and Eden tests.

Changelog notes

N/A, since this is just follow-up homework from #5008.

PR Backports

No need to backport since this code was added in EVE 15.

Checklist

  • I've provided a proper description
  • I've added the proper documentation
  • I've tested my PR on amd64 device
  • I've tested my PR on arm64 device
  • I've written the test verification instructions
  • I've set the proper labels to this PR

@europaul europaul requested a review from deitch as a code owner October 7, 2025 16:39
@europaul europaul requested a review from OhmSpectator October 7, 2025 16:40
@europaul europaul added the main-quest The fate of the project rests on this PR. Prioritise review to advance the storyline! label Oct 7, 2025
Comment thread pkg/newlog/cmd/vector.go Outdated
func createVectorSockets(sockPath string, backoffTime time.Duration) *net.UnixListener {
for {
// Create unix socket
os.Remove(sockPath) // Remove any existing socket

Contributor

Supposing everything passed until the os.Chmod() call... the loop will then continue (returning here), and at this point both unixAddr and unixListener are valid... shouldn't you close unixListener before removing the socket here?

Contributor Author

makes sense 👍
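
A rough sketch of the cleanup discussed in this thread, under stated assumptions: `tryListen` is a hypothetical single attempt (not the PR's function), the `chmod` parameter exists only so the failure path can be simulated, and the `0o666` mode is an arbitrary choice for the example.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"path/filepath"
)

// tryListen is a hypothetical single attempt at setting up the socket.
// If chmod fails after the listener was created, the listener is closed
// before returning, so a retry does not leak the file descriptor.
func tryListen(sockPath string, chmod func(string, os.FileMode) error) (*net.UnixListener, error) {
	// Remove a stale socket; a missing file is the expected case.
	if err := os.Remove(sockPath); err != nil && !errors.Is(err, os.ErrNotExist) {
		return nil, fmt.Errorf("remove stale socket: %w", err)
	}
	l, err := net.ListenUnix("unix", &net.UnixAddr{Name: sockPath, Net: "unix"})
	if err != nil {
		return nil, fmt.Errorf("listen: %w", err)
	}
	if err := chmod(sockPath, 0o666); err != nil { // mode is an assumption for this sketch
		l.Close() // release the listener before the caller retries
		return nil, fmt.Errorf("chmod: %w", err)
	}
	return l, nil
}

// runDemo forces the first attempt to fail at the chmod step and then
// checks that a second attempt with the real os.Chmod succeeds.
func runDemo() error {
	dir, err := os.MkdirTemp("", "close-demo")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)
	sockPath := filepath.Join(dir, "demo.sock")

	failing := func(string, os.FileMode) error { return errors.New("simulated chmod failure") }
	if _, err := tryListen(sockPath, failing); err == nil {
		return errors.New("expected the first attempt to fail")
	}

	l, err := tryListen(sockPath, os.Chmod)
	if err != nil {
		return err
	}
	defer l.Close()
	fmt.Println("second attempt listening on", l.Addr())
	return nil
}

func main() {
	if err := runDemo(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```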

Comment thread pkg/newlog/cmd/vector_test.go Outdated
g.Expect(err).To(gomega.BeNil())

// Wait a bit to let the function succeed
time.Sleep(2 * backoffPeriod)

Contributor

this may make the test a bit flaky ...

Contributor Author

ok, I'll make them communicate through a channel

time.Sleep(2 * backoffPeriod)

// verify that the listener was created
g.Expect(unixListener).ToNot(gomega.BeNil(), "createVectorSockets should succeed after directory creation")

Contributor

Isn't this a race condition with the go func lambda above?

Contributor Author

ok, I'll make them communicate through a channel
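
The channel-based synchronization mentioned here could look roughly like the sketch below. It is a generic illustration and not the test from this PR: `startWorker` and `waitFor` are hypothetical names, and the worker only simulates setup work with a sleep. The result travels over a channel, so the caller neither reads a variable the goroutine writes nor relies on a fixed sleep being long enough.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"time"
)

// startWorker is a hypothetical stand-in for the goroutine under test.
// It reports its result over a channel instead of writing a shared variable.
func startWorker(delay time.Duration) <-chan string {
	done := make(chan string, 1) // buffered so the worker never blocks on send
	go func() {
		time.Sleep(delay) // simulated setup work
		done <- "listener ready"
	}()
	return done
}

// waitFor blocks until the worker reports a result or the timeout expires.
func waitFor(done <-chan string, timeout time.Duration) (string, error) {
	select {
	case res := <-done:
		return res, nil
	case <-time.After(timeout):
		return "", errors.New("timed out waiting for worker")
	}
}

// runDemo exercises both the success path and the timeout path.
func runDemo() error {
	res, err := waitFor(startWorker(10*time.Millisecond), 2*time.Second)
	if err != nil {
		return err
	}
	if res != "listener ready" {
		return fmt.Errorf("unexpected result %q", res)
	}
	if _, err := waitFor(startWorker(time.Second), 10*time.Millisecond); err == nil {
		return errors.New("expected a timeout for the slow worker")
	}
	return nil
}

func main() {
	if err := runDemo(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("ok")
}
```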

@milan-zededa

Contributor

@europaul, does this also fix failures like this one: link, or is that a separate issue?

Comment thread pkg/newlog/cmd/vector.go Outdated
func createVectorSockets(sockPath string, backoffTime time.Duration) *net.UnixListener {
for {
// Create unix socket
os.Remove(sockPath) // Remove any existing socket

Contributor

nitpicking because it is already there in the old code:
Would be nice to check the error of os.Remove and if it is not ENOENT, then log a warning/error.

Contributor Author

okay

@christoph-zededa

Contributor

Thank you for adding a test!

Comment thread pkg/newlog/cmd/vector.go
}
unixListener := createVectorSockets(sockPath, 10*time.Second)
defer unixListener.Close()
defer os.Remove(sockPath)

Member

Oops, I've just noticed this. The order of the defers is strange: it will delete the socket first and only then attempt to close the listener (deferred calls run in LIFO order).

Contributor Author

true, fixed
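
To illustrate the LIFO point, here is a small sketch that records the order in which deferred calls run. The names are hypothetical and the strings merely stand in for the real cleanup steps: registering the removal first and the close second yields "close, then remove".

```go
package main

import "fmt"

// cleanupOrder records the order in which two deferred cleanup steps run.
// The strings are placeholders for unixListener.Close() and os.Remove(sockPath).
func cleanupOrder() []string {
	var order []string
	func() {
		// Registered first, so it runs last.
		defer func() { order = append(order, "remove socket file") }()
		// Registered last, so it runs first.
		defer func() { order = append(order, "close listener") }()
	}()
	return order
}

func main() {
	fmt.Println(cleanupOrder()) // prints [close listener remove socket file]
}
```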

@europaul

europaul commented Oct 8, 2025

Contributor Author

@europaul, does this also fix failures like this one: link, or is that a separate issue?

I think that's unrelated to fatals

Instead of fataling out when we fail to create the socket listener,
we retry forever with a backoff. This is important because if newlogd
exits, the watchdog will reboot the whole system, which is not what we
want for a transient failure like a missing directory.

Signed-off-by: Paul Gaiduk <[email protected]>
@europaul europaul force-pushed the remove-vector-fatals branch from aecd811 to 64c17a9 Compare October 8, 2025 21:32
Comment thread pkg/newlog/cmd/vector.go
Comment on lines +33 to +35
if err := os.Remove(sockPath); errors.Is(err, os.ErrNotExist) {
// Socket doesn't exist, this is expected
} else if err != nil {

Contributor

Suggested change
- if err := os.Remove(sockPath); errors.Is(err, os.ErrNotExist) {
- // Socket doesn't exist, this is expected
- } else if err != nil {
+ if err := os.Remove(sockPath); err != nil && !errors.Is(err, os.ErrNotExist) {

Contributor Author

yeah, I thought for a minute about this and then I thought my version is more verbose and explicit

Contributor

okay, I am also fine with this if you prefer it.

@rene rene merged commit f78514a into lf-edge:master Oct 9, 2025
47 of 56 checks passed