
Conversation

@xpivarc
Member

@xpivarc xpivarc commented Sep 16, 2025

What this PR does

Signal delivery can actually drop a signal
if multiple signals are generated while the signal is blocked.
In this case only one signal is delivered after it is unblocked. Because we do a single Wait4 per signal, we can miss a process being terminated.
Sometimes the ordering happens to be unfortunate and the virt-launcher process is missed and never cleaned up.

This causes the virt-launcher to hang around indefinitely.

Therefore this commit tries to reap as many processes as possible per signal.

This was observed upon successful migration where both source and target Pods continued to be running, see:

ps aux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
qemu           1  0.0  0.0 1679844 13328 ?       Ssl  Aug21   0:00 /usr/bin/virt-launcher-monitor --qemu-timeout 248s --name rhel9-3195 --uid 0db4e8a0-c0e4-4e95-9b1c-bd7f2a16c8d4 --namespace vm-ns-32 --kubevirt-
qemu           8  0.0  0.0      0     0 ?        Z    Aug21  13:51 [virt-launcher] <defunct>
qemu         444  0.5  0.0   4452  2688 pts/0    Ss   09:19   0:00 bash
qemu         445  0.0  0.0   7032  2688 pts/0    R+   09:19   0:00 ps aux

and logs:

{"component":"virt-launcher","level":"info","msg":"Exiting...","pos":"virt-launcher.go:513","timestamp":"2025-08-29T18:18:58.885293Z"}
{"component":"virt-launcher-monitor","level":"info","msg":"Reaped pid 19 with status 9","pos":"virt-launcher-monitor.go:202","timestamp":"2025-08-29T18:18:58.886212Z"}
{"component":"virt-launcher-monitor","level":"info","msg":"Reaped pid 18 with status 9","pos":"virt-launcher-monitor.go:202","timestamp":"2025-08-29T18:18:58.889628Z"}

Links to places where the discussion took place:

Special notes for your reviewer

It is not clear to me whether the Go runtime suspends the signal (makes it non-blocking) before the signal is sent to the channel, and therefore whether there is no race.

Release note

Bug fix: virt-launcher is properly reaped

@kubevirt-bot kubevirt-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels Sep 16, 2025

@sourcery-ai sourcery-ai bot left a comment


Hey there - I've reviewed your changes and they look great!



@fossedihelm
Contributor

Could it be a fix for #15373?

@Barakmor1
Member

Happy to see the additional logs and artifacts ended up being useful.

/lgtm

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Sep 16, 2025
Comment on lines 128 to 130
if wpid == 0 {
log.Log.Infof("No more processes to be reaped")
break
Member


nit: maybe we should add a debug log in case wpid < 0

Member


tbh, I think when syscall.Wait4 returns -1 it also returns a non-nil err - and that is already logged...
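
For reference, a quick sketch of the three Wait4 outcomes being discussed here (illustrative only, assuming it runs inside the existing SIGCHLD handler):

var wstatus syscall.WaitStatus
wpid, err := syscall.Wait4(-1, &wstatus, syscall.WNOHANG, nil)
switch {
case err != nil:
	// wpid is -1; err is e.g. syscall.ECHILD when there are no children at all
case wpid == 0:
	// children still exist, but none have exited yet - nothing to reap right now
default:
	// wpid > 0: a child was reaped and wstatus holds its exit status
}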

@vladikr
Member

vladikr commented Sep 16, 2025

It is not clear to me whether the Go runtime suspends the signal (makes it non-blocking) before the signal is sent to the channel, and therefore whether there is no race.

I think the new loop already eliminates any possible race since it reaps all the waiting child processes in one go, even if only one SIGCHLD was sent.
/approve

@kubevirt-bot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vladikr

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubevirt-bot kubevirt-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 16, 2025
@vladikr
Member

vladikr commented Sep 16, 2025

/hold

@kubevirt-commenter-bot

Required labels detected, running phase 2 presubmits:
/test pull-kubevirt-e2e-k8s-1.31-windows2016
/test pull-kubevirt-e2e-kind-1.33-vgpu
/test pull-kubevirt-e2e-kind-sriov
/test pull-kubevirt-e2e-k8s-1.33-ipv6-sig-network
/test pull-kubevirt-e2e-k8s-1.32-sig-network
/test pull-kubevirt-e2e-k8s-1.32-sig-storage
/test pull-kubevirt-e2e-k8s-1.32-sig-compute
/test pull-kubevirt-e2e-k8s-1.32-sig-operator
/test pull-kubevirt-e2e-k8s-1.33-sig-compute-serial
/test pull-kubevirt-e2e-k8s-1.33-sig-network
/test pull-kubevirt-e2e-k8s-1.33-sig-storage
/test pull-kubevirt-e2e-k8s-1.33-sig-compute
/test pull-kubevirt-e2e-k8s-1.33-sig-operator

@kubevirt-bot kubevirt-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 16, 2025
@Barakmor1
Member

It is not clear to me whether the Go runtime suspends the signal (makes it non-blocking) before the signal is sent to the channel, and therefore whether there is no race.

I'm not completely sure, but I think the race happens when multiple child processes finish before the first signal is received. Since signals are just flags and aren't queued, we only get one signal, even if more than one child has exited. That could explain the issue, though it's a rare case.

Signal delivery can actually drop a signal
if multiple signals are generated while the signal
is blocked.
In this case only one signal is delivered after it is unblocked.
Because we do a single Wait4 per signal we can miss
a process being terminated.
Sometimes the ordering happens to be unfortunate and
the virt-launcher process is missed and never cleaned up.

This causes the virt-launcher to hang around indefinitely.

Therefore this commit tries to reap as many processes as possible per
signal.

Signed-off-by: Luboslav Pivarc <[email protected]>
@kubevirt-bot kubevirt-bot removed the lgtm Indicates that a PR is ready to be merged. label Sep 17, 2025
@xpivarc
Member Author

xpivarc commented Sep 17, 2025

It is not clear to me whether the Go runtime suspends the signal (makes it non-blocking) before the signal is sent to the channel, and therefore whether there is no race.

I'm not completely sure, but I think the race happens when multiple child processes finish before the first signal is received. Since signals are just flags and aren't queued, we only get one signal, even if more than one child has exited. That could explain the issue, though it's a rare case.

Yes, this is exactly what is happening, but the thing I describe is the abstraction that Go adds. It matters whether the signal is processed first and then the sig is sent to the channel, or whether the sig can be sent to the channel before the signal is processed. In the latter case we can end the loop, a new child dies, and we miss it.

Anyway, I did a small test:

diff --git a/pkg/virt-launcher/monitor_test.go b/pkg/virt-launcher/monitor_test.go
index e2879b4963..97cfd3ecdb 100644
--- a/pkg/virt-launcher/monitor_test.go
+++ b/pkg/virt-launcher/monitor_test.go
@@ -21,7 +21,9 @@ package virtlauncher
 
 import (
 	"flag"
+	"os"
 	"os/exec"
+	"os/signal"
 	"path/filepath"
 	"strings"
 	"syscall"
@@ -29,6 +31,7 @@ import (
 
 	. "github.com/onsi/ginkgo/v2"
 	. "github.com/onsi/gomega"
+	"kubevirt.io/client-go/log"
 
 	"github.com/google/uuid"
 )
@@ -108,8 +111,58 @@ var _ = Describe("VirtLauncher", func() {
 	AfterEach(func() {
 		if processStarted {
 			stopProcess()
+			_ = cmd.Wait()
 		}
-		_ = cmd.Wait()
+
+	})
+
+	FIt("t", func() {
+		start := func() *exec.Cmd {
+			cmd := exec.Command(fakeQEMUBinary, "--uuid", uuid.New().String(), "--pidfile", filepath.Join(pidDir, "fakens_fakevmi.pid"))
+			err := cmd.Start()
+			ExpectWithOffset(1, err).ToNot(HaveOccurred(), "command failed to start")
+			return cmd
+		}
+
+		reap := make(chan bool, 10)
+		sigs := make(chan os.Signal, 10)
+		signal.Notify(sigs, syscall.SIGCHLD)
+		go func() {
+			for sig := range sigs {
+				switch sig {
+				case syscall.SIGCHLD:
+					for {
+						var wstatus syscall.WaitStatus
+						wpid, err := syscall.Wait4(-1, &wstatus, syscall.WNOHANG, nil)
+						if err != nil {
+							log.Log.Reason(err).Errorf("Failed to reap process %d", wpid)
+						}
+						if wpid == 0 {
+							log.Log.Infof("No more processes to be reaped")
+							break
+						}
+						reap <- true
+						log.Log.Infof("Reaped pid %d with status %d", wpid, int(wstatus))
+					}
+
+				default:
+					Panic()
+				}
+			}
+		}()
+
+		cmds := []*exec.Cmd{}
+		for range 10 {
+			cmds = append(cmds, start())
+		}
+
+		for i := range 10 {
+			go func(i int) {
+				Expect(cmds[i].Process.Kill()).To(Succeed())
+			}(i)
+		}
+		Eventually(reap).Should(HaveLen(10))
+
 	})
 
 	Describe("VirtLauncher", func() {

Without the loop, we missed 1-7 signals. With the loop I couldn't reproduce it, so I am pretty sure this helps, but I am still not confident that this completely fixes the issue.

exitStatus <- wstatus.ExitStatus()
for {
var wstatus syscall.WaitStatus
wpid, err := syscall.Wait4(-1, &wstatus, syscall.WNOHANG, nil)
Member Author


@vladikr maybe we can actually run a loop that will clean up children regardless of the signal

Member


on exit, right?

Member Author


All the time, as the syscall should block if nothing happens. But I think this is good enough as is.
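
A rough sketch of that always-running variant, purely for illustration (not part of this PR; it assumes the time and log packages are imported and would need care not to race with os/exec's own Wait calls):

go func() {
	for {
		var wstatus syscall.WaitStatus
		// No WNOHANG: block until some child changes state.
		wpid, err := syscall.Wait4(-1, &wstatus, 0, nil)
		if err != nil {
			if err != syscall.ECHILD {
				log.Log.Reason(err).Errorf("Failed to reap process %d", wpid)
			}
			// ECHILD means no children right now; back off briefly either way
			time.Sleep(time.Second)
			continue
		}
		log.Log.Infof("Reaped pid %d with status %d", wpid, int(wstatus))
	}
}()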

@qkfrksvl

big thanks @xpivarc

@qkfrksvl

@xpivarc I had the same issue, but it was a bit more serious. #15711

@xpivarc
Member Author

xpivarc commented Sep 23, 2025

@fossedihelm @vladikr PTAL

@fossedihelm
Contributor

/lgtm
Big thanks
@vladikr up to you to unhold :)

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Sep 23, 2025
@kubevirt-commenter-bot

Required labels detected, running phase 2 presubmits:
/test pull-kubevirt-e2e-k8s-1.31-windows2016
/test pull-kubevirt-e2e-kind-1.33-vgpu
/test pull-kubevirt-e2e-kind-sriov
/test pull-kubevirt-e2e-k8s-1.33-ipv6-sig-network
/test pull-kubevirt-e2e-k8s-1.32-sig-network
/test pull-kubevirt-e2e-k8s-1.32-sig-storage
/test pull-kubevirt-e2e-k8s-1.32-sig-compute
/test pull-kubevirt-e2e-k8s-1.32-sig-operator
/test pull-kubevirt-e2e-k8s-1.33-sig-network
/test pull-kubevirt-e2e-k8s-1.33-sig-storage
/test pull-kubevirt-e2e-k8s-1.33-sig-compute
/test pull-kubevirt-e2e-k8s-1.33-sig-operator

@vladikr
Member

vladikr commented Oct 2, 2025

/unhold

@kubevirt-bot kubevirt-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 2, 2025
@xpivarc
Member Author

xpivarc commented Oct 2, 2025

/retest-required

@kubevirt-bot
Contributor

kubevirt-bot commented Oct 2, 2025

@xpivarc: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name | Commit | Details | Required | Rerun command
pull-kubevirt-e2e-k8s-1.33-sig-compute-serial | 7fa127f | link | true | /test pull-kubevirt-e2e-k8s-1.33-sig-compute-serial
pull-kubevirt-e2e-k8s-1.33-sig-compute-arm64 | 72229df | link | false | /test pull-kubevirt-e2e-k8s-1.33-sig-compute-arm64

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@kubevirt-bot kubevirt-bot merged commit 71fc271 into kubevirt:main Oct 3, 2025
47 checks passed
@xpivarc
Member Author

xpivarc commented Oct 3, 2025

/cherry-pick release-1.6 release-1.5 release-1.4 release-1.3 release-1.2 release-1.1 release-1.0

@kubevirt-bot
Contributor

@xpivarc: new pull request created: #15816


In response to this:

/cherry-pick release-1.6 release-1.5 release-1.4 release-1.3 release-1.2 release-1.1 release-1.0

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.


Labels

approved - Indicates a PR has been approved by an approver from all required OWNERS files.
area/launcher
dco-signoff: yes - Indicates the PR's author has DCO signed all their commits.
lgtm - Indicates that a PR is ready to be merged.
release-note - Denotes a PR that will be considered when it comes time to generate release notes.
sig/compute
size/S
