Hello!
In systemd we have a test job that runs the whole systemd stack under ASan and UBSan. Recently, LLVM 22 was released and got into Fedora Rawhide and RHEL 10, where I noticed an annoying issue: some services have started intermittently reporting the following warning:
[ 47.524680] systemd-timedated[740]: ==740==WARNING: ptrace appears to be blocked (is seccomp enabled?). LeakSanitizer may hang.
[ 47.524680] systemd-timedated[740]: ==740==Child exited with signal 15.
...
[ 1555.734223] systemd-oomd[93]: ==93==WARNING: ptrace appears to be blocked (is seccomp enabled?). LeakSanitizer may hang.
[ 1555.734223] systemd-oomd[93]: ==93==Child exited with signal 15.
...
This warning comes from this change, and I know for sure that it's a false positive in all of these cases, because we disable all seccomp filters system-wide before running the systemd tests under sanitizers.
After digging a bit deeper into what's going on, the important findings are:
- all affected systemd units are of type notify or notify-reload and are 'exit-on-idle' services (meaning they'll shut themselves down after some period of inactivity)
- the child process forked in LLVM's TestPTrace() doesn't block signals and it doesn't check for SIGSYS specifically
So, when such a systemd unit becomes idle, it starts exiting on its own, unmasks SIGTERM, gets into TestPTrace(), and forks a child process that calls ptrace(). The child process is spawned in the same cgroup as the parent process, so if the service is explicitly stopped at that moment (e.g. by calling systemctl stop ... or during shutdown), systemd sends SIGTERM to both the main PID of the cgroup and all other processes in that cgroup, which kills the "ptrace" child process as well. And because the parent process only checks WIFSIGNALED(wstatus), the SIGTERM is mistaken for the expected SIGSYS, which leads to the misleading warning message.
In my opinion, the WIFSIGNALED() check should be extended with an explicit check for SIGSYS to avoid this, and the child process should probably block all signals apart from SIGSYS so external signals don't interfere with the check.
Also, this check might not work at all, again in the systemd case, because many systemd units are shipped with the following seccomp config:
[Service]
...
SystemCallArchitectures=native
SystemCallErrorNumber=EPERM
SystemCallFilter=@system-service
which means that when the ptrace() syscall is blocked, it returns EPERM instead of raising SIGSYS:
# cat test.c
#include <stdio.h>
int main(void) {
puts("Hello world");
return 0;
}
# clang -fsanitize=address test.c -o test
## Just block ptrace() - this is detected properly
# systemd-run --wait --pipe -p "SystemCallFilter=~ptrace" ./test
Running as unit: run-p2390-i2390.service
Hello world
==2392==WARNING: ptrace appears to be blocked (is seccomp enabled?). LeakSanitizer may hang.
==2392==Child exited with signal 31.
## Block ptrace() and return EPERM instead - the TestPTrace() check succeeds
# systemd-run --wait --pipe -p "SystemCallFilter=~ptrace" -p SystemCallErrorNumber=EPERM ./test
Running as unit: run-p2445-i2445.service; invocation ID: d17563a9d2ed4ce68943aff9f904402e
Hello world
==2447==LeakSanitizer has encountered a fatal error.
==2447==HINT: For debugging, try setting environment variable LSAN_OPTIONS=verbosity=1:log_threads=1
==2447==HINT: LeakSanitizer does not work under ptrace (strace, gdb, etc)
(Not sure if this is something the TestPTrace() check should support, but it's worth pointing out.)
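If the check were to cover the EPERM case too, one possible shape (again only a sketch under my own assumptions, not how LLVM does it) would be to have the probe child report a failed ptrace() call through its exit code and let the parent tell the outcomes apart. PROBE_EXIT_DENIED, enum probe_result and classify_probe() are hypothetical names, and whether a failing ptrace() in the child really means "filtered" depends on the request the probe uses.

```c
#include <signal.h>
#include <sys/wait.h>

/* Hypothetical exit code the probe child would use when ptrace() returns -1
 * (e.g. with errno == EPERM because of SystemCallErrorNumber=EPERM). */
#define PROBE_EXIT_DENIED 1

enum probe_result {
    PROBE_OK,                /* child exited 0: ptrace() call went through */
    PROBE_KILLED_BY_SIGSYS,  /* seccomp killed the child on the syscall    */
    PROBE_DENIED_WITH_ERRNO, /* syscall was filtered but returned an errno */
    PROBE_INCONCLUSIVE,      /* e.g. child killed by an unrelated SIGTERM  */
};

/* Map the probe child's wait status to one of the outcomes above. */
enum probe_result classify_probe(int wstatus) {
    if (WIFSIGNALED(wstatus))
        return WTERMSIG(wstatus) == SIGSYS ? PROBE_KILLED_BY_SIGSYS
                                           : PROBE_INCONCLUSIVE;
    if (WIFEXITED(wstatus)) {
        if (WEXITSTATUS(wstatus) == 0)
            return PROBE_OK;
        if (WEXITSTATUS(wstatus) == PROBE_EXIT_DENIED)
            return PROBE_DENIED_WITH_ERRNO;
    }
    return PROBE_INCONCLUSIVE;
}
```

Only PROBE_KILLED_BY_SIGSYS and PROBE_DENIED_WITH_ERRNO would then justify a "ptrace appears to be blocked" warning; PROBE_INCONCLUSIVE would stay silent.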
The extra and misleading warning is a bit annoying, especially when you have a post-test task that searches logs for any errors/warnings from sanitizers and fails the test if it finds anything.