Description
I'm currently investigating a SEGV during vips_shutdown after the libvips 8.10 update. I had no luck reproducing this on a test server, but managed to get a GDB backtrace from a production server.
(weserv:2337727): GLib-ERROR **: 00:01:04.340: file gthread-posix.c: line 1219 (g_system_thread_wait): error 'No such process' during 'pthread_join (pt->system_thread, NULL)'
2020/08/25 00:01:12 [alert] 2773464#0: worker process 2337749 exited on signal 11 (core dumped)
2020/08/25 00:01:12 [alert] 2773464#0: worker process 2337736 exited on signal 11 (core dumped)
2020/08/25 00:01:12 [alert] 2773464#0: worker process 2337710 exited on signal 11 (core dumped)
2020/08/25 00:01:12 [alert] 2773464#0: worker process 2337730 exited on signal 11 (core dumped)
2020/08/25 00:01:12 [alert] 2773464#0: worker process 2337750 exited on signal 11 (core dumped)
2020/08/25 00:01:12 [alert] 2773464#0: worker process 2337748 exited on signal 11 (core dumped)
2020/08/25 00:01:12 [alert] 2773464#0: worker process 2337727 exited on signal 5 (core dumped)
2020/08/25 00:01:12 [alert] 2773464#0: worker process 2337724 exited on signal 11 (core dumped)
2020/08/25 00:01:20 [alert] 2773464#0: worker process 2337725 exited on signal 11 (core dumped)
2020/08/25 00:01:20 [alert] 2773464#0: worker process 2337723 exited on signal 11 (core dumped)
2020/08/25 00:01:20 [alert] 2773464#0: worker process 2337726 exited on signal 11 (core dumped)
2020/08/25 00:01:20 [alert] 2773464#0: worker process 2337728 exited on signal 11 (core dumped)
Core was generated by `nginx: worker pr'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007fe97d1615a2 in __deallocate_stack () at /lib64/libpthread.so.0
#1 0x00007fe97d1636e3 in __pthread_timedjoin_ex () at /lib64/libpthread.so.0
#2 0x00007fe97a5778eb in g_system_thread_wait () at /lib64/libglib-2.0.so.0
#3 0x00007fe97a5591de in g_thread_join () at /lib64/libglib-2.0.so.0
#4 0x00007fe97ac53aee in vips_g_thread_join () at /lib64/libvips.so.42
#5 0x00007fe97ac57123 in vips_shutdown () at /lib64/libvips.so.42
#6 0x00007fe97b7bae9c in __run_exit_handlers () at /lib64/libc.so.6
#7 0x00007fe97b7bafd0 in on_exit () at /lib64/libc.so.6
#8 0x000000000043ad6c in ngx_worker_process_exit (cycle=cycle@entry=0x7fe969307550) at src/os/unix/ngx_process_cycle.c:1024
#9 0x000000000043adef in ngx_worker_process_cycle (cycle=0x7fe969307550, data=<optimized out>) at src/os/unix/ngx_process_cycle.c:734
#10 0x0000000000439593 in ngx_spawn_process (cycle=cycle@entry=0x7fe969307550, proc=proc@entry=0x43ada9 <ngx_worker_process_cycle>, data=data@entry=0x0, name=name@entry=0x4b46ad "worker process", respawn=respawn@entry=-4)
at src/os/unix/ngx_process.c:199
#11 0x000000000043a095 in ngx_start_worker_processes (cycle=cycle@entry=0x7fe969307550, n=6, type=type@entry=-4) at src/os/unix/ngx_process_cycle.c:349
#12 0x000000000043b99f in ngx_master_process_cycle (cycle=0x7fe969307550, cycle@entry=0x7fe96ad3fa10) at src/os/unix/ngx_process_cycle.c:234
#13 0x000000000041559d in main (argc=1, argv=<optimized out>) at src/core/nginx.c:382
Presumably this is due to commit 8840bc8, which spawns the sink_screen thread during vips_init. Prior to this commit, this thread was not spawned until a call was made to vips_sink_screen. As far as I know, we do not use vips_sink_screen (or vips_cache) within our codebase.
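One possible mechanism, which is not confirmed by this report: if vips_init runs before nginx forks its workers, the thread it spawns would exist only in the parent, because fork() copies just the calling thread. A later join on that thread from a worker would then refer to a thread that never existed in that process, which would be consistent with the 'No such process' error from pthread_join above. The sketch below only illustrates the general fork behaviour, in Python rather than against libvips; the helper name thread_alive_in_child is hypothetical, and the assumption that vips_init is called pre-fork is mine.

```python
import os
import threading
import warnings


def thread_alive_in_child() -> bool:
    """Start a background thread, fork, and report whether the child
    process still sees that thread as running.

    Hypothetical helper for illustration only; it does not use libvips.
    """
    stop = threading.Event()
    # Stand-in for a thread spawned during library initialisation.
    worker = threading.Thread(target=stop.wait, daemon=True)
    worker.start()

    read_fd, write_fd = os.pipe()
    with warnings.catch_warnings():
        # Newer Python versions warn when forking a multi-threaded process.
        warnings.simplefilter("ignore", DeprecationWarning)
        pid = os.fork()

    if pid == 0:
        # Child: only the thread that called fork() was duplicated.
        os.close(read_fd)
        os.write(write_fd, b"1" if worker.is_alive() else b"0")
        os._exit(0)

    # Parent: read the child's answer, then clean up the real thread.
    os.close(write_fd)
    answer = os.read(read_fd, 1)
    os.close(read_fd)
    os.waitpid(pid, 0)
    stop.set()
    worker.join()
    return answer == b"1"
```

Calling thread_alive_in_child() on a POSIX system should report that the thread is gone in the child, even though the parent can still join it normally. Whether this is what actually happens in the nginx workers here would need to be checked against where vips_init is called.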
As an aside, for the WebAssembly bindings I completely removed the need for this thread with commit kleisauke@9111aaf (which builds upon #1492), since it seems to be used only when the notify_fn callback is passed to vips_sink_screen. Although this callback is not used by libvips itself, removing the thread may break compatibility with vipsdisp.