
terminate called without an active exception #61


Closed · bjmt opened this issue Feb 12, 2022 · 13 comments

bjmt commented Feb 12, 2022

Hi,

Thanks for the great package. Recently someone trying out one of my packages (which uses RcppThread) had R crash whenever they invoked a function that calls RcppThread::parallelFor, with only this message: terminate called without an active exception. The error appears to be setup-specific: I have been able to replicate it on CentOS 7 (the same OS as the person who reported it to me) with gcc 11.1.0, but not on any other system I have access to. The following minimal example is enough to trigger the crash:

Rcpp::cppFunction('void func() {
  RcppThread::ProgressBar bar(20, 1);
  RcppThread::parallelFor(0, 20, [&] (int i) {
    std::this_thread::sleep_for(std::chrono::milliseconds(200));
    bar++;
  });
}', depends = "RcppThread", plugins = "cpp11")

func()

Running R under gdb, I see the following backtrace:

(gdb) bt
#0  0x00007ffff6d47387 in raise () from /lib64/libc.so.6
#1  0x00007ffff6d48a78 in abort () from /lib64/libc.so.6
#2  0x00007ffff0ba64ee in __gnu_cxx::__verbose_terminate_handler() [clone .cold] ()
   from /opt/rh7/gcc/gcc-11.1.0/lib64/libstdc++.so.6
#3  0x00007ffff0bb1856 in __cxxabiv1::__terminate(void (*)()) ()
   from /opt/rh7/gcc/gcc-11.1.0/lib64/libstdc++.so.6
#4  0x00007ffff0bb18c1 in std::terminate() () from /opt/rh7/gcc/gcc-11.1.0/lib64/libstdc++.so.6
#5  0x00007fffe37065b5 in RcppThread::ThreadPool::globalInstance() ()
   from /tmp/RtmpK0p2Ry/sourceCpp-x86_64-pc-linux-gnu-1.0.7/sourcecpp_63b02386216/sourceCpp_2.so
#6  0x00007fffe3700309 in func() ()
   from /tmp/RtmpK0p2Ry/sourceCpp-x86_64-pc-linux-gnu-1.0.7/sourcecpp_63b02386216/sourceCpp_2.so
#7  0x00007fffe37009d0 in sourceCpp_1_func ()
   from /tmp/RtmpK0p2Ry/sourceCpp-x86_64-pc-linux-gnu-1.0.7/sourcecpp_63b02386216/sourceCpp_2.so
#8  0x00007ffff7840194 in R_doDotCall (ofun=<optimized out>, nargs=<optimized out>, cargs=<optimized out>, 
    call=0x167ebf0) at dotcode.c:595
#9  0x00007ffff784078d in do_dotcall (call=0x167ebf0, op=<optimized out>, args=<optimized out>, 
    env=<optimized out>) at dotcode.c:1280
#10 0x00007ffff788fd60 in Rf_eval (e=0x167ebf0, rho=rho@entry=0x158a348) at eval.c:826
#11 0x00007ffff7894438 in Rf_evalList (el=0x15fb090, rho=rho@entry=0x158a348, call=call@entry=0x15fb0c8, 
    n=1, n@entry=0) at eval.c:3054
#12 0x00007ffff788fc09 in Rf_eval (e=0x15fb0c8, rho=rho@entry=0x158a348) at eval.c:817
#13 0x00007ffff7890feb in R_execClosure (call=call@entry=0x158a1c0, newrho=newrho@entry=0x158a348, 
    sysparent=<optimized out>, rho=rho@entry=0x6535d8, arglist=arglist@entry=0x61b5b0, op=op@entry=0x15edb18)
    at eval.c:1888
#14 0x00007ffff7891d71 in Rf_applyClosure (call=call@entry=0x158a1c0, op=op@entry=0x15edb18, 
    arglist=arglist@entry=0x61b5b0, rho=rho@entry=0x6535d8, suppliedvars=<optimized out>) at eval.c:1814
#15 0x00007ffff788f8be in Rf_eval (e=e@entry=0x158a1c0, rho=rho@entry=0x6535d8) at eval.c:846
#16 0x00007ffff78c3842 in Rf_ReplIteration (rho=rho@entry=0x6535d8, savestack=savestack@entry=0, 
    browselevel=browselevel@entry=0, state=state@entry=0x7fffffffb940) at main.c:264
#17 0x00007ffff78c3ba1 in R_ReplConsole (rho=0x6535d8, savestack=0, browselevel=0) at main.c:314
#18 0x00007ffff78c3c3f in run_Rmainloop () at main.c:1113
#19 0x00007ffff78c3c82 in Rf_mainloop () at main.c:1120
#20 0x000000000040076b in main (ac=<optimized out>, av=<optimized out>) at Rmain.c:29

Unfortunately this is a bit beyond me. I'm not sure exactly what could be going wrong.

Walking back through the commits, the last working one is 8725ca8; all later commits cause the crash (except e6ab09d, which fails for me during compilation).

If you don't have access to a similar system that can recreate the crash, please let me know if there is anything I can do to provide more information. Here is my session info:

R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblas-r0.3.3.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] compiler_4.0.2
tnagler (Owner) commented Feb 13, 2022

I created a Docker image with CentOS 7 and gcc 11.1.0, but can reproduce the error neither locally nor with GitHub Actions.

Is there any other (potentially) useful information about your machines?

Does the error also appear with an empty loop body, or when manually setting up a ThreadPool and calling ThreadPool::parallelFor()?
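
For concreteness, such a manual test might look like the following sketch (func2 is just a placeholder name; ThreadPool::join() waits for all jobs and joins the worker threads):

Rcpp::cppFunction('void func2() {
  // manually created pool instead of the implicit global one
  RcppThread::ThreadPool pool(2);
  pool.parallelFor(0, 20, [] (int i) {
    std::this_thread::sleep_for(std::chrono::milliseconds(200));
  });
  pool.join();  // wait for all jobs, then join the worker threads
}', depends = "RcppThread", plugins = "cpp11")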

bjmt (Author) commented Feb 14, 2022

Thanks for helping out. Something I probably should have mentioned is that the user who reported this was running the code on an HPC system, and I reproduced it on my institution's cluster. Perhaps that is not a coincidence.

Here's some info about the CPU of the node I'm reproducing these crashes on:

btremblay@node010:~ $ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                32
On-line CPU(s) list:   0-31
Thread(s) per core:    2
Core(s) per socket:    8
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 45
Model name:            Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
Stepping:              7
CPU MHz:               2470.092
CPU max MHz:           2800.0000
CPU min MHz:           1200.0000
BogoMIPS:              3999.93
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              20480K
NUMA node0 CPU(s):     0-7,16-23
NUMA node1 CPU(s):     8-15,24-31
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts spec_ctrl intel_stibp flush_l1

Also, I did as you suggested and tried setting up a ThreadPool. This also crashed; in fact, merely constructing the pool was enough, e.g.:

R> Rcpp::cppFunction('void func() {
R... RcppThread::ThreadPool pool(2);
R... }', plugins = "cpp11", depends = "RcppThread")
R> func()
terminate called without an active exception
Aborted
btremblay@node010:~ $ 

With optimization turned off, the backtrace shows a little more information, though I'm not sure whether it will be helpful:

#0  0x00007ffff6d47387 in raise () from /lib64/libc.so.6
#1  0x00007ffff6d48a78 in abort () from /lib64/libc.so.6
#2  0x00007ffff0ba64ee in __gnu_cxx::__verbose_terminate_handler() [clone .cold] ()
   from /opt/rh7/gcc/gcc-11.1.0/lib64/libstdc++.so.6
#3  0x00007ffff0bb1856 in __cxxabiv1::__terminate(void (*)()) ()
   from /opt/rh7/gcc/gcc-11.1.0/lib64/libstdc++.so.6
#4  0x00007ffff0bb18c1 in std::terminate() () from /opt/rh7/gcc/gcc-11.1.0/lib64/libstdc++.so.6
#5  0x00007fffe36f9259 in std::thread::~thread() ()
   from /tmp/Rtmp6LT7GK/sourceCpp-x86_64-pc-linux-gnu-1.0.7/sourcecpp_17e519ddc149/sourceCpp_2.so
#6  0x00007fffe3702f5c in void std::_Destroy<std::thread>(std::thread*) ()
   from /tmp/Rtmp6LT7GK/sourceCpp-x86_64-pc-linux-gnu-1.0.7/sourcecpp_17e519ddc149/sourceCpp_2.so
#7  0x00007fffe370276f in void std::_Destroy_aux<false>::__destroy<std::thread*>(std::thread*, std::thread*) ()
   from /tmp/Rtmp6LT7GK/sourceCpp-x86_64-pc-linux-gnu-1.0.7/sourcecpp_17e519ddc149/sourceCpp_2.so
#8  0x00007fffe3701cbd in void std::_Destroy<std::thread*>(std::thread*, std::thread*) ()
   from /tmp/Rtmp6LT7GK/sourceCpp-x86_64-pc-linux-gnu-1.0.7/sourcecpp_17e519ddc149/sourceCpp_2.so
#9  0x00007fffe3700859 in void std::_Destroy<std::thread*, std::thread>(std::thread*, std::thread*, std::allocator<std::thread>&) () from /tmp/Rtmp6LT7GK/sourceCpp-x86_64-pc-linux-gnu-1.0.7/sourcecpp_17e519ddc149/sourceCpp_2.so
#10 0x00007fffe36fee07 in std::vector<std::thread, std::allocator<std::thread> >::~vector() ()
   from /tmp/Rtmp6LT7GK/sourceCpp-x86_64-pc-linux-gnu-1.0.7/sourcecpp_17e519ddc149/sourceCpp_2.so
#11 0x00007fffe36fa584 in quickpool::ThreadPool::ThreadPool(unsigned long) ()
   from /tmp/Rtmp6LT7GK/sourceCpp-x86_64-pc-linux-gnu-1.0.7/sourcecpp_17e519ddc149/sourceCpp_2.so
#12 0x00007fffe36fabe9 in RcppThread::ThreadPool::ThreadPool(unsigned long) ()
   from /tmp/Rtmp6LT7GK/sourceCpp-x86_64-pc-linux-gnu-1.0.7/sourcecpp_17e519ddc149/sourceCpp_2.so
#13 0x00007fffe36f8bd1 in func() ()
   from /tmp/Rtmp6LT7GK/sourceCpp-x86_64-pc-linux-gnu-1.0.7/sourcecpp_17e519ddc149/sourceCpp_2.so
#14 0x00007fffe36f8c6a in sourceCpp_1_func ()
   from /tmp/Rtmp6LT7GK/sourceCpp-x86_64-pc-linux-gnu-1.0.7/sourcecpp_17e519ddc149/sourceCpp_2.so
#15 0x00007ffff7840194 in R_doDotCall (ofun=<optimized out>, nargs=<optimized out>, cargs=<optimized out>, 
    call=0x169f318) at dotcode.c:595
#16 0x00007ffff784078d in do_dotcall (call=0x169f318, op=<optimized out>, args=<optimized out>, 
    env=<optimized out>) at dotcode.c:1280
#17 0x00007ffff788fd60 in Rf_eval (e=0x169f318, rho=rho@entry=0x15c9978) at eval.c:826
#18 0x00007ffff7894438 in Rf_evalList (el=0x1618600, rho=rho@entry=0x15c9978, call=call@entry=0x1618638, n=1, 
    n@entry=0) at eval.c:3054
#19 0x00007ffff788fc09 in Rf_eval (e=0x1618638, rho=rho@entry=0x15c9978) at eval.c:817
#20 0x00007ffff7890feb in R_execClosure (call=call@entry=0x15c97f0, newrho=newrho@entry=0x15c9978, 
    sysparent=<optimized out>, rho=rho@entry=0x6535d8, arglist=arglist@entry=0x61b5b0, op=op@entry=0x1609e68)
    at eval.c:1888
#21 0x00007ffff7891d71 in Rf_applyClosure (call=call@entry=0x15c97f0, op=op@entry=0x1609e68, 
    arglist=arglist@entry=0x61b5b0, rho=rho@entry=0x6535d8, suppliedvars=<optimized out>) at eval.c:1814
#22 0x00007ffff788f8be in Rf_eval (e=e@entry=0x15c97f0, rho=rho@entry=0x6535d8) at eval.c:846
#23 0x00007ffff78c3842 in Rf_ReplIteration (rho=rho@entry=0x6535d8, savestack=savestack@entry=0, 
    browselevel=browselevel@entry=0, state=state@entry=0x7fffffffb950) at main.c:264
#24 0x00007ffff78c3ba1 in R_ReplConsole (rho=0x6535d8, savestack=0, browselevel=0) at main.c:314
#25 0x00007ffff78c3c3f in run_Rmainloop () at main.c:1113
#26 0x00007ffff78c3c82 in Rf_mainloop () at main.c:1120
#27 0x000000000040076b in main (ac=<optimized out>, av=<optimized out>) at Rmain.c:29

Thanks again for helping. Please let me know what else I can provide/do.

tnagler (Owner) commented Feb 15, 2022

OK, thanks! I have a suspicion; could you try with remotes::install_github("tnagler/RcppThread", ref = "no-affinity")?

bjmt (Author) commented Feb 15, 2022

That did it! I can confirm both the ThreadPool and parallelFor examples now work.

tnagler (Owner) commented Feb 16, 2022

OK, so the issue is that RcppThread now sets thread affinity on Linux by default: each thread in the pool gets pinned to a logical core to avoid scheduling overhead. This seems to fail on the clusters you're working on.
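
For context, pinning on Linux typically goes through pthread_setaffinity_np; a minimal sketch of the general technique (not RcppThread's actual implementation) looks like this:

#include <pthread.h>
#include <sched.h>
#include <thread>

// Pin a std::thread to a single logical core (Linux/glibc only).
// Returns 0 on success, otherwise the pthread error code.
int pin_to_core(std::thread& t, int core_id) {
    cpu_set_t cpuset;
    CPU_ZERO(&cpuset);          // start from an empty CPU mask
    CPU_SET(core_id, &cpuset);  // allow exactly one logical core
    return pthread_setaffinity_np(t.native_handle(),
                                  sizeof(cpu_set_t), &cpuset);
}

If core_id refers to a CPU the process isn't allowed to run on, the call fails with EINVAL. Notably, your second backtrace shows std::terminate() being called from std::thread::~thread(), which is what happens when a std::thread object is destroyed while still joinable.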

I can of course disable this again, but I'm still thinking about better ways to do this. Just to rule out a very simple problem, could you report the results of

Rcpp::cppFunction('void func() {
  Rcpp::Rcout << "std::thread::hardware_concurrency(): "
              << std::thread::hardware_concurrency() << std::endl;
  cpu_set_t cpuset;
  Rcpp::Rcout << "sizeof(cpuset) / 32: " << sizeof(cpuset) / 32 << std::endl;
}', depends = "RcppThread", plugins = "cpp11")()

bjmt (Author) commented Feb 17, 2022

Certainly.

std::thread::hardware_concurrency(): 32
sizeof(cpuset) / 32: 4

tnagler (Owner) commented Feb 17, 2022

OK, there's a chance that we're trying to pin threads to cores that are not actually made available to us by the OS. If that's the case, I might have a workaround. Could you install the new version from

remotes::install_github("tnagler/RcppThread", ref = "no-affinity")

and try again?
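
For reference, a guard of roughly this shape could be used (an illustrative sketch, not the actual patch): query which CPUs the process may run on via sched_getaffinity and only pin to those, skipping pinning entirely when the query fails.

#include <sched.h>
#include <vector>

// List the logical CPUs this process is actually allowed to run on.
// On HPC nodes (cgroups/cpusets), this can be a strict subset of
// std::thread::hardware_concurrency().
std::vector<int> allowed_cpus() {
    cpu_set_t cpuset;
    CPU_ZERO(&cpuset);
    std::vector<int> cpus;
    if (sched_getaffinity(0, sizeof(cpu_set_t), &cpuset) == 0) {
        for (int c = 0; c < CPU_SETSIZE; ++c)
            if (CPU_ISSET(c, &cpuset))
                cpus.push_back(c);
    }
    return cpus;  // empty on failure; the caller should then not pin at all
}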

bjmt (Author) commented Feb 18, 2022

Everything seems to be working, no crashes!

tnagler (Owner) commented Feb 18, 2022

That's great, thanks for all the help!

An updated version will go on CRAN sometime this month.

bjmt (Author) commented Feb 18, 2022

Fantastic, thanks a lot! And sorry for not mentioning the HPC/cluster detail sooner; hopefully I didn't waste too much of your time chasing down a non-existent CentOS bug.

tnagler (Owner) commented Feb 18, 2022

No worries, I'm very happy this turned out to be much easier to find and solve than I feared :) And now I also have a testing setup for CentOS, which can't hurt.

tnagler (Owner) commented Feb 18, 2022

Fixed by #62.

tnagler closed this as completed Feb 18, 2022
tnagler (Owner) commented Feb 28, 2022

Updated version is on its way to CRAN.
