
Build failure on Fedora 37 and Debian 12 in GitHub Actions #651


Closed
JCGoran opened this issue May 26, 2025 · 3 comments

Comments

@JCGoran

JCGoran commented May 26, 2025

We've recently been experiencing build failures for mpi4py on Fedora 37 and Debian 12 when using GitHub Actions containers. I am actually not sure if this is an issue with limited resources on GitHub Actions (since the issue appears to be OOM-related), or a problem with mpi4py.

The run in question can be found at: https://github.com/neuronsimulator/nrn-build-ci/actions/runs/15250855499
Since the logs are ephemeral, I am attaching them here.

The traceback on Fedora is:

/usr/lib64/openmpi/bin/mpicc -Wsign-compare -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -D_GNU_SOURCE -fPIC -fwrapv -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -D_GNU_SOURCE -fPIC -fwrapv -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -D_GNU_SOURCE -fPIC -fwrapv -fPIC -DHAVE_DLFCN_H=1 -DHAVE_DLOPEN=1 -Isrc -I/__w/nrn-build-ci/nrn-build-ci/nrn_venv/include -I/usr/include/python3.11 -c src/mpi4py/MPI.c -o build/temp.linux-x86_64-cpython-311/src/mpi4py/MPI.o
      {standard input}: Assembler messages:
      {standard input}:1197285: Warning: end of file not at end of a line; newline inserted
      {standard input}:1198024: Error: unknown pseudo-op: `.lbe2'
      {standard input}: Error: open CFI at the end of file; missing .cfi_endproc directive
      gcc: fatal error: Killed signal terminated program cc1
      compilation terminated.
      error: command '/usr/lib64/openmpi/bin/mpicc' failed with exit code 1
      [end of output]

while on Debian it is:

/usr/bin/mpicc -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -DHAVE_DLFCN_H=1 -DHAVE_DLOPEN=1 -Isrc -I/__w/nrn-build-ci/nrn-build-ci/nrn_venv/include -I/usr/include/python3.11 -c src/mpi4py/MPI.c -o build/temp.linux-x86_64-cpython-311/src/mpi4py/MPI.o
      gcc: fatal error: Killed signal terminated program cc1
      compilation terminated.
      error: command '/usr/bin/mpicc' failed with exit code 1
      [end of output]

The version of mpi4py affected is 4.0.3, though I have not tried using others. The workflow and the scripts used for running the workflow can be found here (I tried to cut down on the noise to provide an MWE).

@dalcinl
Member

dalcinl commented May 27, 2025

I don't see how this can be a problem in mpi4py. The only thing I can think of is that the builds are using the recent Cython 3.1, and the released mpi4py was never tested with that Cython version.

Can you try export CFLAGS=-O0 (or env: {CFLAGS: -O0} in YAML)?
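
For reference, the suggestion above could look roughly like this in a shell session. This is only a minimal sketch: the build_mpi4py helper name is hypothetical, and the --no-binary and --no-cache-dir options are assumptions added here to force a fresh source build. Only CFLAGS=-O0 and the 4.0.3 version come from this thread.

```shell
# Lower the optimization level for the C compile step (the suggestion from this thread).
export CFLAGS="-O0"

# Hypothetical helper, not part of mpi4py: build the affected release from source.
# --no-binary and --no-cache-dir are assumptions to avoid reusing a prebuilt or cached wheel.
build_mpi4py() {
    python3 -m pip install --no-binary mpi4py --no-cache-dir "mpi4py==4.0.3"
}

# Show the flag that the build will pick up; call build_mpi4py to start the build.
echo "CFLAGS=${CFLAGS}"
```

The build itself is kept inside a function so the snippet can be sourced without immediately starting a long compile.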

@JCGoran
Author

JCGoran commented May 27, 2025

> The only thing I can think of is that the builds are using the recent Cython 3.1, and the released mpi4py was never tested with that Cython version.

I don't think it's Cython related, as the original CI uses version 3.0.12 (see for instance the logs of the original CI here).

> Can you try export CFLAGS=-O0 (or env: {CFLAGS: -O0} in YAML)?

Apparently that works (see this run)! Very curious that it happens only on specific versions of Fedora and Debian though. In any case, feel free to close this as there is a workaround for the issue.

@elcorto

elcorto commented May 27, 2025

We hit (probably) the same issue regarding OOM while building in a Debian 12 docker container (debian:bookworm-20250520, stable release) with Python 3.11 and gcc 12.2.0. When building with pip install mpi4py==4.0.3, we see long build times (as in ~15 min) and memory consumption of up to 21 GB.

We could solve this by either

  • using a newer image (debian:trixie-20250520, testing release) with Python 3.13 and gcc 14
  • using the stable image, Python 3.11, gcc 12 and setting CFLAGS=-O3 (or -O0, -O1, -O2) as suggested above
  • using the stable image and building an older version such as 3.1.6

My guess is that the C compiler attempts to perform complex optimizations that gcc 12 can only solve using enormous amounts of memory.
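
One way to check a memory figure like the one above is to wrap the build in GNU time. A rough sketch, assuming GNU time is installed at /usr/bin/time; the peak_rss_kb helper name is made up for illustration. The value printed is the peak resident set size of the largest process in the tree (presumably cc1 for this build), not the sum over all processes.

```shell
# Hypothetical helper: print the peak resident set size (in kB) of a command,
# as reported by `/usr/bin/time -v` (assumed to be GNU time).
peak_rss_kb() {
    # time writes its report to stderr; route stderr into the pipe and
    # discard the command's own stdout, then pick out the max RSS line.
    /usr/bin/time -v "$@" 2>&1 >/dev/null \
        | awk -F': ' '/Maximum resident set size/ {print $2}'
}
```

For example, peak_rss_kb python3 -m pip install --no-binary mpi4py mpi4py==4.0.3 would print the peak in kB once the build finishes (the pip options are again an assumption, used to force a source build).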

@dalcinl dalcinl closed this as completed May 27, 2025
JCGoran added a commit to neuronsimulator/nrn-build-ci that referenced this issue May 27, 2025
The specific versions of GCC's optimizer are most likely running out of memory
when building mpi4py in the CI, see discussion in
mpi4py/mpi4py#651.