BUG: Fixes for building on Cygwin. #16246

Closed
wants to merge 32 commits into from

Conversation

DWesl
Contributor

@DWesl DWesl commented May 15, 2020

Changes needed to get numpy to compile on Cygwin, then a few changes to try to reduce the number of test failures due to extension module overlap, then marking differences I think are due to floating-point implementation or fork() failures.

Someone should probably check whether the list of ignored floating-point failures is reasonable.
Most of those have been around since 1.16.

DWesl added 4 commits May 14, 2020 20:19
Leaving it later can lead to some headers getting confused, because
they're included with one set of feature flags, but the headers they
depend on were included with a different set of feature flags.
…rebase.

Cygwin needs each DLL to have a unique address for fork() emulation to
work.  I'm hoping that calling rebase on a complete list of modules
compiled in this session, plus those installed globally, will allow
the test suite to get an accurate result for more tests.
… on cygwin.

Most likely I should be testing for newlib (the C runtime, roughly
takes the place of glibc, I think), not cygwin (the emulation layer on
Windows), but I have no idea how to do that from within python.  If
someone working on embedded systems runs into this issue, this
hopefully gives them some idea where to start.
…win.

See two commits back for an explanation of why fork() fails on cygwin.
Alternately, see the much better explanation at:
https://cygwin.com/cygwin-ug-net/highlights.html#ov-hi-process
with hints on error messages and workarounds at:
https://cygwin.com/faq.html#faq.using.fixing-fork-failures
@DWesl
Contributor Author

DWesl commented May 15, 2020

The failure in Travis seems to be because it couldn't install gfortran, which seems unrelated.

@seberg
Member

seberg commented May 15, 2020

The header change seems trivial. I do not know about the distutils change; maybe @rgommers can have a look at some point.
Some of the test changes look like serious precision issues. I am wondering if those are cases where we have an alternative implementation but do not blacklist the implementation shipped by Cygwin. The blacklists are in numpy/core/src/common/npy_config.h (not sure if/how it would apply).

@embray you seem to have looked at Cygwin-related things before. If you have time, any input is appreciated!

@mattip
Member

mattip commented May 15, 2020

Looking at npy_config.h is a good idea since it may be setting things wrong by blacklisting things that may be OK on a newer mingw. It would be nice if we could record the choices made there in __config__ somehow.

@DWesl
Contributor Author

DWesl commented May 15, 2020

To clarify, this is Cygwin, not MinGW. I can try to cross-compile or check wheels from somewhere if necessary, but nothing here is related to MinGW.

As an interesting side note, compiling latest master works just fine, and seems to have fewer failures than this branch. I suspect I did something wrong with the rebase patch.

I'll try to check the problematic floating-point functions manually and add the problematic ones to npy_config.h.

@embray
Contributor

embray commented May 18, 2020

I'm happy to take a look at this, but could you provide a brief overview of what this fixes? I haven't personally had any problems lately with compiling Numpy on Cygwin. Is this fixing any build issues? Otherwise it seems to be mostly concerned with fixing the test suite, and setting a number of tests to xfail. I admit I haven't tried running the full test suite on Cygwin in a while.

@DWesl
Contributor Author

DWesl commented May 18, 2020

A few months ago (#14787 (comment), though I think I had noted the problem earlier), numpy wasn't compiling on Cygwin unless I made Python.h the first file to be #included in several files. That problem appears to be gone now. That is the first commit.

I was getting a lot of fork() failures in tests at the time, and thought running rebase as part of f2py would help. I think it might have at first, but I can now run the tests with only a few fork() failures, without any changes to NumPy master, so that problem also appears to be gone now. That is the core of the second commit. I also changed several program names, which make sense when compiling only for Cygwin, but not for NumPy master.

Several tests failed and still fail due to what seemed like tiny floating-point differences, sometimes nonexistent differences as the branch cut tests don't always report which side of the branch they're trying to test. My first approach was to ignore them, as I saw no relevant differences. That is the third commit. Earlier comments in this thread directed me to npy_config.h to mark the failing functions, so that NumPy could use its own implementations. I have started to work through that, but get some test failures on functions I have marked, so I may not be able to get all of the functions that should be there. I will post that work soon.

There were still tests with fork() failures. I marked them as xfail because I cared more about whether NumPy worked in general than about whether my environment was amenable to subprocesses.
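As a rough illustration of marking a test as an expected failure only on Cygwin, here is a minimal sketch. The comment above refers to pytest's xfail marker; this sketch substitutes the standard library's `unittest.expectedFailure` so that it runs without extra dependencies. The helper name `xfail_on_cygwin` and the test class are hypothetical and are not taken from the PR.

```python
import subprocess
import sys
import unittest

# CPython reports sys.platform == "cygwin" when running under Cygwin.
IS_CYGWIN = sys.platform == "cygwin"


def xfail_on_cygwin(test_func):
    """Hypothetical helper: expect failure only when running under Cygwin."""
    return unittest.expectedFailure(test_func) if IS_CYGWIN else test_func


class TestSubprocess(unittest.TestCase):
    @xfail_on_cygwin
    def test_child_process_runs(self):
        # Spawning a child is the kind of step that fork() problems break.
        result = subprocess.run(
            [sys.executable, "-c", "print(6 * 7)"],
            capture_output=True,
            text=True,
            check=True,
        )
        self.assertEqual(result.stdout.strip(), "42")
```

Keeping the condition in one helper means the affected tests still run normally on other platforms, and an unexpected pass on Cygwin is reported rather than silently skipped.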

A question for the maintainers: should I try to include the modified rebase code here? If NumPy is installed through the Cygwin package manager, rebase is called automatically. The rebase code would be primarily useful for people who create many f2py modules or install NumPy on their own.
Another possibility for the latter case is to add a Cygwin section to the Install docs telling people how to do this on their own and describing when it's necessary.

@embray
Contributor

embray commented May 18, 2020

@DWesl Out of curiosity what versions of Cygwin and GCC are you currently using?

@DWesl
Contributor Author

DWesl commented May 18, 2020

@embray
Cygwin 3.1.4, GCC 9.3.0

Out of curiosity, what AVX-related errors are you seeing? I ran into some a few months ago and reported those in #14787, where other people mentioned possible solutions.

Edit: Just saw #16290, which looks like the same problem, and you already found the solution I ended up using.

DWesl added 7 commits May 18, 2020 16:59
These changes made some sense when compiling only for cygwin.  They do
not belong in NumPy master.
I don't know if the list should be the same as one of the Windows
lists, or if this should instead be a Newlib list.  I know how to make
a Cygwin list, so I did that.  It didn't always work, though.
The generic "this doesn't work" was confusing people.  Specifying what
"this" is and what happened works better.
Mark tests suddenly passing, and also document how I expect them to
fail.
This was actually making the situation worse, I think because I forgot
to include the NumPy dlls themselves.  Removing this made for many
fewer fork() failures when I tried it.

`python3 -m pip show numpy --files | grep dll`
will give a list of dlls, with at least a relative path.  This can be
done in python with subprocess (will require adding pip as a runtime
dependency), but I'm not sure that's in scope here.
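One possible way to get the same list from Python without shelling out to pip is `importlib.metadata`, which is in the standard library from Python 3.8 (newer than the 3.6/3.7 interpreters discussed in this thread). This is only a sketch; the function name `list_dlls` is hypothetical and nothing like it is part of the PR.

```python
from importlib import metadata


def list_dlls(dist_name, suffix=".dll"):
    """Return absolute paths of an installed distribution's files ending in `suffix`."""
    try:
        files = metadata.files(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return []
    # metadata.files() returns None when no file listing was recorded at install time.
    return [
        str(path.locate())
        for path in (files or [])
        if path.name.lower().endswith(suffix)
    ]


if __name__ == "__main__":
    for dll in list_dlls("numpy"):
        print(dll)
```

The resulting paths could then be passed to rebase, but whether that belongs in NumPy itself is the open question raised in the commit message.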
seberg pushed a commit that referenced this pull request May 20, 2020
I found that when building the latest master branch on Cygwin, while testing #16246, that thousands of warnings were generated at build time like:

numpy/core/src/npysort/binsearch.c.src: In function ‘binsearch_left_bool’:
numpy/core/src/npysort/binsearch.c.src:82:1: warning: visibility attribute not supported in this configuration; ignored [-Wattributes]
Granted this is just a warning, so I don't think it's a serious issue.

It seems the test that was supposed to check for __attribute__ support was not working as expected. The #pragmas only take effect if I provide a function body--they are ignored for bare declarations. I don't know if that's by intent, or if it's a GCC issue. For reference:

$ gcc --version
gcc (GCC) 7.4.0
These should primarily be on test functions.
charris pushed a commit to charris/numpy that referenced this pull request May 22, 2020
I found that when building the latest master branch on Cygwin, while
testing numpy#16246, that thousands of warnings were generated at build time
like:

numpy/core/src/npysort/binsearch.c.src: In function ‘binsearch_left_bool’:
numpy/core/src/npysort/binsearch.c.src:82:1: warning: visibility attribute not supported in this configuration; ignored [-Wattributes]
Granted this is just a warning, so I don't think it's a serious issue.

It seems the test that was supposed to check for __attribute__ support
was not working as expected. The #pragmas only take effect if I provide
a function body--they are ignored for bare declarations. I don't know if
that's by intent, or if it's a GCC issue. For reference:

$ gcc --version
gcc (GCC) 7.4.0
@charris
Member

charris commented Oct 20, 2020

Should I drop my version and leave the other?

I don't know enough to make that decision.

@DWesl
Contributor Author

DWesl commented Oct 20, 2020

On the broader question of how this is working: there were two tests that segfault on Cygwin versions between June and the present, so we may want to wait until the fix is in a release before merging into main.

I apparently copied over the preview of Cygwin 3.2.0, so I still have segfaults in the modfl tests. I'm going to wait until tomorrow to rerun those. If anyone has suggestions for making a test not run so I can avoid pytest-forked, or a way to only fork some tests, that would be much appreciated.

I also get:
FAILED numpy/core/tests/test_multiarray.py::TestHashing::test_collections_hashable
which is not new with this PR, but I have no idea how to debug it. (isinstance(np.arange(5), collections.Hashable) is True, but hash(np.arange(5)) raises, I think).
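The two checks can disagree because `isinstance(obj, Hashable)` only looks at whether the type has a `__hash__` attribute that is not `None`, while `hash(obj)` actually calls it. The sketch below shows that mechanism with hypothetical stand-in classes and `collections.abc.Hashable`; it does not use NumPy and is not a diagnosis of why the NumPy test fails on Cygwin.

```python
from collections.abc import Hashable


class RaisingHash:
    """Hypothetical stand-in: defines __hash__, but calling it always raises."""

    def __hash__(self):
        raise TypeError("unhashable type: 'RaisingHash'")


class OptedOut:
    """Hypothetical stand-in: __hash__ = None is how a class opts out of hashing."""

    __hash__ = None


def is_really_hashable(obj):
    """Return True only if hash(obj) succeeds."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


for obj in (RaisingHash(), OptedOut(), (1, 2)):
    print(type(obj).__name__, isinstance(obj, Hashable), is_really_hashable(obj))
```

Only the first class reproduces the mismatch described above: the ABC check passes, but hashing raises.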

@DWesl
Contributor Author

DWesl commented Oct 20, 2020

Should I drop my version and leave the other?

I don't know enough to make that decision.

The bulk of the changes in #17548 seems to be here:
https://github.com/numpy/numpy/pull/17548/files#diff-6fde01624b4d27874d419c0f8aeae3743c4f5e7e1c9f2b039eb453a714d0cbb1
which adds the -ffixed-xmm%d flags to compile lines with old gcc on Windows and Cygwin if targeting AVX512.

The corresponding code in this PR:
https://github.com/numpy/numpy/pull/16246/files#diff-60f61ab7a8d1910d86d9fda2261620314edcae5894d5aaa236b821c7256badd7R249
adds the -ffixed-xmm flags and -fno-asynchronous-unwind-tables to compile lines on 64-bit Cygwin.

Only one of -ffixed-xmm and -fno-asynchronous-unwind-tables is needed to fix the problem, and -ffixed-xmm is probably the right one to pick. #17548 also seems to be more tightly targeted at the problem (processors without AVX512 don't run into this, but Windows probably would), and has definitely been tested for adding the flags to NumPy on its own. I don't think the duplication is bad, but I'm not sure whether it's necessary. I have no idea if there are things setup.py deals with that bypass numpy/core/setup.py, or the other way round. I could also add comments to each point where flags are added pointing at the other place that adds flags, as well as the PRs in case the line numbers change.

@charris
Member

charris commented Oct 27, 2020

@DWesl I'll be making a release in a day or two. You should pick what needs to be added, if anything, and do it. There may or may not be another release in the 1.19 series. Have you tested with current master?

@DWesl
Contributor Author

DWesl commented Oct 27, 2020

I dropped my attempt to add the compiler flags, since #17548 probably got tested and I don't think this did.

38 failures on master with python3.6, many due to the ufunc errors fixed/xfailed here, and more due to the polynomial class being confused about unicode vs ascii printing.

15 failures on master with python3.7, mostly due to the ufunc errors fixed or xfailed in this PR. NumPy arrays are still marked as instances of Hashable, and there are a few other errors that aren't NumPy related.

7 failures on cygwin-fixes with python3.7, one due to not running git clean -xdf, one for a problem reported to Cygwin, and one that I can reproduce in python without NumPy, in addition to the isinstance(np.arange(5), collections.Hashable) failure mentioned earlier.

I can't think of anything else I know how to do, unless someone has pointers for fixing the Hashable thing.

@charris charris modified the milestones: 1.19.5 release, 1.20.0 release Dec 4, 2020
@charris charris changed the title ENH: Fixes for building on Cygwin. BUG: Fixes for building on Cygwin. Dec 31, 2020
// int*, int64* should be properly aligned on ARMv7 to avoid bus error
#if !defined(NPY_STRONG_ALIGNMENT) && defined(__arm__) && !(defined(__aarch64__) || defined(_M_ARM64))
#define NPY_STRONG_ALIGNMENT 1
#endif
Member

@seiko2plus @Qiyu8 Thoughts about this. Do you deal with this elsewhere?

Member

Yes, there's already an open PR for it; see #18065.

@charris
Member

charris commented Dec 31, 2020

Closing, I have a rebased version that I will push for examination.

@charris
Member

charris commented Dec 31, 2020

See #18102 for rebased version.

@charris charris removed the "09 - Backport-Candidate" label (PRs tagged should be backported) Jan 5, 2021
DWesl added a commit to DWesl/numpy that referenced this pull request Feb 1, 2021
This was suggested by @seiko2plus for debugging a segfault in the
tests on Cygwin:
numpy#18102 (comment)

This test passes on Cygwin, and the whole testsuite has only the
failures I expect from running on Cygwin (see numpy#18102 and numpy#16246).
@charris charris removed this from the 1.21.0 release milestone Feb 7, 2021
DWesl added a commit to DWesl/numpy that referenced this pull request Apr 14, 2021
DWesl added a commit to DWesl/numpy that referenced this pull request Jul 21, 2021
7 participants