BUG: Use 2GiB chunking code for fwrite() on mingw32/64 #23505
numpy/core/src/multiarray/convert.c
#if defined (_MSC_VER) && defined(_WIN64)
/* Workaround Win64 fwrite() bug. Issue gh-2556
#if defined(_MSC_VER) && defined(_WIN64) || \
    defined(__MINGW32__) || defined(__MINGW64__)
NIT: The precedence of the `&&` seems a bit unclear.
Does `__MINGW32__` have pointers larger than 32 bits? Because if not, the array cannot be that large anyway (even if unsigned pointers are supported and `ptrdiff_t` would be larger).
Should we maybe only use `NPY_OS_WIN64`, defined via `elif defined(_WIN64) || defined(__WIN64__) || defined(WIN64)` (not 32-bit Windows), or can we assume many compilers other than MSVC and mingw will work around system limitations?
(Just asking if we can simplify it and at the same time make it better; overall I think we could also just put it in.)
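To make the precedence point concrete, here is a minimal sketch in C. Since `&&` binds tighter than `||`, the condition in the diff groups as `(MSVC and Win64) or MinGW32 or MinGW64`. The function name `needs_chunking` and its parameters are hypothetical stand-ins for the `defined(...)` tests, not anything in NumPy.

```c
/* Sketch only: a function form of the #if condition so the grouping can be
   checked with ordinary values. All names here are hypothetical. */
int needs_chunking(int msc_ver, int win64, int mingw32, int mingw64)
{
    /* && binds tighter than ||, so these parentheses only spell out what
       the unparenthesized condition already means. */
    return (msc_ver && win64) || mingw32 || mingw64;
}
```

With this grouping, a MinGW flag alone enables the branch regardless of pointer width, whereas MSVC enables it only on 64-bit builds.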
@cbrt64 do you have time for a quick follow-up on this? Also happy if you just think we should put it in as is.
Thanks for your patience; I'm just a hobbyist poking at random bug reports in my spare time, learning as I go.
This patch breaks 32-bit. You may have seen it was accepted downstream; well, I installed the new mingw32 version of numpy and tested it, and found it failed to write an array of bytes of any size, due to `maxsize` overflowing `__INT_MAX__`; see also here. The mingw32 numpy previous to my patch works fine.
> Does `__MINGW32__` have pointers larger than 32bit?

No; after grokking this subject, I don't believe there's a way for any 32-bit Windows process to access more than 4 contiguous GiB at any one time, even using AWE. That doesn't by itself rule out chunking on 32-bit, though; either rebuilding with chunks strictly smaller than 2GiB, or changing `npy_intp maxsize` etc. to `npy_uintp`, addresses the overflow mentioned before (I tested `npy_uintp` on mingw32, and smaller chunks on mingw32/64, clang32/64, and ucrt64).
Incidentally, I found mention from years ago about possibly decreasing write chunk size to e.g. 256MiB, so maybe that's an option? Keeping in mind the possibility of unnecessary file fragmentation.
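The overflow described above can be illustrated with a small sketch. It assumes the chunk limit is exactly 2 GiB (2^31 bytes) and that the signed size type is 32 bits wide on a 32-bit build; the helper names are hypothetical.

```c
#include <stdint.h>

/* 2 GiB held in a 64-bit type so the constant itself cannot overflow. */
#define TWO_GIB ((uint64_t)1 << 31)

/* Fits a signed 32-bit integer (assumed width of npy_intp on 32-bit)? */
int fits_int32(uint64_t v) { return v <= (uint64_t)INT32_MAX; }

/* Fits an unsigned 32-bit integer (assumed width of size_t on 32-bit)? */
int fits_uint32(uint64_t v) { return v <= (uint64_t)UINT32_MAX; }
```

Under those assumptions, 2 GiB is one past the largest signed 32-bit value, which is why both an unsigned type and a strictly smaller chunk (2 GiB minus one, or 256 MiB) avoid the problem.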
> Should we maybe only use `NPY_OS_WIN64`

The fwrite in (at least amd64) msvcrt.dll is the one that's broken, but MSYS2/Cygwin's as well as UCRT's fwrites work as expected; though they're not hurt by breaking up writes into smaller chunks. So the question is, which systems should this code apply to? Some options I can see for write chunking on Windows are:
- across the board (`NPY_OS_CYGWIN || NPY_OS_WIN32 || NPY_OS_WIN64`), with either `npy_uintp`, or chunks < 2GiB to accommodate 32-bit
- broken mingw64/msvcrt only (`NPY_OS_WIN64 && !_UCRT`) (@Biswa96 advised not to depend on `__MSVCRT__`)
- 64-bit only, including UCRT, but not Cygwin (`NPY_OS_WIN64`)
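As a rough sketch, the three options could be written as alternative preprocessor conditions. Everything named `CHUNK_POLICY`, `USE_FWRITE_CHUNKING`, and `fwrite_chunking_enabled` is hypothetical, and raw compiler macros (`_WIN32`, `_WIN64`, `__CYGWIN__`) stand in for the `NPY_OS_*` macros so the snippet compiles on its own. It also assumes `_UCRT` is provided by the CRT headers rather than by the compiler, which is why a standard header is included first.

```c
#include <stdio.h>  /* assumption: CRT headers define _UCRT where relevant */

/* Hypothetical selector: 1, 2 or 3, matching the order of the list above. */
#ifndef CHUNK_POLICY
#define CHUNK_POLICY 3
#endif

#if CHUNK_POLICY == 1
/* Option 1: every Windows-like target, 32-bit included. */
#  if defined(__CYGWIN__) || defined(_WIN32) || defined(_WIN64)
#    define USE_FWRITE_CHUNKING 1
#  endif
#elif CHUNK_POLICY == 2
/* Option 2: 64-bit Windows that is not using UCRT. */
#  if defined(_WIN64) && !defined(_UCRT)
#    define USE_FWRITE_CHUNKING 1
#  endif
#else
/* Option 3: any 64-bit Windows target, UCRT included. */
#  if defined(_WIN64)
#    define USE_FWRITE_CHUNKING 1
#  endif
#endif

#ifndef USE_FWRITE_CHUNKING
#define USE_FWRITE_CHUNKING 0
#endif

/* Reports which way the condition resolved for this build. */
int fwrite_chunking_enabled(void) { return USE_FWRITE_CHUNKING; }
```

Option 1 is the only one that reaches 32-bit builds, so it is also the only one that needs the unsigned type or the smaller chunk size.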
Should the chunks be made smaller in any case?
I'm inclined to let the maintainers decide what specifically to do regarding this PR. I'm OK with either force-pushing an update implementing your recommendations, or someone taking over.
@cgohlke, in case you have any input.
Other things:
- AFAICT `NPY_OS_WIN64` and `NPY_OS_MINGW` in npy_os.h aren't ever defined, since `_WIN32` is always defined whenever `_WIN64` or `__MINGW32__` are, respectively.
- Rearrange exit condition on incomplete write?

> NIT: The precedence of the `&&` seems a bit unclear.

I wondered about that too; I had merely borrowed it from npy_common.h.
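The observation about npy_os.h comes down to branch order in an `#if`/`#elif` chain: only the first matching branch is taken. The sketch below assumes the chain tests the 32-bit macro first, as the comment describes. The `DEMO_*` names are hypothetical stand-ins, and both are defined here to mimic a 64-bit Windows compiler, which defines `_WIN32` as well as `_WIN64`.

```c
/* Hypothetical stand-ins for _WIN32 and _WIN64 on a 64-bit Windows build. */
#define DEMO_WIN32 1
#define DEMO_WIN64 1

/* Order as described: the 32-bit test comes first, so the 64-bit branch
   can never be reached when both macros are defined. */
#if defined(DEMO_WIN32)
#  define DEMO_OS "WIN32"
#elif defined(DEMO_WIN64)
#  define DEMO_OS "WIN64"
#else
#  define DEMO_OS "UNKNOWN"
#endif

/* Most specific test first: the 64-bit branch is now reachable. */
#if defined(DEMO_WIN64)
#  define DEMO_OS_REORDERED "WIN64"
#elif defined(DEMO_WIN32)
#  define DEMO_OS_REORDERED "WIN32"
#else
#  define DEMO_OS_REORDERED "UNKNOWN"
#endif

const char *demo_os(void) { return DEMO_OS; }
const char *demo_os_reordered(void) { return DEMO_OS_REORDERED; }
```

Testing the more specific macro first is one way to make the 64-bit branch reachable; whether npy_os.h should be changed that way is a separate decision.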
Thanks, all of these choices seem good, so I expect we shouldn't overthink it :).
My first thought is to just change the calculation to the unsigned `size_t` throughout. That looks fully safe, and we use `size_t` anyway. (`uintp` is in practice the same and maybe exactly the same in the future; I just slightly prefer `size_t` in this context.)
If there is no super canonical way to find that MSVCRT is used, maybe just keep it simple with `NPY_OS_WIN64`? And add a comment that chunking is actually only necessary if the `msvcrt` is used, which e.g. Cygwin does not?
EDIT:
#2931 (comment) exit condition on incomplete write?
Oh, it sounds like that is a small forever bug. Might as well fix it.
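A minimal sketch of the approach discussed here, with all sizes kept in `size_t` and an explicit exit on an incomplete write, might look like the following. `chunked_fwrite` and its parameters are hypothetical; this is not the code that was pushed to the PR.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes `count` items of `elsize` bytes in chunks of
   at most `max_chunk_bytes` bytes. Returns the number of items written. */
size_t chunked_fwrite(const void *data, size_t elsize, size_t count,
                      size_t max_chunk_bytes, FILE *fp)
{
    const char *p = (const char *)data;
    size_t written = 0;
    size_t max_items;

    if (elsize == 0 || max_chunk_bytes < elsize) {
        return 0;  /* no whole item fits into one chunk */
    }
    max_items = max_chunk_bytes / elsize;

    while (written < count) {
        size_t chunk = count - written;
        size_t n;

        if (chunk > max_items) {
            chunk = max_items;
        }
        n = fwrite(p + written * elsize, elsize, chunk, fp);
        written += n;
        if (n < chunk) {
            break;  /* incomplete write: stop and report the partial count */
        }
    }
    return written;
}
```

A caller would compare the return value with `count` to detect a failed or partial write. Passing a limit such as `((size_t)1 << 31) - 1` keeps every chunk strictly below 2 GiB.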
@cbrt64 I just pushed that version since we should get something in. Could you review it (and maybe also try it)? (Not that it matters per se, but 1.25.0 is approaching and it may be nice to include this.)
This is a bit too broad because msys UCRT runtime may actually not need it. Changes the type in the loop to `size_t`. This is not necessary but the current constants are buggy without it if the branch is accidentally used on 32bit.
Thanks @seberg
@cbrt64 this is not marked for backport right now. If you need that (there may be another 1.24 release), please give a ping.
Sorry again for the delay on my end. Your fix is decidedly simpler than the one I attempted, which started with refactoring to a
For future reference: after local testing against 1.24 on all five of MSYS2's MINGW flavors, I can confirm the code itself works, either:
However, the current
As for backporting: The downstream MSYS2 patch will do the job until it's not needed. Others it might affect: users of MSVC who link against MSVCRT (if that's a thing anymore), and some Linux distros having cross-compiled MINGW packages. But I have no experience with either of those.
Addresses #2256, and fixes msys2/MINGW-packages#15856; tested on Win10 19045.2604 (22H2, x86_64). Includes a typo fix and some added detail to the "orphaned" suite test.
I included the check for `__MINGW32__` for consistency with the corresponding check in npy_common.h. I currently have no way of testing this fix on x86, but I did consider the possibility of 32-bit software - running on some editions of Windows - being able to allocate a full 4GiB or more, and who knows what MS's x86 `fwrite()` would do with that.

NumPy has more detailed guidelines than projects I've contributed to before; please tell me if I missed anything.