BUG: Use 2GiB chunking code for fwrite() on mingw32/64 #23505


Merged: 2 commits merged into numpy:main on May 17, 2023

Conversation

cbrt64 (Contributor) commented Mar 31, 2023

Addresses #2256, and fixes msys2/MINGW-packages#15856; tested on Win10 19045.2604 (22H2, x86_64). Includes a typo fix and some added detail to the "orphaned" suite test.

I included the check for __MINGW32__ for consistency with the corresponding check in npy_common.h. I currently have no way of testing this fix on x86, but I did consider the possibility that 32-bit software running on some editions of Windows could allocate a full 4 GiB or more, and who knows what MS's x86 fwrite() would do with that.

NumPy has more detailed guidelines than the projects I've contributed to before; please tell me if I missed anything.

The reviewed diff hunk:

```diff
-#if defined (_MSC_VER) && defined(_WIN64)
 /* Workaround Win64 fwrite() bug. Issue gh-2556 */
+#if defined(_MSC_VER) && defined(_WIN64) || \
+    defined(__MINGW32__) || defined(__MINGW64__)
```
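For context, the chunking this guard enables works roughly like the following standalone sketch (not numpy's verbatim source: `chunked_fwrite`, `data`, `itemsize`, and `nelems` are illustrative stand-ins, and the sketch already uses the unsigned types and short-write exit discussed later in the thread):

```c
#include <stdio.h>

/* Write nelems items of itemsize bytes each (itemsize > 0 assumed),
 * splitting the work into chunks of at most 2 GiB so a single fwrite()
 * call never receives a size the broken 64-bit msvcrt mishandles. */
static size_t
chunked_fwrite(const char *data, size_t itemsize, size_t nelems, FILE *fp)
{
    size_t maxsize = 2147483648ULL / itemsize;  /* items per 2 GiB chunk */
    size_t written = 0;

    while (nelems > 0) {
        size_t chunksize = nelems > maxsize ? maxsize : nelems;
        size_t n = fwrite(data + written * itemsize, itemsize, chunksize, fp);
        written += n;
        if (n < chunksize) {
            break;  /* incomplete write: stop and let the caller see it */
        }
        nelems -= chunksize;
    }
    return written;  /* total items written; less than nelems on error */
}
```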
seberg (Member) commented:

NIT: The precedence of the && seems a bit unclear.

Does __MINGW32__ have pointers larger than 32 bits? Because if not, the array cannot be that large anyway (even if unsigned pointers were supported and ptrdiff_t were larger).

Should we maybe just use NPY_OS_WIN64, which is defined via `elif defined(_WIN64) || defined(__WIN64__) || defined(WIN64)` (i.e. not 32-bit Windows)? Or can we assume that many compilers other than MSVC and mingw work around system limitations?

(Just asking if we can simplify it and at the same time make it better; overall I think we could also just put it in.)

seberg (Member) commented:

@cbrt64 do you have time for a quick follow-up on this? Also happy if you just think we should put it in as is.

cbrt64 (Contributor, Author) commented:

Thanks for your patience, I'm just a hobbyist poking at random bug reports in my spare time, learning as I go.

This patch breaks 32-bit. You may have seen it was accepted downstream; well, I installed the new mingw32 version of numpy and tested it, and found it failed to write an array of bytes of any size, due to maxsize overflowing __INT_MAX__; see also here. The mingw32 numpy previous to my patch works fine.
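To see the overflow concretely (a hypothetical standalone demo, not numpy code): on a 32-bit build npy_intp is a 32-bit signed integer, so the 2 GiB chunk constant cannot be represented in it.

```c
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    /* npy_intp is 32-bit signed on a 32-bit build. 2147483648 is
     * INT32_MAX + 1, so narrowing it is implementation-defined and
     * wraps to a negative value on typical two's-complement targets,
     * making every subsequent chunk-size comparison misbehave. */
    int32_t maxsize = (int32_t)(2147483648LL / 1);  /* itemsize == 1 */
    printf("maxsize = %ld\n", (long)maxsize);       /* typically -2147483648 */
    return 0;
}
```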

> Does __MINGW32__ have pointers larger than 32 bits?

No; after grokking this subject, I don't believe there's a way for any 32-bit Windows process to access > 4 contiguous GiB at any one time, even using AWE. That doesn't by itself rule out chunking on 32-bit, though; rebuilding with either chunks strictly smaller than 2GiB, or changing npy_intp maxsize etc. to npy_uintp, addresses the overflow mentioned before (I tested npy_uintp on mingw32, and smaller chunks on mingw32/64, clang32/64, and ucrt64).

Incidentally, I found mention from years ago about possibly decreasing the write chunk size to e.g. 256 MiB, so maybe that's an option, keeping in mind the possibility of unnecessary file fragmentation.

> Should we maybe only use NPY_OS_WIN64

The fwrite in (at least amd64) msvcrt.dll is the one that's broken; MSYS2/Cygwin's as well as UCRT's fwrite work as expected, though neither is hurt by breaking writes up into smaller chunks. So the question is: which systems should this code apply to? Some options I can see for write chunking on Windows (see the guard sketches after the list) are:

  1. across the board (NPY_OS_CYGWIN || NPY_OS_WIN32 || NPY_OS_WIN64), with either npy_uintp, or chunks < 2GiB to accommodate 32-bit
  2. broken mingw64/msvcrt only (NPY_OS_WIN64 && !_UCRT) (@Biswa96 advised not to depend on __MSVCRT__)
  3. 64-bit only, including UCRT, but not Cygwin (NPY_OS_WIN64)
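Spelled out as preprocessor guards, the options would look roughly like this (a hypothetical sketch; `NPY_CHUNK_FWRITE` is an illustrative macro name, and `_UCRT` is the UCRT marker mentioned in option 2):

```c
/* Option 2: chunk only for the broken 64-bit msvcrt (UCRT is fine). */
#if defined(NPY_OS_WIN64) && !defined(_UCRT)
#define NPY_CHUNK_FWRITE 1
#endif

/*
 * Option 1 would instead test
 *     defined(NPY_OS_CYGWIN) || defined(NPY_OS_WIN32) || defined(NPY_OS_WIN64)
 * (paired with npy_uintp or sub-2GiB chunks for 32-bit), and option 3
 * simply
 *     defined(NPY_OS_WIN64)
 */
```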

Should the chunks be made smaller in any case?

I'm inclined to let the maintainers decide what specifically to do regarding this PR. I'm OK with either force-pushing an update implementing your recommendations, or someone taking over.

@cgohlke, in case you have any input.

cbrt64 (Contributor, Author) commented:

Other things:

AFAICT NPY_OS_WIN64 and NPY_OS_MINGW in npy_os.h aren't ever defined, since _WIN32 is always defined whenever _WIN64 or __MINGW32__ are, respectively.
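For reference, the #elif chain in npy_os.h is ordered roughly like this (a trimmed paraphrase, not the full header):

```c
/* Trimmed paraphrase of npy_os.h. Compilers targeting 64-bit Windows
 * define _WIN32 as well as _WIN64, and mingw also defines _WIN32, so
 * the first Windows branch always wins and the later NPY_OS_WIN64 and
 * NPY_OS_MINGW branches are unreachable. */
#if defined(_WIN32) || defined(__WIN32__) || defined(WIN32)
    #define NPY_OS_WIN32    /* taken on both 32- and 64-bit Windows */
#elif defined(_WIN64) || defined(__WIN64__) || defined(WIN64)
    #define NPY_OS_WIN64    /* dead code: _WIN64 implies _WIN32 */
#else
    #define NPY_OS_UNKNOWN
#endif
```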

Rearrange exit condition on incomplete write?

> NIT: The precedence of the && seems a bit unclear.

I wondered about that too; I had merely borrowed it from npy_common.h.

seberg (Member) commented Apr 19, 2023:

Thanks, all of these choices seem good, so I expect we shouldn't overthink it :).

My first thought is to just change the calculation to the unsigned size_t throughout. That looks fully safe, and we use size_t anyway. (uintp is in practice the same, and maybe exactly the same in the future; I just slightly prefer size_t in this context.)

If there is no super canonical way to detect that MSVCRT is used, maybe just keep it simple with NPY_OS_WIN64? And add a comment that chunking is actually only necessary when msvcrt is used, which e.g. Cygwin does not use?

EDIT:

> #2931 (comment): exit condition on incomplete write?

Oh, it sounds like that is a small forever bug. Might as well fix it.
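Concretely, that suggestion might look something like this (a hypothetical sketch, not the committed patch; `fwrite_chunk_items` is an illustrative helper):

```c
#include <stddef.h>

/* Items per fwrite() chunk. Chunking is only strictly needed for the
 * legacy 64-bit msvcrt fwrite(), but it is harmless on UCRT/Cygwin,
 * so the guard stays simple (NPY_OS_WIN64) rather than trying to
 * detect which CRT is actually linked. */
static size_t
fwrite_chunk_items(size_t itemsize)
{
#if defined(NPY_OS_WIN64)
    /* Unsigned size_t throughout: the 2147483648 constant would
     * overflow a signed 32-bit npy_intp if this branch were ever
     * compiled on a 32-bit target. */
    return (size_t)2147483648ULL / itemsize;
#else
    return (size_t)-1;  /* effectively unlimited: no chunking needed */
#endif
}
```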

seberg (Member) commented:

@cbrt64 I just pushed that version since we should get something in. Could you review it (and maybe also try it)? (Not that it matters per se, but 1.25.0 is approaching and it may be nice to include this.)

seberg added this to the 1.25.0 release milestone Apr 19, 2023
Commit message:

> This is a bit too broad because the msys UCRT runtime may actually not
> need it. Changes the type in the loop to `size_t`. This is not necessary,
> but the current constants are buggy without it if the branch is
> accidentally used on 32-bit.
mattip merged commit d9b38d6 into numpy:main on May 17, 2023
mattip (Member) commented May 17, 2023

Thanks @seberg

seberg (Member) commented May 17, 2023

@cbrt64 this is not marked for backport right now. If you need that (there may be another 1.24 release), please give a ping.

cbrt64 (Contributor, Author) commented May 25, 2023

Sorry again for the delay on my end. Your fix is decidedly simpler than the one I attempted, which started with refactoring to a for loop so I could decipher how it even worked.

For future reference: after local testing against 1.24 on all five of MSYS2's MINGW flavors, I can confirm the code itself works, either:

  1. as is, with your size_t modification, or
  2. as it was before this PR, with npy_intp, but with the constant maximum chunk byte count strictly <= __INT_MAX__ when used on 32-bit systems (if anyone ever wanted to do that).

However, the current #ifdef will never actually allow the chunk code to be compiled, since, as I pointed out above, NPY_OS_WIN64 never gets defined. See #23806.

As for backporting: the downstream MSYS2 patch will do the job until it's no longer needed. Others it might affect: users of MSVC who link against MSVCRT (if that's still a thing), and some Linux distros that ship cross-compiled MINGW packages. But I have no experience with either of those.

cbrt64 deleted the fix-2256 branch January 13, 2025 21:47
Successfully merging this pull request may close these issues.

[python-tifffile] unable to write bigtiff file (hangs at 4GB)