BLD: Make universal2 wheels #20787
Conversation
I think this works now and is ready for a first pass of review. Depends on #20747.
Is there demand for projects to supply universal2 wheels?
There is some - the main (only?) argument is that it is useful for py2app users and other such bundling tools that want to provide downloadable installers which work on any macOS machine (so I guess the target audience there is non-technical users).
Hmm. The py2app documentation does not make it clear how they would bundle a universal2 wheel, and opens with the sentence: […]
It seems the support is in place; at least the issue about it was closed saying it is now supported. I opened ronaldoussoren/py2app#399 to try to get some clarification.
ping on this. I think the main issue is that numpy has already uploaded universal2 wheels and that other projects in the scientific python ecosystem depending on numpy are also providing them. If I understand correctly, universal2 wheels are sort of an all-or-nothing kind of thing. If numpy doesn't provide a universal2 wheel, then it is pointless for packages using numpy as a dependency (e.g. pandas) to create universal2 wheels, because those packages will need numpy, which doesn't provide a universal2 wheel that runs on both architectures.
Yes, that is very much true. The use case for universal2 wheels […]
I tried to ask for more opinions on the scientific python forum.
@lithomas1 which projects? Maybe they could shed more light on why they are providing universal2 wheels?
Universal wheels are needed by app developers that package apps using only GitHub CI. E.g. we at Kivy (a Python GUI framework) provide our libraries as universal wheels, as well as a packaging mechanism to bundle it all into a dmg. If there are no universal wheels or binaries, then you need an M1 machine to build the app for M1 and an x86 machine to build it for x86. If there are universal wheels, then you can build both into one dmg supporting both architectures. The problem with the former is that currently there's no CI that supports running on M1; e.g. GitHub's issue about this doesn't seem to be going anywhere. So your options are to use AWS, which has a 24 hr minimum usage/payment, or something like Scaleway. As an OSS dev making apps for other users, it would be nice if you could just build your apps on the existing GitHub CI for both archs.
@matham I'm not quite clear on what you are saying. Do you need a universal wheel to make dmgs? We currently build the M1 wheels on Azure using a cross compiler (I think). The downside is that we cannot test the result.
I'll try to explain the process and its issues a little more, as I understand them. I have not gone through the process of building an app with our universal dmg myself yet, so I could be slightly wrong.

Imagine we're on x86 and want to make a dmg app for x86/arm64. What we do is download a universal Python; using pip running on the x86 part of the universal binary, we install the Kivy universal wheels, then we create a bundle from it using Platypus, and finally we package it up as a dmg. An app developer then downloads that dmg, installs it on their x86 machine, installs their universal wheels/pure-Python pip packages (again using the x86 side of the binaries), and packages it up as a dmg for their users. An end user then downloads the "universal dmg": if they are on x86 it runs the x86 part of the binaries, and similarly for arm64.

If there are no universal wheels, it's not clear to me how this process would work, because when you're running pip on x86 Python it'll install x86 wheels, so if you package it up it'll only work on x86. The only workaround is to build a further tool, like python-for-android, that cross-installs all the wheels and compiles for the target platform rather than the one you're running on, but that would require a lot of work. Does that clarify it?
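To make that mechanism concrete: on a universal2 Python build, the slice the interpreter runs under determines which architecture pip targets. A minimal sketch, assuming a universal2 `python3` on an Apple Silicon Mac and using the macOS `arch` tool:

```sh
# Run the x86_64 slice of a universal2 Python and see which
# architecture it (and therefore pip) reports.
arch -x86_64 python3 -c "import platform; print(platform.machine())"
# x86_64

# The same interpreter under its arm64 slice reports arm64 instead,
# so pip would resolve arm64 wheels there.
arch -arm64 python3 -c "import platform; print(platform.machine())"
# arm64
```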
Hmm, perhaps pip already supports it with the […]
As @matham said, we (as developers) are absolutely comfortable creating our virtualenv, installing the required dependencies via pip (which chooses the right architecture for us) and then running the app. In a real-world scenario, though, shipping a single universal2 binary (and so a universal2 app package), rather than two different ones, removes that kind of friction for the upcoming few years. So having universal2 wheels helps with exactly that. Hope that clarifies the need for universal2 wheels.
Thanks for the explanations. Is it common that these bundled apps use numpy?

As I understand things, the lack of support for universal2 wheels on the CI platforms means someone will end up creating a tool to fuse x86_64 and arm64 wheels into a universal2 package. I wonder if the correct place for such a tool is in the upstream project CI (here, in scipy, in scikit-learn, in ...) or in the app packager. Adding universal2 wheels to the scientific python community is "expensive", since the universal2 wheels are large and will take CI resources to build. If the only consumers are app packagers, who need to run app packaging tools anyway, perhaps the right place for such a fusing tool is in the packaging tools themselves, as a common resource they can share.
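For what it's worth, the core of such a fusing step is small: unpack both thin wheels and run Apple's `lipo` over each pair of compiled extension modules. A rough sketch with hypothetical file paths (a real tool would also have to rewrite RECORD hashes and retag the wheel):

```sh
# Fuse one extension module from an x86_64 wheel with its arm64
# counterpart into a single fat binary (lipo ships with the Xcode
# command line tools).
lipo -create \
  x86_64/numpy/core/_simd.cpython-310-darwin.so \
  arm64/numpy/core/_simd.cpython-310-darwin.so \
  -output fused/numpy/core/_simd.cpython-310-darwin.so

# Confirm both architectures ended up in the fused binary.
lipo -info fused/numpy/core/_simd.cpython-310-darwin.so
```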
From previous discussions, I don't think it's possible to install universal2 wheels with pip, and I indeed can't get it to select the universal2 wheel.
This seems like the one key argument. Leaving aside why PyInstaller/py2app can't glue together two thin wheels (which would be easier on everyone and save disk space for end users), the question is: how much are you willing to pay in terms of disk space and dmg size in order to avoid needing to explain to users what their CPU is? For your actual apps, what is their size (dmg and on disk), and are you okay with doubling it just for this one convenience, @matham and @misl6?
I don't really know how common it is. I'd guess that a majority of apps don't use numpy. But personally, my apps are used in research environments, so I do use numpy. I imagine that as frontends for machine learning applications become more common, if they use Kivy they would also include numpy. Kivy's packaging project that creates the dmg is here.
Kivy uploads only universal wheels for the Python versions where they're available. In that circumstance pip will install them. But if e.g. arm64/x86 wheels are also on PyPI, pip will install those instead. I'm not sure whether there's a way to get it to install the universal wheels instead. According to this, you need to specify the implementation.
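For reference, pip's cross-platform flags (`--platform`, `--implementation`, `--abi`) can request a specific tag explicitly, but only in download mode with binaries forced. A sketch, which only succeeds if a matching wheel actually exists on PyPI:

```sh
# --platform requires --only-binary=:all: (or --no-deps), since pip
# cannot build an sdist for a foreign platform tag.
pip download numpy \
  --platform macosx_10_9_universal2 \
  --only-binary=:all: \
  --dest wheels/
```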
Personally, my apps are running in research environments, so space is not a huge issue. My typical app is 130 MB as a dmg and 445 MB uncompressed. A third of it is basically Python and the rest is packages. I wouldn't quite expect it to double, especially for the dmg, as hopefully there's some shared compression. But for me that would be OK.
Thanks @mattip for pointing out the fusing option; this option is only a few months old, so we weren't aware of it (I didn't look at the changes in the PR or I would have seen it 😬 earlier). Given that this exists, we could add something to the packager that downloads all the deps for arm64/x86 into separate folders, creates universal wheels for those that don't already have one, and then installs them manually (see the sketch below). So that does make the impetus for numpy to have universal wheels smaller. However, given how easy it is to simply fuse them after compiling them separately, are there any downsides to doing that for numpy? At best it'd be a set-and-forget kind of thing. And at worst, if fusing fails, e.g. because the files are different, it's better for numpy to notice it rather than the packaging app, which may not be able to do anything about it!?
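A hedged sketch of what that packager step could look like, assuming the fuse tool from the delocate project (called `delocate-fuse` in older releases, renamed `delocate-merge` in newer ones) and a recent `wheel` package for the retagging step; folder names are illustrative:

```sh
# Download thin wheels for each macOS architecture into separate folders.
pip download numpy --platform macosx_10_9_x86_64 --only-binary=:all: --dest x86_64/
pip download numpy --platform macosx_11_0_arm64  --only-binary=:all: --dest arm64/

# Fuse each x86_64/arm64 pair into one fat wheel (delocate < 0.11 syntax).
delocate-fuse x86_64/numpy-*.whl arm64/numpy-*.whl -w fused/

# The fused wheel keeps the first wheel's platform tag, so retag it.
python -m wheel tags --platform-tag macosx_10_9_universal2 fused/numpy-*.whl
```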
The downside to fused universal2 wheels is that the fusing runs in all the repos all the time, but is only needed by app packagers when releasing/testing a new version. This is similar to many discussions we have about enhancements to NumPy: why not add a new dtype, or new random-number seed algorithms, or an extension of a linear algebra function, since then they would be tested on a regular basis and bugs caught earlier? Our usual response is that those enhancements are a burden. They require developer attention, CI cycles, storage space and computing resources, all for little practical gain. Many of the wishlist-enhancement category are these types of requests. As I understand things, app packagers would still need the option to fuse wheels for projects that do not produce universal2 wheels, so I think the proper place for all that to happen is when packaging apps. Here we are talking about wheels, which are not produced on every CI run, but are produced and uploaded at least once a week. I guess we could have a strategy to periodically test that the fusing works without burdening the already overloaded PyPI system with rarely downloaded universal2 wheels.
I saw that as well and even tried it, before realizing "universal wheels" are not universal2 wheels.
Hi all,

Perhaps a reasonable compromise here would be to provide the universal2 wheels in the GitHub release (where the sdist and changelog are provided). I agree that providing wheels on PyPI feels sort of wrong, as most people will not try to install a universal2 wheel using pip.

Given the demand so far from people (1 bug report + multiple comments; I don't think a lot of people know about this discussion thread), I'm inclined to provide support for universal2 wheels. While I do share @mattip's concerns about the maintenance burden, fusing wheels doesn't take too much effort (multibuild takes around 1 minute to do this, and most of the time is spent testing the wheel).

Even though fusing wheels does not take too much effort (as in time/computing resources required), it is still not easy to do, requiring third-party tools such as multibuild. This makes me a bit worried that not providing wheels could result in more bug reports of broken installs, if packagers aren't fusing the wheels correctly.

(P.S. Should this discussion be moved to the mailing list? It would be nice to get more input on this.)
I've kinda come around to the POV that there's little point in putting universal2 wheels on PyPI. I've added multiple PRs to projects to add support for arm64 macOS and didn't include universal2 wheels (numexpr, PyTables, h5py). The reason is that, given that universal2 wheels are all or nothing, and since even numpy is hesitating to add them (and it does seem a little like a waste of space on PyPI), I can't imagine most projects will be adding them, at least not before x86 Macs are EOL.

So... if you want to use universal2 wheels for packaging, you'll have to be able to automatically fuse wheels for some projects, but at that point, why not do it for all projects? So the only thing that would be helpful is for numpy to test fusing to make sure it's working; otherwise maybe we'll run into issues as we try to fuse them ourselves. And in that respect, adding them to the GitHub release probably won't help much, because if you need universal2 wheels you'd have some automated fusing stage anyway, so you wouldn't have a numpy-specific branch to fetch them from the release page.

As a side note, pip is indeed difficult and inflexible to work with when it comes to installing cross-platform packages. You have to specify e.g. […] So, if you don't know your dependency chain well, the automatic wheel creation from sdist can be a little difficult even if you're just cross-installing, not using universal2 wheels.

This is all to say: if everyone did provide universal2 wheels, it would make packaging significantly easier. But I don't think that's realistic, so there's little reason to upload universal2 wheels for most projects.
Now that the work is basically done, I think it's okay to run it as part of the wheel builds for now, with the understanding that we reserve the right to stop doing that if it becomes a burden.
Yes, this is a worry. That said, it also shouldn't fall on the packagers of every application - the correct place, I think, is […]
It'd be useful to ping the list with a summary, I'd say, but it's probably best to direct people here to avoid splitting the discussion over even more places.
The only reason to have them on PyPI is discoverability. Given that it's perhaps more work to exclude the wheels from the automated uploading mechanism (assuming we do produce them) than to treat them like any other wheel, I'm fine with uploading to PyPI - the space taken is not that much. But not doing so is also fine; no strong feelings either way.
I came across this again when working on the port to Meson as our new build system. A couple of points (partially notes to self): […]
It seems I was wrong there - it's as simple as […]
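The exact command isn't preserved in this transcript; purely for illustration, and not necessarily what was meant here, a fuse using delocate's tool looks like this (file names are hypothetical):

```sh
# Merge an arm64 thin wheel into an x86_64 one, writing the fat
# result to fused/ (delocate-fuse; delocate-merge in newer delocate).
delocate-fuse numpy-1.22.3-cp310-cp310-macosx_10_9_x86_64.whl \
              numpy-1.22.3-cp310-cp310-macosx_11_0_arm64.whl \
              -w fused/
```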
I think that settles it: this can easily be done in any workflow where folks need it. And it's not specific to numpy; it works for any project. So I think we can close this PR.
The more I look at this, the worse universal2 looks. Given that: […]
Let's close this and make it final that we do not do universal2 wheels.
Thanks a lot for your effort on this though, @lithomas1, and everyone for the inputs and review.
NumPy never built these, and now has a pretty good explanation of why individual packages should not be doing it, as opposed to app bundling systems. See numpy/numpy#20787 (comment) and following comments.