BLD: Make universal2 wheels #20787

Closed · lithomas1 wants to merge 1 commit

Conversation

lithomas1
Collaborator

No description provided.

@github-actions github-actions bot added the 36 - Build Build related PR label Jan 11, 2022
@lithomas1 lithomas1 closed this Jan 12, 2022
@lithomas1 lithomas1 reopened this Jan 12, 2022
@lithomas1 lithomas1 closed this Jan 12, 2022
@lithomas1 lithomas1 reopened this Jan 12, 2022
@lithomas1 lithomas1 marked this pull request as ready for review January 13, 2022 23:16
@lithomas1
Collaborator Author

I think this works now and is ready for a first pass of review. Depends on #20747.

@lithomas1 lithomas1 marked this pull request as draft January 15, 2022 05:03
@lithomas1 lithomas1 marked this pull request as ready for review January 26, 2022 02:08
@lithomas1 lithomas1 requested review from rgommers and mattip January 26, 2022 03:14
@mattip
Member

mattip commented Jan 26, 2022

Is there demand for projects to supply universal2 wheels?

@rgommers
Member

Is there demand for projects to supply universal2 wheels?

There is some; the main (only?) argument is that it is useful for py2app and other such bundling tools that want to provide downloadable installers which work on any macOS machine (so I guess the target audience there is non-technical users).

A pip install numpy will never select a universal2 wheel.
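
One way to see why: pip ranks its compatible tags in priority order, and the thin architecture-specific macOS tags come ahead of universal2, so a thin wheel wins whenever one exists. A quick, purely illustrative way to inspect that ordering:

# list pip's compatible macOS platform tags, highest priority first
python -m pip debug --verbose | grep macosx | head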

@mattip
Member

mattip commented Jan 26, 2022

Hmm. The py2app documentation does not make it clear how they would bundle a universal2 wheel, and opens with the sentence:

the documentation about universal binaries is outdated!

It seems the support is in place; at least, the issue about it was closed with a note that it is now supported.

I opened ronaldoussoren/py2app#399 to try to get some clarification.

@lithomas1
Collaborator Author

ping on this.

I think the main issue is that numpy has already uploaded universal2 wheels and that other projects in the scientific python ecosystem depending on numpy are also providing them.

If I understand correctly, universal2 wheels are sort of an all-or-nothing kind of thing. If numpy doesn't provide a universal2 wheel, then it is pointless for packages that depend on numpy (e.g. pandas) to create universal2 wheels, because installing such a package will still pull in a numpy that only runs on one architecture.

@rgommers
Member

If I understand correctly, universal2 wheels are sort of an all-or-nothing kind of thing. If numpy doesn't provide a universal2 wheel, then it is pointless for packages that depend on numpy (e.g. pandas) to create universal2 wheels, because installing such a package will still pull in a numpy that only runs on one architecture.

Yes, that is very much true.

The use case for universal2 wheels is still extremely thin; I think there is one, but so far I have not encountered a single project which actually needs/uses them. On the one hand we could say "let's just do it because there's a request", but on the other hand it would be very nice to see a few projects which actually use NumPy/Pandas/etc. ship a universal2-based installer. Otherwise we're just adding technical debt and CI jobs for zero real-world usage.

@mattip
Member

mattip commented Feb 17, 2022

I tried to ask for more opinions on the scientific python forum.

other projects in the scientific python ecosystem depending on numpy are also providing them.

@lithomas1 which projects? Maybe they could shed more light on why they are providing universal2 wheels?

@matham

matham commented Mar 16, 2022

Universal wheels are needed by app developers that package apps using only GitHub CI. E.g. we at Kivy (a Python GUI framework) provide our libraries as universal wheels, as well as a packaging mechanism to bundle it all into a dmg.

If there are no universal wheels or binaries, then you need an M1 machine to build the app for M1 and an x86 machine to build the app for x86. If there are universal wheels, then you can build both into one dmg supporting both architectures.

The problem with the former is that currently there's no CI that supports running on M1; e.g. GitHub's issue about this doesn't seem to be going anywhere. So your options are to use AWS, which has a 24-hour minimum usage/payment, or something like Scaleway. As an OSS dev making apps for other users, it would be nice if you could just build your apps on the existing GitHub CI for both architectures.

@charris
Member

charris commented Mar 16, 2022

@matham I'm not quite clear on what you are saying. Do you need a universal wheel to make dmgs? We currently build the M1 wheels on Azure using a cross-compiler (I think). The downside is that we cannot test the result.

@matham

matham commented Mar 16, 2022

Do you need a universal wheel to make dmgs?

I'll try to explain the process, and its issues, a little more as I understand them. I have not gone through the process of building an app with our universal dmg myself yet, so I could be slightly wrong.

Imagine we're on x86 and want to make a dmg app for x86/arm64. What we do is download a universal Python; using pip running on the x86 part of the universal binary, we install the Kivy universal wheels, then we create a bundle from it using Platypus, and finally we package it up as a dmg. A developer then downloads that dmg, installs it on their x86 machine, installs their own universal wheels and pure-Python pip packages (again using the x86 side of the binaries), and packages it up as a dmg for their users.

An end user then downloads that "universal dmg"; if they are on x86 it runs the x86 part of the binaries, and similarly for arm64.

If there are no universal wheels, it's not clear to me how this process would work, because when pip is running on x86 it'll install x86 wheels, so if you package it up it'll only work on x86. The only workaround is to build a further tool, like python-for-android, that cross-installs all the wheels and compiles for the target platform rather than the one you're running on, but that would require a lot of work.

Does that clarify it?

@matham

matham commented Mar 16, 2022

Hmm perhaps pip already supports it with the --platform option!?

@misl6

misl6 commented Mar 16, 2022

As @matham said, universal2 wheels are extremely useful in the case of app packaging.

We (as developers) are absolutely comfortable creating our virtualenv, installing the required dependencies via pip (which chooses the right architecture for us), and then running the app.

Instead, in a real-world scenario, shipping two different .dmg packages, for Intel and Apple Silicon, can be painful for non-technical end users. I've seen people on Intel downloading Apple Silicon installers and vice-versa, screaming in front of the "Install Rosetta?" installation prompt (that's absolutely not their fault; they're just not into tech like us).

Universal2 binaries (and so universal2 app packages) are here to remove that kind of friction for the upcoming few years.

So, having universal2 wheels means that the developer can just pip install the dependencies of their app into the (prebuilt, in Kivy's case) app environment and then ship it, with almost zero effort. Not having universal2 wheels means that the developer has to build a universal2 wheel on their own, or ship the app as two different installers.

Hope that clarifies the need for universal2 wheels, at least for this kind of use case.

@mattip
Member

mattip commented Mar 16, 2022

Thanks for the explanations. Is it common that .dmg apps package numpy? I agree packaging projects should be shipping one .dmg. Do you have examples of such projects?

As I understand things, the lack of support for universal2 wheels on the CI platforms means someone will end up creating a tool to fuse x86_64 and arm64 wheels into a universal2 package. I wonder if the correct place for such a tool is in the upstream project CI (here, in scipy, in scikit-learn, in ...) or in the app packager. Adding universal2 wheels to the scientific python community is "expensive" since the universal2 wheels are large, and will take CI resources to build. If the only consumers are app packagers who need to run app packaging tools anyway, perhaps the right place for such a fusing tool is in the packaging tools themselves, as a common resource they can share.

@rgommers
Member

installing the required dependencies via pip (that chooses the right architecture for us)

From previous discussions, I don't think it's possible to install universal2 wheels with pip. I just tried various incantations like:

pip install numpy==1.22.0 --platform x86_64 --platform arm64 --only-binary=:all:

and I indeed can't get it to select the universal2 wheels for 1.22.0. If there is such a way, I'd love to know what it is.

Instead, in a real-world scenario, shipping two different .dmg packages, for Intel and Apple Silicon, could be painful for final non-tech users.

This seems like the one key argument. Leaving aside why PyInstaller/py2app can't glue together two thin wheels (would be easier on everyone and save disk space for end users), the question is: how much are you willing to pay in terms of disk space and dmg size in order to avoid needing to explain to users what their CPU is? For your actual apps, what is their size (dmg and on disk) and are you okay doubling it just for this one convenience @matham and @misl6?

@matham

matham commented Mar 17, 2022

Thanks for the explanations. Is it common that .dmg apps package numpy? I agree packaging projects should be shipping one .dmg. Do you have examples of such projects?

I don't really know how common it is. I'd guess that a majority of apps don't use numpy. But personally, my apps are used in research environments so I do use numpy. I imagine as frontends for machine learning applications become more common, if they use Kivy, they would also include numpy. Kivy's packaging project that creates the dmg is here.

From previous discussions, I don't think it's possible to install universal2 wheels with pip. I just tried various incantations like:

Kivy uploads only universal wheels for the Python versions where they're available; in that circumstance pip will install them. But if e.g. arm64/x86 wheels are also on PyPI, pip will install those instead. I'm not sure whether there's a way to get it to install the universal wheels instead. According to this, you need to specify the implementation.

how much are you willing to pay in terms of disk space and dmg size

Personally, my apps run in research environments, so space is not a huge issue. My typical app is 130 MB as a dmg and 445 MB uncompressed. A third of it is basically Python and the rest is packages. I wouldn't quite expect it to double, especially for the dmg, as hopefully there's some shared compression. But for me that would be OK.

Adding universal2 wheels to the scientific python community is "expensive" since the universal2 wheels are large, and will take CI resources to build. If the only consumers are app packagers who need to run app packaging tools anyway, perhaps the right place for such a fusing tool is in the packaging tools themselves

Thanks @mattip for pointing out the fusing option; this option is only a few months old so we weren't aware of it (I didn't look at the changes in the PR or I would have seen it 😬😬 earlier). Given that this exists, we could add something to the packager that downloads all the deps for arm64/x86 to separate folders, creates universal wheels for those that don't already have one, and then installs them manually (see the sketch below).
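
A minimal sketch of that flow, assuming delocate is installed; the package name and platform tags are purely illustrative:

# download thin wheels for each architecture into separate folders
pip download numpy --platform macosx_10_9_x86_64 --only-binary=:all: -d x86_64/
pip download numpy --platform macosx_11_0_arm64 --only-binary=:all: -d arm64/
# fuse one pair of thin wheels into a fat wheel (delocate-fuse keeps the
# first wheel's filename, so the result may need renaming to a universal2 tag)
delocate-fuse x86_64/*.whl arm64/*.whl -w universal2/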

So that does make the impetus for numpy to have universal wheels smaller. However, given how easy it is to simply fuse them after compiling them separately, are there any downsides to doing that for numpy? At best it'd be a set-and-forget kind of thing. And at worst, if fusing fails, e.g. because the files are different, it's better for numpy to notice it than the packaging app, which may not be able to do anything about it!?

@mattip
Member

mattip commented Mar 17, 2022

The downside to fused universal2 wheels is that the fusing runs in all the repos all the time, but is only needed by app packagers when releasing/testing a new version. This is similar to many discussions we have about enhancements to NumPy: why not add a new dtype, or new random number seed algorithms, or an extension of a linear algebra function, since then they would be tested on a regular basis and bugs caught earlier? Our usual response is that those enhancements are a burden. They require developer attention, CI cycles, storage space and computing resources, all for little practical gain. Many requests in the wishlist enhancement category are of this type. As I understand things, app packagers would still need the option to fuse wheels for projects that do not produce universal2 wheels, so I think the proper place for all that to happen is when packaging apps.

Here we are talking about wheels, which are not produced on every CI run, but are produced and uploaded at least once a week. I guess we could have a strategy to periodically test that the fusing works, without burdening the already overloaded PyPI system with rarely downloaded universal2 wheels.

@rgommers
Member

But if e.g. arm64/x86 wheels are also on PyPI, pip will install those instead. I'm not sure whether there's a way to get it to install the universal wheels instead. According to this, you need to specify the implementation.

I saw that as well and even tried it, before realizing that "universal wheels" are not universal2 wheels at all; what is meant there is none-any wheels (usually for packages consisting of pure Python code), which work on any machine.
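
For reference, the difference shows up in the wheel filename tags (filenames illustrative):

numpy-1.22.0-cp310-cp310-macosx_10_9_universal2.whl   # universal2: one wheel, two architectures
sampleproject-2.0.0-py3-none-any.whl                  # "universal" pure-Python wheel, any platform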

@lithomas1
Collaborator Author

lithomas1 commented Mar 24, 2022

Hi all,

Perhaps a reasonable compromise here would be to provide the universal2 wheels in the GitHub release (where the sdist and changelog are provided). I agree that providing wheels on PyPI feels sort of wrong, as most people will not try to install a universal2 wheel using pip.

Given the demand so far from people (1 bug report + multiple comments; I don't think a lot of people know about this discussion thread), I'm inclined to provide support for universal2 wheels. While I do share @mattip's concerns about the maintenance burden, fusing wheels doesn't take too much effort (multibuild takes around 1 minute to do this, and most of the time is spent testing the wheel).

Even though fusing wheels does not take too much effort (as in time/computing resources required), it is still not easy to do, requiring third-party tools such as multibuild. This makes me a bit worried that not providing wheels could result in more bug reports of broken installs, if packagers aren't fusing the wheels correctly.

(P.S. Should this discussion be moved to the mailing list? It would be nice to get more input on this).

@matham

matham commented Mar 24, 2022

I've kind of come around to the POV that there's little point in putting universal2 wheels on PyPI. I've added multiple PRs to projects to add support for arm64 macOS and didn't include universal2 wheels (numexpr, pytables, h5py).

The reason is that, given that universal2 wheels are all-or-nothing, and since even numpy is hesitant to add them (and it does seem a little like a waste of space on PyPI), I can't imagine most projects will be adding them, at least not before x86 Macs are EOL. So... if you want to use universal2 wheels for packaging, you'll have to be able to automatically fuse wheels for some projects; but at that point, why not do it for all projects?

So, the only thing that would be helpful is for numpy to test fusing to make sure it's working; otherwise maybe we'll run into issues as we try to fuse them ourselves. And in that respect, adding them to the GitHub release probably won't help much, because anyone who needs universal2 wheels will have some automated fusing stage anyway, rather than a numpy-specific branch that fetches them from the release page.

As a side note, pip is indeed difficult and inflexible to work with when it comes to installing cross-platform packages. You have to specify e.g. --platform macosx_11_0_arm64 --only-binary=:all: --target path, including a target path, and even if something is already installed there it insists on re-downloading all deps of everything you're installing (see the sketch below). So any dependency that is distributed purely as an sdist has to be downloaded, manually made into a wheel, and installed in that single command. And if you're making a universal app, you'd need to somehow download everything for x86, figure out what is not pure Python, get the arm64 wheel for the same version, and fuse.
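
A minimal sketch of that cross-install step, with the package name and target directory purely illustrative:

# from an x86_64 machine, stage arm64 wheels into a separate directory
pip install numpy --platform macosx_11_0_arm64 --only-binary=:all: --target ./arm64-site-packages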

So, if you don't know your dependency chain well, the automatic wheel creation from sdists can be a little difficult, even if you're just cross-installing and not using universal2 wheels. This is all to say: if everyone did provide universal2 wheels, it would make packaging significantly easier. But I don't think that's realistic, so there's little reason to upload universal2 wheels for most projects.

@rgommers
Member

While I do share @mattip's concerns about the maintenance burden, fusing wheels doesn't take too much effort(multibuild takes around 1 minute to do this, and most of the time is spent testing the wheel).

Now that the work is basically done, I think it's okay to run it as part of the wheel builds for now. With the understanding that we reserve the right to stop doing that if it's a burden.

This makes me a bit worried that not providing wheels could result in more bug reports of broken installs, if packagers aren't fusing the wheels correctly.

Yes, this is a worry. That said, it also shouldn't fall on the packagers of every application; the correct place, I think, is py2app & co. It's a bit sad that there was a "let's take a shortcut" decision to push universal2 into Python itself, while the few tools that produce macOS apps can't be bothered to do anything, and therefore push the maintenance work onto the maintainers of every Python package with compiled extensions.

(P.S. Should this discussion be moved to the mailing list? It would be nice to get more input on this).

It'd be useful to ping the list with a summary I'd say, but it's probably best to direct people here to avoid a discussion split over more places.

(and it does seem a little like a waste of space on pypi)

The only reason to have them on PyPI is discoverability. Given that it's perhaps more work to exclude the wheels from the automated uploading mechanism (assuming we do produce them) than to treat them like any other wheel, I'm fine uploading to PyPI - the space taken is not that much. But not doing so is also fine, no strong feelings either way.

@rgommers
Member

I came across this again when working on the port to Meson as our new build system. A couple of points (partially notes to self):

  • There now is native M1 CI, which is free for open source projects, from Cirrus CI. SciPy is using this successfully. NumPy should start using that as well.
  • That means regular wheels are being built separately. Fusing them afterwards is the right way to get universal2 wheels (and this PR takes that approach).
    • This means I can get rid of the hardcoded numbers to support universal builds in numpyconfig.h, which were needed only when two arches were built in a single compile.
  • There has been very little demand for universal2 in the past year. So it's fine to not do anything. Fusing can be done anywhere; AFAIK still no one has written the fusing tool to make this automatic.

@rgommers
Member

AFAIK still no one has written the fusing tool to make this automatic

It seems I was wrong there - it's as simple as

pip install delocate
delocate-fuse $amd64_wheel $arm64_wheel -w .
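
As an aside, one way to sanity-check a fused wheel is with lipo; the wheel filename and extension-module path below are hypothetical:

# unpack the fused wheel and confirm both architectures are present
unzip -q numpy-1.22.0-cp310-cp310-macosx_10_9_universal2.whl -d fused/
lipo -archs fused/numpy/core/_multiarray_umath.cpython-310-darwin.so
# expected output: x86_64 arm64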

I think that settles it: this can easily be done in any workflow where folks need it. And it's not specific to numpy; it works for any project. So I think we can close this PR.

@rgommers
Member

The more I look at this, the worse universal2 gets. It makes no sense from a design perspective to put this on PyPI, or ship a Python interpreter like that.

Given that:

  • this is going to be very unhelpful for CI and our build system migration,
  • there is very little demand,
  • there is delocate-fuse now,
  • the burden for this should not fall onto NumPy maintainers or packagers, but rather where it belongs (py2app & co),

Let's close this and make it final that we do not do universal2 wheels.

@rgommers rgommers closed this Nov 17, 2022
@rgommers
Member

rgommers commented Nov 17, 2022

Thanks a lot for your effort on this though, @lithomas1, and thanks to everyone for the inputs and review.

QuLogic added a commit to QuLogic/matplotlib that referenced this pull request Jun 14, 2023
NumPy never built these, and now has a pretty good explanation of why
individual packages should not be doing it, as opposed to app bundling
systems.

See numpy/numpy#20787 (comment)
and following comments.