GH-80789: Bundle ensurepip wheels at build time #109130
Co-authored-by: Sviatoslav Sydorenko <[email protected]>
Co-authored-by: Sviatoslav Sydorenko <[email protected]>
This reverts commit 0aaf135.
This is going to need something for buildbots to avoid breaking them all, but I like the direction here.
# Conflicts: # .cirrus.yml
.cirrus.yml
Outdated
@@ -20,6 +20,8 @@ freebsd_task:
  pythoninfo_script:
    - cd build
    - make pythoninfo
  bundle_ensurepip_script:
Suggested change:
-  bundle_ensurepip_script:
+  # Download wheels for the venv step
+  bundle_ensurepip_script:
Is this the reason for adding it here?
Anything using venv or ensurepip during testing needs this provisioning step. If there's no uses, it'd be unnecessary.
The actual reason to include it is that Cirrus CI failed without this and passed with it. I don't know why FreeBSD requires the venv step though, as all other tests pass normally. Only documentation and hypothesis (which use venvs) seem to have an explicit venv step.
@AA-Turner It's not because of venv but ensurepip itself. The call chain looks as follows:
test.test_tools.test_freeze.TestFreeze.test_freeze_simple_script calls a CPython build preparation helper @ https://github.com/python/cpython/blob/cbb3a6f/Lib/test/test_tools/test_freeze.py#L28- That, in turn, calls make install here https://github.com/python/cpython/blob/cbb3a6f/Tools/freeze/test/freeze.py#L184.
make install calls ensurepip at https://github.com/python/cpython/blob/cbb3a6f/Makefile.pre.in#L1932-L1933 to provision it into the new Python install directory.
ensurepip uses the bundled pip wheel to run bootstrapping, so it attempts to unpack it and freaks out when there's no dist for it: https://github.com/python/cpython/blob/cbb3a6f/Lib/ensurepip/__init__.py#L176.
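To make the failure mode at the end of that chain concrete, here is a minimal sketch of a pre-flight check that reports which wheels are absent before ensurepip would try to unpack them. The helper name `missing_bundled_wheels` and the version string are hypothetical; only the `<name>-<version>-py3-none-any.whl` filename pattern is taken from the diff under review.

```python
import tempfile
from pathlib import Path


def missing_bundled_wheels(bundled_dir: Path, projects: list[tuple[str, str]]) -> list[str]:
    """Return the wheel filenames that are expected but absent from `bundled_dir`.

    Hypothetical helper for illustration; `projects` holds (name, version) pairs.
    """
    expected = [f"{name}-{version}-py3-none-any.whl" for name, version in projects]
    return [filename for filename in expected if not (bundled_dir / filename).is_file()]


# Usage with a throwaway directory standing in for ensurepip's `_bundled` folder.
# The version number below is a placeholder, not the version pinned by CPython.
with tempfile.TemporaryDirectory() as tmp:
    bundled = Path(tmp)
    before = missing_bundled_wheels(bundled, [("pip", "23.2.1")])
    (bundled / "pip-23.2.1-py3-none-any.whl").write_bytes(b"")
    after = missing_bundled_wheels(bundled, [("pip", "23.2.1")])
```

A check like this would only turn the late unpacking error into an early, readable one; it does not replace the provisioning step itself.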
why Free BSD requires
It's not that FreeBSD or Cirrus requires the test explicitly, it's rather that this test is skipped on most platforms via https://github.com/python/cpython/blob/cbb3a6f/Lib/test/test_tools/test_freeze.py#L11-L20.
Reviewed-by: Hugo van Kemenade <[email protected]>
I identified the place needing to be changed in #12791 (comment). @AA-Turner FYI
I tried to explore a different approach of fully integrating this into A
I tried doing that in my PR and was getting weird behavior; I wasn't sure why. Maybe I was misusing something in the build machinery or was just confused... If that works, it's probably fine. OTOH, you may want to take into account that many distributions prefer their build envs to be disconnected from the internet, so it might still be wise to provide them with means to pre-provision the wheel separately.
Would #109245 help?
In Gentoo we aren't actually using the wheels originally bundled with CPython itself but providing the newest versions of them separately, so as long as that continues working, I suppose that's fine with us. However, it is paramount that:
However, I can imagine it could be mildly annoying to users who aren't used to the new workflow that having a complete git clone would no longer suffice to actually build CPython offline.
Same for Fedora, with the addition that we do sometimes keep the bundled wheels for too new or too old Pythons, so having a way to pre-populate the wheels instead of downloading them would still be needed.
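One way to serve both the offline distributors and the download-by-default workflow is to reuse a wheel that is already on disk whenever its checksum matches, and only fetch otherwise. The sketch below assumes that behaviour; `ensure_wheel` and its `fetch` callback are hypothetical names, not the PR's actual API, and the callback stands in for the network request so the example runs offline.

```python
import tempfile
from hashlib import sha256
from pathlib import Path
from typing import Callable


def ensure_wheel(wheel_path: Path, checksum: str, fetch: Callable[[], bytes]) -> str:
    """Return "reused" if a valid wheel is already present, else "downloaded"."""
    if wheel_path.is_file():
        if sha256(wheel_path.read_bytes()).hexdigest() == checksum:
            return "reused"  # pre-populated by the distributor, nothing to fetch
        wheel_path.unlink()  # stale or corrupt file, replace it
    content = fetch()
    if sha256(content).hexdigest() != checksum:
        raise ValueError(f"checksum mismatch for {wheel_path.name}")
    wheel_path.write_bytes(content)
    return "downloaded"


def no_network() -> bytes:
    # Simulates a build environment without internet access.
    raise AssertionError("network must not be used")


with tempfile.TemporaryDirectory() as tmp:
    payload = b"pretend wheel contents"
    digest = sha256(payload).hexdigest()
    target = Path(tmp) / "pip-0.0-py3-none-any.whl"
    first = ensure_wheel(target, digest, fetch=lambda: payload)
    second = ensure_wheel(target, digest, fetch=no_network)
```

The second call succeeds without touching the callback, which is the property an offline build would rely on.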
@AA-Turner could you resolve the conflicts with this PR? I think it would be good to have this be in Python 3.13's cycle as early as possible, to allow any potential redistributors who care about the details of ensurepip to have time to adjust things as well as provide us feedback on this as early as feasible.
spec = importlib.util.spec_from_file_location("ensurepip", ENSURE_PIP_INIT)
ensurepip = importlib.util.module_from_spec(spec)
spec.loader.exec_module(ensurepip)
return ensurepip._PROJECTS
@AA-Turner I believe this would need to be updated after #109245, and it'll probably let you work with simpler structures.
        whl = response.read()
except URLError as exc:
    print_error(f"Failed to download {wheel_url!r}: {exc}")
    errors = 1
Was this meant to be a counter?
Suggested change:
-    errors = 1
+    errors += 1
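A small sketch of the difference the suggestion makes, using a hypothetical `count_failures` helper rather than the PR's download loop: plain assignment only records that something failed, while `+=` counts every failure.

```python
def count_failures(results: list[bool], *, accumulate: bool) -> int:
    """Tally failed items; `results` holds True for success and False for failure.

    Hypothetical helper for illustration only, not code from the PR.
    """
    errors = 0
    for ok in results:
        if ok:
            continue
        if accumulate:
            errors += 1  # counter: every failure is counted
        else:
            errors = 1  # flag: only records that some failure occurred
    return errors


outcomes = [False, True, False, False]
as_flag = count_failures(outcomes, accumulate=False)
as_counter = count_failures(outcomes, accumulate=True)
```

If the value is only ever used as a process exit status, both forms behave the same, which may be why the reviewer phrased this as a question.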
    continue
else:
    print_error(f"An invalid '{name}' wheel exists.")
    os.remove(wheel_path)
Any reason not to use pathlib's method? Seeing that it's already used everywhere..
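For reference, the pathlib equivalent of `os.remove(path)` is `Path.unlink()`. A minimal sketch, assuming `wheel_path` is already a `pathlib.Path`; the helper name is made up for the example.

```python
import tempfile
from pathlib import Path


def remove_invalid_wheel(wheel_path: Path) -> None:
    """Delete a wheel file using pathlib instead of os.remove (hypothetical helper)."""
    # missing_ok=True (Python 3.8+) makes the call a no-op if the file is already gone.
    wheel_path.unlink(missing_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    wheel = Path(tmp) / "example-1.0-py3-none-any.whl"
    wheel.write_bytes(b"not a real wheel")
    remove_invalid_wheel(wheel)
    still_exists = wheel.exists()
    remove_invalid_wheel(wheel)  # second call does not raise
```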
from urllib.error import URLError
from urllib.request import urlopen


HOST = 'https://files.pythonhosted.org'
Can we name this something like a PYPI_CDN_URL for maintainability?
def print_notice(message: str) -> None:
    if GITHUB_ACTIONS:
        print(f"::notice::{message}", end="\n\n")
Nit: this also works if output to stderr FYI.
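Taking the reviewer's remark at face value, a variant that writes the notice to stderr might look like the sketch below. The `github_actions` parameter replaces the script's module-level flag to keep the example self-contained, and the plain-text fallback branch is an assumption, since the diff hunk does not show it.

```python
import io
import sys
from contextlib import redirect_stderr


def print_notice(message: str, *, github_actions: bool) -> None:
    """Emit a notice on stderr, prefixed with the workflow command when enabled."""
    if github_actions:
        message = f"::notice::{message}"
    # sys.stderr is looked up at call time, so redirection in tests works.
    print(message, end="\n\n", file=sys.stderr)


buffer = io.StringIO()
with redirect_stderr(buffer):
    print_notice("pip wheel is up to date", github_actions=True)
captured = buffer.getvalue()
```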
    try:
        projects = _get_projects()
    except (AttributeError, TypeError):
I think there's an abstraction leak here: it's hard to guess from looking at the _get_projects() function, what in it might trigger these exceptions. I'd recommend processing them inside that function and making obvious such places where the exceptions may occur. Instead, I'd convert both of the exceptions into something like an ImportError and handle just that. This would contribute to transparency of how that function works.
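A rough sketch of what that could look like, assuming the loader keeps using `importlib.util` as in the hunk above. The function name `load_projects` is hypothetical, and the exact failure points are a best guess at where the original `AttributeError`/`TypeError` would have come from.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_projects(init_path: Path) -> object:
    """Load `_PROJECTS` from a source file, raising ImportError on any failure."""
    spec = importlib.util.spec_from_file_location("ensurepip", init_path)
    if spec is None or spec.loader is None:
        # Previously this would surface later as an AttributeError on None.
        raise ImportError(f"cannot build an import spec for {init_path}")
    module = importlib.util.module_from_spec(spec)
    try:
        spec.loader.exec_module(module)
    except OSError as exc:
        raise ImportError(f"cannot read {init_path}") from exc
    try:
        return module._PROJECTS
    except AttributeError as exc:
        raise ImportError(f"{init_path} does not define _PROJECTS") from exc


with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.py"
    good.write_text('_PROJECTS = [("pip", "1.0", "abc")]\n')
    projects = load_projects(good)

    bad = Path(tmp) / "bad.py"
    bad.write_text("")
    try:
        load_projects(bad)
    except ImportError as exc:
        failure = str(exc)
```

The caller then only needs `except ImportError:`, and each raise site documents what can go wrong.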
        return 1

    errors = 0
    for name, version, checksum in projects:
FWIW with #109245, looping will stop making sense here. So maybe you could start working with just projects[0] already?
No, let's keep them separate.
I was kinda assuming that the simplification PR would get merged first, in which case, this one would have to adapt..
@@ -907,6 +907,10 @@ Build Changes
* Building CPython now requires a compiler with support for the C11 atomic
  library, GCC built-in atomic functions, or MSVC interlocked intrinsics.

* Wheels for :mod:`ensurepip` are no longer bundled in the CPython source
  tree. Distributors should bundle these as part of the build process by
  running :file:`Tools/build/bundle_ensurepip_wheels.py`.
This would probably be clearer:
Suggested change:
-  running :file:`Tools/build/bundle_ensurepip_wheels.py`.
+  running :file:`Tools/build/bundle_ensurepip_wheels.py` with no arguments.
# ensure the pip wheel exists
pip_filename = os.path.join(test.support.STDLIB_DIR, 'ensurepip', '_bundled',
                            f'pip-{ensurepip._PIP_VERSION}-py3-none-any.whl')
if not os.path.isfile(pip_filename):
I feel like if it didn't exist, there should be some code to clean it up after testing so that the dummy file doesn't get forgotten on disk. Tests should avoid side effects...
import bundle_ensurepip_wheels as bew


# Disable fancy GitHub actions output during the tests
There should be a test for enabled GHA, just for those helpers, then. It'd be sad not to get coverage there.
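Such a test could flip the flag for the duration of a single assertion and compare the captured output. This is a self-contained sketch: `GITHUB_ACTIONS` and `print_notice` are local stand-ins for the script's own names, and the non-GHA branch is assumed rather than taken from the diff.

```python
import io
import unittest
from contextlib import redirect_stdout
from unittest import mock

GITHUB_ACTIONS = False  # stand-in for the script's module-level flag


def print_notice(message: str) -> None:
    if GITHUB_ACTIONS:
        print(f"::notice::{message}", end="\n\n")
    else:
        print(message, end="\n\n")  # assumed plain-text fallback


class NoticeOutputTest(unittest.TestCase):
    def capture(self, enabled: bool) -> str:
        buffer = io.StringIO()
        # Temporarily override the flag in this module's globals.
        with mock.patch.dict(globals(), {"GITHUB_ACTIONS": enabled}):
            with redirect_stdout(buffer):
                print_notice("wheel is up to date")
        return buffer.getvalue()

    def test_enabled(self):
        self.assertEqual(self.capture(True), "::notice::wheel is up to date\n\n")

    def test_disabled(self):
        self.assertEqual(self.capture(False), "wheel is up to date\n\n")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(NoticeOutputTest)
result = unittest.TestResult()
suite.run(result)
```

In the real test module the patch target would be the attribute on the imported script (`bew` in the hunk above) rather than `globals()`.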
def _is_valid_wheel(content: bytes, *, checksum: str) -> bool:
    return checksum == sha256(content, usedforsecurity=False).hexdigest()
@AA-Turner I already know that this effort is going to be rejected for now, but out of curiosity, what's the motivation for setting usedforsecurity=False? The other verification script doesn't have that: https://github.com/python/cpython/blob/3d18034/Tools/build/verify_ensurepip_wheels.py#L81.
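As background for the question: `usedforsecurity` (available on hashlib constructors since Python 3.9) does not change the digest. It only signals that the hash may be used in restricted environments where an algorithm is otherwise blocked. The sketch below shows that both spellings agree for SHA-256; it does not speak to the author's actual motivation.

```python
from hashlib import sha256


def is_valid_wheel(content: bytes, *, checksum: str) -> bool:
    # Same comparison as in the diff; the flag has no effect on the resulting digest.
    return checksum == sha256(content, usedforsecurity=False).hexdigest()


payload = b"example wheel bytes"
default_digest = sha256(payload).hexdigest()
relaxed_digest = sha256(payload, usedforsecurity=False).hexdigest()
matches = is_valid_wheel(payload, checksum=default_digest)
```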
Closing this per #80789 (comment). Thanks @AA-Turner for filing this, even though we're not going ahead with this (despite me implying that earlier in the discussion).
Based on the discussion in #12791, this is a sketch of a different approach (building very heavily on @webknjaz's work).
We add a fairly simple bundler script at Tools/build/bundle_ensurepip_wheels.py, with two changes to ensurepip itself to adapt. This PR doesn't attempt to switch to the one-project model, as I found the diff was too large to reasonably review.
One question: should we make this part of the default build process (i.e. integration into make all / PCBuild)?
📚 Documentation preview 📚: https://cpython-previews--109130.org.readthedocs.build/