fix: fixes to prepare for making bootstrap=script the default for Linux #2760
Hrm. CI flagged an issue relating to the runtime-env toolchain: it doesn't respect the virtual env. This is because the
There's another CI failure relating to compile_pip_requirements, but I haven't had a chance to look yet.
note to self:
Ok, so what I've figured out is:
(1) and (2) mean that, in order to use py3.9 with the runtime-venv toolchain, the only way to make it even see the venv is to create it at runtime with the typical symlink. This would also solve (3) (symlink lib/python3.11 to python3.10; technically wrong, but matches historical behavior); I can think of some alternatives for (3) that might work right now (PYTHONPATH, addsitedir(), or sys.path setup in stage2), but in order to have a normally functioning venv site-packages dir, we have to create lib/pythonX.Y matching the current runtime version.
Anyways, what I'm thinking is to add something to the toolchain definition that says "recreate the venv at runtime", and then the runtime env toolchain sets this. We already have a flag for this due to rules_pkg not handling raw symlinks. I think having a "create venv at runtime" thing is gonna be a fact of life for a while, at least until 3.11 is the minimum supported version.
To fix the situation when
The logic itself wasn't so bad; much of it already exists because of the zip logic and "don't use declare_symlink" logic.
Next failure: //tests/multiple_inputs/... Basically, the compile_pip_requirements() rule breaks if...toml files are used? There's 3 tests, 1 pass, 2 fail:
The error seems to stem from a temporary venv (used to run pip compile) being created (by piptools and/or build) from within the py_binary's venv. It's odd that one test passes and the others fail, though. Maybe the venv-in-venv thing is a red herring? Or toml triggers this venv-in-venv thing?
Finally figured it out! The missing "home" key in the pyvenv.cfg file causes
Under the hood, compile_pip_requirements uses piptools, which uses the
Fixed by having bazel_site_init fix up sys._base_executable.
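For anyone following along, here is a minimal Python sketch of checking a pyvenv.cfg for the "home" key. It is not the bazel_site_init code; the helper names and the sample file contents are hypothetical, and the parsing only approximates the simple "key = value" format that the stdlib venv module writes.

```python
def parse_pyvenv_cfg(text: str) -> dict:
    """Parse pyvenv.cfg-style 'key = value' lines into a dict."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that have no '='
            settings[key.strip().lower()] = value.strip()
    return settings


def has_home(text: str) -> bool:
    """True if the config names the directory of the base interpreter."""
    return bool(parse_pyvenv_cfg(text).get("home"))


# Hypothetical file contents, for illustration only.
without_home = "include-system-site-packages = false\nversion = 3.11.4\n"
with_home = "home = /usr/bin\n" + without_home
```

A check like has_home() could be used in a test to confirm that a generated venv carries the key before anything tries to create a nested venv from it.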
OK, ready for review. I've changed the scope slightly here: this PR fixes various issues that setting bootstrap=script as the default revealed. A separate PR will actually change it to the default. This grew a bit bigger than I anticipated, so quick summary:
fi

mkdir -p "$venv/bin"
ln -s "$python_exe_actual" "$python_exe"
Not blocking, just thinking here.
These symlinks are created at runtime, which may be troublesome in read-only environments, for example a Docker image running with a read-only filesystem where only particular directories are writable, e.g. /tmp.
I think this might be OK, but somebody will definitely come and say that they want the python from the docker container and I am curious how all of this will go.
We cannot create the symlinks at build time though, because they are not known?
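To make the read-only concern concrete, here is a rough Python sketch of the same mkdir + ln -s step, but rooted in a scratch directory. This is not the stage1 bootstrap (which is a shell script); the function name and the scratch_dir parameter are hypothetical, and it assumes a POSIX system where symlinks can be created.

```python
import os
import sys
import tempfile
from pathlib import Path
from typing import Optional


def create_runtime_venv_bin(python_exe_actual: str,
                            scratch_dir: Optional[str] = None) -> Path:
    """Create <tmp venv>/bin/python3 as a symlink to the real interpreter.

    scratch_dir must be writable; None means the platform default temp
    location (typically /tmp on Linux).
    """
    venv = Path(tempfile.mkdtemp(prefix="runtime_venv_", dir=scratch_dir))
    bin_dir = venv / "bin"
    bin_dir.mkdir(parents=True, exist_ok=True)
    python_exe = bin_dir / "python3"
    python_exe.symlink_to(python_exe_actual)
    return python_exe


# Example: link to whichever interpreter is running this sketch.
link = create_runtime_venv_bin(sys.executable)
```

The point is only that the symlink target is discovered at runtime, so at least one writable directory is needed whenever the venv has to be created this way.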
We cannot create the symlinks at build time though, because they are not known?
Correct. When the python being run is coming from PATH at runtime, then we can't know it at build-time.
The next closest approximation of this are local toolchains: they can run which python3 during the repo phase, figure out the absolute path, and have that written as the symlink. This can be host-specific, though (e.g. on my machine it'll resolve to /home/richard/pyvenv/3.11/bin/python3, which won't be in a docker container).
Troublesome for read-only environments
Yeah, unfortunately, what options are available depends on the combination of (1) what python version is used, (2) if the build time and runtime versions match, and (3) if a wrapper script is used at runtime.
In order to have the combination of (1) read-only runtime environment, (2) use the build-time generated venv, and (3) use python from the current runtime environment, then...
At the least, the runtime and build time python versions have to match. Without that, things get dicey -- PYTHONPATH is the only other thing I could come up with, but that's going to pollute subprocesses and change sys.path ordering. So then more hacks to try and work around that (a second envvar to indicate "undo PYTHONPATH hack"?)
If Python 3.11+ is used, then the PYTHONEXECUTABLE environment variable will handle things. Stated another way: if you're using python 3.11, changing the runtime env toolchain to have supports_build_time_venv=True should Just Work (and no temp venv need be created). I suppose I could add a flag for that? Or a --runtime_version_matches_build_version type of flag (these flags would be specific to the runtime_env toolchain). Or maybe put a select() on the runtime_env toolchain: set True if --python_version >= 3.11, False otherwise?
For earlier versions, I think so long as the $actual value for exec -a $actual $venv_bin_python3 points to the actual interpreter (/usr/bin/python3, or whatever sys._base_executable would tell -- basically a binary that can find its python home), then that also works (I think. Can't remember after so many days of hammering on this).
For this case, making the venv/bin/python3 a wrapper script to handle this (instead of stage1 / runtime_env_interpreter.sh) might work better. We might need another setting on the toolchain to know whether to do that? Not entirely sure.
Part of the problem here is switching from non-venv to venv style of execution. Non-venv execution used PYTHONPATH and shoved everything on front, not caring about python version stuff. Venv execution doesn't use PYTHONPATH, but now cares about the python version, since it uses it as part of where it tries to find site-packages.
A saving grace here is using the runtime env toolchain with bootstrap=script didn't work before this PR anyways.
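A small Python sketch of why the version now matters: in a POSIX-style venv, the site-packages directory is keyed by major.minor. The helper name is made up for illustration, and the layout assumes a standard CPython venv on Linux or macOS (Windows uses Lib/site-packages instead).

```python
import sys
from pathlib import Path


def venv_site_packages(venv_root, version_info=sys.version_info) -> Path:
    """Where a POSIX venv keeps site-packages for a given interpreter version."""
    major, minor = version_info[0], version_info[1]
    return Path(venv_root) / "lib" / f"python{major}.{minor}" / "site-packages"


# A venv laid out at build time for 3.10, then run with a 3.11 interpreter:
built_for = venv_site_packages("/some/venv", (3, 10))
looked_up = venv_site_packages("/some/venv", (3, 11))
mismatch = built_for != looked_up
```

When mismatch is true, the interpreter looks in a directory the build never populated, which is the situation the runtime-created venv (or a lib/pythonX.Y symlink) is meant to cover.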
py3.11, build time venv, runtime-env toolchain Just works
I checked this -- yeah, it just works and doesn't require a writable scratch space. I updated the runtime_env toolchain definition to use a select().
Side note: I had to implement an is_python_at_least helper flag. See config_settings.bzl. Maybe we should factor out a generic "is_python_version_within" flag, and give it args like "gt/gte/lt/lte"? Seems like that might be useful to the pypi generation stuff.
Hmmm, regarding factoring out the python flag, +1. We could use env markers for that as well. Could you please create a ticket for that?
Regarding the rest, thanks for the explanation, it's great to have it here.
Co-authored-by: Ignas Anikevicius
current = tuple(
    ctx.attr._major_minor[config_common.FeatureFlagInfo].value.split("."),
)
value = "yes" if current >= at_least else "no"
This version matching is a little bit brittle. Might be better to use evaluate from pep508_evaluate, where we use "python_version >= {}".format(ctx.attr.at_least) for the marker and then evaluate by passing in the env.
I think it is fine for now to have your implementation, but at some point it may be nicer to use the standard evaluation.
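One way the brittleness can show up, sketched in plain Python rather than Starlark: if both sides of the comparison are tuples of strings (the excerpt doesn't show how at_least is built, so that is an assumption), the comparison is lexicographic and "9" sorts after "11". The helper names below are hypothetical.

```python
def parse_major_minor(version: str) -> tuple:
    """Turn '3.11' into (3, 11) so components compare numerically."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def is_at_least(current: str, at_least: str) -> bool:
    return parse_major_minor(current) >= parse_major_minor(at_least)


# String components compare character by character: "9" > "1", so 3.9 looks newer.
strings_say = tuple("3.9".split(".")) >= tuple("3.11".split("."))

# Integer components give the intended ordering.
ints_say = is_at_least("3.9", "3.11")
```

Converting each component to an integer before comparing avoids the problem without pulling in a full marker evaluator.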
current = tuple(
    ctx.attr._major_minor[config_common.FeatureFlagInfo].value.split("."),
)
value = "yes" if current >= at_least else "no"
LGTM on using yes and no.
Various cleanup and prep work to switch bootstrap=script to be the default.
* Change bootstrap_impl to always be disabled for Windows. This allows setting it to true in a bazelrc without worrying about the target platform. This is done by using FeatureFlagInfo to force the value to disabled for Windows. This allows any downstream usages of the flag to Just Work and not have to add selects() for Windows themselves.
* Switch pip_repository_annotations test to import python.runfiles. The script bootstrap doesn't add the runfiles root to sys.path, so import rules_python stops working.
* Switch gazelle workspace to using the runtime-env toolchain. It was previously implicitly using the deprecated one built into bazel, which doesn't provide various necessary provider fields.
* Make the local toolchain use sys._base_executable instead of sys.executable when finding the interpreter. Otherwise, it might find a venv interpreter or not properly handle wrapper scripts like pyenv.
* Add a toolchain attribute/field to indicate if the toolchain supports a build-time created venv. This is due to the runtime_env toolchain. See PR comments for details, but in short: if we don't know the python interpreter path and version at build time, the venv may not properly activate or find site-packages. If it isn't supported, then the stage1 bootstrap creates a temporary venv, similar to how the zip case is handled. Unfortunately, this requires invoking Python itself as part of program startup, but I don't see a way around that -- note this is only triggered by the runtime-env toolchain.
* Make the runtime-env toolchain better support virtualenvs. Because it's a wrapper that re-invokes Python, Python can't automatically detect it's in a venv. Two tricks are used (exec -a and PYTHONEXECUTABLE) to help address this (but they aren't guaranteed to work, hence the "recreate at runtime" logic).
* Fix a subtle issue where sys._base_executable isn't set correctly due to home missing in the pyvenv.cfg file. This mostly only affected the creation of venvs from within the bazel-created venv.
* Change the bazel site init to always add the build-time created site-packages (if it exists) as a site directory. This matches the system_python bootstrap behavior a bit better, which just shoved everything onto sys.path using PYTHONPATH.
* Skip running runtime_env_toolchains tests on RBE. RBE's system python is 3.6, but the script bootstrap uses 3.9 features. (Running it on RBE is questionable anyways).
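As a rough illustration of the sys._base_executable item, here is a Python sketch of preferring the base interpreter over sys.executable. This is not the local toolchain's actual probe script; the function names are hypothetical, and sys._base_executable is a private CPython attribute, so the sketch falls back when it is missing or empty.

```python
import sys


def base_interpreter_path() -> str:
    """Prefer the interpreter behind a venv, falling back to sys.executable."""
    # Private attribute: may be absent on other implementations or old versions.
    return getattr(sys, "_base_executable", None) or sys.executable


def running_in_venv() -> bool:
    """Standard check: inside a venv, sys.prefix differs from sys.base_prefix."""
    return sys.prefix != sys.base_prefix


path = base_interpreter_path()
```

Inside a venv, sys.executable points at the venv's bin/python, which is why recording it in a toolchain definition can capture the wrong interpreter.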
Along the way...
paths. The legacy behavior is disabled in Bazel 8+ by default.
Work towards #2521