Allow using a locally installed Python runtime as a full toolchain #2070

Open
rickeylev opened this issue Jul 16, 2024 · 5 comments

Comments

@rickeylev
Collaborator

This feature request has come up a few times from different places.

The basic request is to have an option other than the hermetic runtimes, typically a Python runtime that is locally installed. Some of the reasons this is appealing:

  • Avoids having to download a remote runtime.
  • If you deeply care about supply chain issues, the provenance of the runtimes isn't very compelling.
  • The python-build-standalone runtimes have some known limitations.
  • The hermetic runtimes are expensive to set up: they're in-build runtimes with a (mostly) full Python installation, all of which has to be copied into runfiles.
  • I think some people like to manage their Python version outside of Bazel, e.g. using pyenv or venv activation.

Giving a local runtime the same capabilities as a hermetic runtime is sort of possible today, but it requires quite a bit of setup and, I think, loading a few private bzl files.

PR #2000 lays the groundwork for most of it. What's now missing is a public API for it and figuring out what knobs (if any) it needs.

For at least early testing, we're thinking of having a flag control which type of runtime is selected.

But, during our maintainers meeting, two big questions came up:

  1. What does the public API for using it look like?
  2. What are the use cases? These drive what the public API needs to look like.

So, if you have use cases, please post them here. It'll help us figure out what capabilities are needed.

@martis42
Contributor

martis42 commented Jul 18, 2024

I have no proposal for a concrete API but some related thoughts.

1 )
The hermetic toolchain should remain the default. It best fits Bazel's focus on reproducible actions and (remote) action caching.

2 )
I believe there should be a knob to decide if third party dependencies can be discovered on the host or not.
I can imagine some users want to use their complete local Python environment (either system installation or venv) for things like rapid prototyping without the overhead of bazelized dependency management.
Others might only want the host interpreter and standard library to work around the known functional issues and the performance penalties. Those users still want Python to only see third party dependencies provided by Bazel.

3 )
If files for the local toolchain are missing, there should be a verbose error telling the user what is missing. For example, on Ubuntu the python-dev package is required, which might not be obvious to all users. A glob failing on system paths might be confusing. After all, they can use the Python interpreter, so it might not be clear to them why the local toolchain still does not work.
One never knows how the user's system is structured. Having some safeguards can help prevent user frustration and bug reports.
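
To make point 3 concrete, here is a minimal sketch of such a safeguard, written in Python. It assumes the C headers are the part most likely to be missing and that `Python.h` is a reasonable marker for them. The function name `find_missing_runtime_files` and the error wording are hypothetical and not part of rules_python.

```python
import os
import sysconfig


def find_missing_runtime_files(include_dir=None):
    """Return human-readable problems with a local Python install.

    An empty list means this (deliberately small) check found nothing missing.
    """
    if include_dir is None:
        # Ask the running interpreter where its C headers should live.
        include_dir = sysconfig.get_path("include")

    problems = []
    if not include_dir or not os.path.isdir(include_dir):
        problems.append(
            "Header directory %r does not exist. On Debian/Ubuntu the headers "
            "usually come from a separate python dev package." % (include_dir,)
        )
    elif not os.path.isfile(os.path.join(include_dir, "Python.h")):
        problems.append(
            "Python.h not found in %r. The interpreter runs, but the "
            "development headers appear to be missing." % (include_dir,)
        )
    return problems
```

Returning a list instead of raising on the first failure would let a real check report everything that is missing in one pass.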

@rickeylev
Collaborator Author

rickeylev commented Jul 24, 2024

Synopsis of what arrdem on slack told me about their setup:

  • They aren't able to, and have limited interest in, adopting the hermetic runtimes. There are various reasons including issues with packaging, 1st party dependencies, using patched interpreters, and the runfiles setup overhead of the hermetic runtimes.
  • They want incremental Python version upgrades. To do this, they have created toolchain and build flag options to choose what Python is used.

The basic way they've set things up is:

  • A repo rule generates py_runtime(), toolchain() et al targets.
  • These are platform runtimes that point to e.g. /usr/local etc.
  • They include C stuff like headers, linkopts etc.
  • They use a custom bootstrap (1 stage, but this is probably just because 2 stage only recently came about)
  • Their bootstrap is just a shim that calls exec python%version% (where the version is populated by the repo rule code). I think they rely on the outer environment, via a venv, to make the e.g. python3.8 lookup via PATH work.
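
As a rough illustration of the shim in the last bullet, here is a Python rendition of "look up `python%version%` on PATH and exec it". Their actual bootstrap is not shown in this thread and may well be a shell script. The names `VERSION_PLACEHOLDER`, `resolve_interpreter` and `exec_interpreter` are hypothetical.

```python
import os
import shutil

# Placeholder that a repo rule would replace with e.g. "3.8" (assumed name).
VERSION_PLACEHOLDER = "%version%"


def resolve_interpreter(version, path=None):
    """Find pythonX.Y on PATH (or on `path` if given) and return its location."""
    name = "python" + version
    found = shutil.which(name, path=path)
    if found is None:
        raise FileNotFoundError(
            "%s was not found on PATH; is the expected venv activated?" % name
        )
    return found


def exec_interpreter(version, args):
    """Replace the current process with the resolved interpreter."""
    interpreter = resolve_interpreter(version)
    os.execv(interpreter, [interpreter] + list(args))


# A generated shim would end with something like:
#   exec_interpreter(VERSION_PLACEHOLDER, sys.argv[1:])
```

Because the lookup goes through PATH, the result depends on the outer environment, which is the trade-off described above.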

Insofar as this FR is concerned, I think the main things of interest are:

  1. Being able to specify target_compatible_with and target_settings for the toolchain
  2. Possibly being able to skip "auto detection" logic, though, maybe they would actually like that instead of having to hard-code system paths to directories with C headers.

@rickeylev
Collaborator Author

@martis42

hermetic remains the default

Yes. Using local toolchains is something users will have to opt-into in some fashion.

knob for discovering 3rd party dependencies from the host

This is outside the scope of this issue, which is just about making a locally installed Python runtime understood by Bazel. Something like this would mostly be the purview of the pip.parse/pypi integration, which decides the where and how of 3rd party dependencies. (To a smaller extent, it is the purview of py_binary, which determines whether e.g. isolated mode, disabling site packages, etc. is set when the interpreter is invoked.)
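
For reference, the interpreter-level switches mentioned in the parenthetical are ordinary CPython command-line flags. The sketch below maps two hypothetical knobs (`isolated`, `disable_site`) to `-I` and `-S` and reads back `sys.flags` to confirm the effect. It says nothing about how py_binary actually wires these up.

```python
import subprocess
import sys


def interpreter_flags(isolated=False, disable_site=False):
    """Map two hypothetical knobs to real CPython command-line flags."""
    flags = []
    if isolated:
        # Isolated mode: ignores PYTHON* environment variables and the
        # user site-packages directory.
        flags.append("-I")
    if disable_site:
        # Do not import the site module at startup.
        flags.append("-S")
    return flags


def report_sys_flags(interpreter, flags):
    """Run the interpreter and return (sys.flags.isolated, sys.flags.no_site)."""
    code = "import sys; print(sys.flags.isolated, sys.flags.no_site)"
    out = subprocess.run(
        [interpreter] + flags + ["-c", code],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    isolated, no_site = out.split()
    return int(isolated), int(no_site)
```

For example, `report_sys_flags(sys.executable, interpreter_flags(isolated=True))` runs the current interpreter in isolated mode and reports the flags it sees.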

missing files should have a verbose error

I agree in principle, but we'll see how well it works out in practice.

@martis42
Contributor

Insofar as this FR is concerned, I think the main things of interest are:

  1. Being able to specify target_compatible_with and target_settings for the toolchain

I second this point. Even without support for host toolchains, being able to set target_compatible_with, exec_compatible_with and target_settings for the hermetic toolchains without having to do the toolchain definition and registration steps manually would help me immediately in the projects I am working with.

For me, an open design decision is replacing vs. extending.
I don't want to redo the OS and architecture detection and settings. I only want to add control on top via extra custom flags. Others might want full control. Then again, full control is always possible by doing the toolchain definition and registration manually.

@jsharpe
Member

jsharpe commented Jul 26, 2024

My use case is fairly straightforward: I want to import a ParaView install and use its pvpython wrapper, so that I can use a prebuilt environment that includes the ParaView libraries. Building ParaView from source is complex, and porting that build to Bazel is difficult because of the number of dependencies and its heavy reliance on CMake. In my case I just have a few standalone scripts that I need to run, so I'd like to use the hermetic toolchain for my other targets but be able to select this toolchain for a few of them.

github-merge-queue bot pushed a commit that referenced this issue Apr 9, 2025

This adds docs and public APIs for using a locally installed python for
a toolchain.

Work towards #2070

---------

Co-authored-by: Ignas Anikevicius <[email protected]>