gh-109649: Add affinity parameter to os.cpu_count() #109652
Conversation
```diff
-Return the number of CPUs in the system. Returns ``None`` if undetermined.
+Return the number of logical CPUs in the system. Returns ``None`` if
```
You should update the documentation to say that it returns the number of logical CPUs for the *process* when `usable` is true, because the current `os.cpu_count()` returns the available CPU count for the system, not the process. It's a different layer.
The doc says:

> If *affinity* is true, return the number of logical CPUs the current process can use.

Is that not clear enough? Do you want to propose a different phrasing?
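As a sketch of the distinction the doc line is drawing (assuming the PR's `affinity` keyword; on Unix today, `len(os.sched_getaffinity(0))` already gives the per-process number):

```python
import os

# System-wide logical CPU count (what os.cpu_count() returns today).
system_cpus = os.cpu_count()

# Logical CPUs the *current process* can use -- what the proposed
# os.cpu_count(affinity=True) would return. On Unix this is the size
# of the process affinity mask.
process_cpus = len(os.sched_getaffinity(0))

# The process can never use more CPUs than the system has.
assert system_cpus is None or process_cpus <= system_cpus
```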
That is reasonable.

I'd add "this may be less than the number of logical CPUs returned by affinity=False due to OS or container limitations imposed upon the process" to make it clearer why people would want to use the affinity=True argument.

PS: thanks for making it keyword-only!

I do wish this API had never used the term "cpu"... everything these days is really a "logical_core", and what that even means depends a lot on underlying infrastructure and platform that Python may not be able to introspect. Way too late for that, though. :)
My PR adds "logical CPU" to the doc. In previous bug reports, I saw some confusion between physical CPU core, CPU packages, CPU threads, and logical CPUs.
Force-pushed 665dc28 to 6872c18.

Force-pushed 6872c18 to 38ba1c3.
I updated my PR:

I don't think that my Windows implementation works with more than 64 CPUs :-( It should be fixed before this PR can be merged.
Force-pushed 38ba1c3 to ae114ac.
Manual test on Windows:

```python
import os

with open("output", "w") as fp:
    try:
        print(os.cpu_count(affinity=True), file=fp)
    except Exception as exc:
        print(f"ERROR: {exc!r}", file=fp)
```

Without CPU affinity (all CPUs):

With CPU affinity (limit to 1 CPU):
cc @taleinat
I mark the PR as a draft to remind myself that either I should drop Windows support for now, or I should fix Windows support, before this PR can be considered ready to be reviewed (and then merged).

As I wrote previously, my current implementation is limited to 64 CPUs on Windows, which looks wrong. A Windows machine can have more than 64 CPUs: #67226. But I'm not sure if a process can be assigned to more than 64 CPUs. Well, I have to investigate :-)
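The 64-CPU ceiling comes from GetProcessAffinityMask() reporting a single pointer-sized bitmask, one bit per CPU; counting CPUs from such a mask is just a popcount, which can never exceed the mask width. A minimal pure-Python illustration of that counting (the Windows API call itself is not shown here):

```python
def count_mask_bits(mask: int) -> int:
    """Number of set bits in an affinity bitmask (one bit per CPU)."""
    return bin(mask).count("1")

# A process pinned to CPU 0 only:
print(count_mask_bits(0b1))            # 1
# A full 64-bit mask can describe at most 64 CPUs -- the Windows
# limitation discussed above:
print(count_mask_bits((1 << 64) - 1))  # 64
```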
```diff
+Linux control groups, *cgroups*, are not taken in account to get the number
```
I feel like people wanting such an API may also want cgroups to be considered, when the real question being answered is "How parallel am I usefully allowed to be?"

Does that need to be separated out into its own cgroups_cpuset=True flag, so that people could query one or the other or both? The use cases I have in mind are all around the above question, where I'd always want the combination, aka min(logical_cpus, affinity_cores, cgroups_cpuset_cores).
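A rough sketch of that min() combination (assumptions: Linux with cgroup v2 mounted at /sys/fs/cgroup; `usable_cpus` is a hypothetical helper, not part of this PR):

```python
import math
import os

def usable_cpus() -> int:
    """Best-effort answer to 'how parallel am I usefully allowed to be?'."""
    limits = [os.cpu_count() or 1]                 # logical CPUs in the system
    if hasattr(os, "sched_getaffinity"):           # affinity mask (Unix only)
        limits.append(len(os.sched_getaffinity(0)))
    try:
        # cgroup v2 CPU quota, e.g. "200000 100000" -> 2 CPUs; "max" = no limit.
        with open("/sys/fs/cgroup/cpu.max") as f:
            quota, period = f.read().split()
        if quota != "max":
            limits.append(math.ceil(int(quota) / int(period)))
    except (OSError, ValueError):
        pass                                       # no cgroup v2 limit visible
    return max(1, min(limits))

print(usable_cpus())
```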
My PR just fixes the documentation to avoid any misunderstanding.

> I feel like people wanting such an API may also want cgroups to also be considered

That is discussed in PR #80235. So far, nobody has proposed a PR to implement it. Maybe this PR is a baby step forward :-)
Even if we decide to support cgroups in the future, I would like to propose not using a flag name that exposes an implementation detail. If some platform introduces something new, do we have to add a new flag for it?
Implement cpu_count(affinity=True) with sched_getaffinity().

Changes:

* Fix test_posix.test_sched_getaffinity(): restore the old CPU mask when the test completes!
* Doc: Specify that os.cpu_count() counts *logical* CPUs. Mention that Linux cgroups are ignored.
Force-pushed ae114ac to 2762144.
I updated my PR. It's now ready for review. Changes:

I chose the easy way: remove the Windows implementation for now.
Honestly, IMO cpu_count(affinity=True) is what most users expect by default: decide how many workers (threads or processes) should be spawned to maximize efficiency without killing performance. If a server has 100 CPUs but Python is limited by the admin to 2 CPUs, spawning 100 worker processes is likely to kill latency and may cause many timeout issues.

Problem: changing the default behavior is wrong for different reasons. For example, a program can simply query cpu_count() to display how many logical CPUs a machine has. It doesn't matter if this program is limited to 1 CPU or has access to all 100 CPUs of the server: it should always display 100.

A PYTHONCPUCOUNT env var and -X cpu_count cmdline option sound complementary to my change. If we can use PYTHONCPUCOUNT=affinity / -X cpu_count=affinity, it would fit the first use case (spawning worker processes). It would break the second use case, which would have to modify its code to use cpu_count(affinity=False). But that's different: someone sets the env var knowing the consequences.
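The first use case (sizing a worker pool) can be sketched today without the new parameter, assuming a Unix platform that has os.sched_getaffinity(); the fallback to os.cpu_count() covers other platforms:

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Prefer the per-process limit when available; fall back to the
# system-wide count otherwise.
if hasattr(os, "sched_getaffinity"):
    nworkers = len(os.sched_getaffinity(0))
else:
    nworkers = os.cpu_count() or 1

with ThreadPoolExecutor(max_workers=nworkers) as pool:
    squares = list(pool.map(lambda x: x * x, range(8)))

print(nworkers, squares)
```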
No objection; it will be useful to people who want cpu_count with the affinity option :)
Same, let's do this. I'm even okay with an env var + -X option to override the return value, after our discussions (follow-on PR, I'd assume). And if Windows only supports a return value of up to 64 for the affinity feature (that came up in one of these discussions or PRs, IIRC), just document the caveat while we get guidance from Windows experts on how to get a better answer there.
I marked my PR as a draft again after a discussion with @corona10. I'm sorry about this back and forth.

It seems like there is a misunderstanding between "system CPU count" and "process CPU count". Depending on the use case, you may pick one or the other. The problem is that if you consider

So, I created PR gh-109907, which adds a new os.process_cpu_count(). os.cpu_count() stays unchanged: it sticks to its documentation and returns the number of logical CPUs of the system.
I understood that @corona10 is unhappy about this proposition because he wants to override the CPU count of Python applications which currently use

Try:

```python
import os

ncpu = os.environ.get('PYTHONCPUCOUNT', None)
if ncpu:
    try:
        ncpu = int(ncpu)
    except ValueError:
        print(f"WARNING: invalid PYTHONCPUCOUNT value: {ncpu!r}")
    else:
        def cpu_count():
            return ncpu
        cpu_count.__doc__ = os.cpu_count.__doc__
        os.cpu_count = cpu_count
```

Example:

Then put your
No, it doesn't solve the issue that I reported. How can you inject the custom sys.path into a Docker image file that is already written? Users will just pull the Docker image from the already stored container image store. For a K8s admin, there is no way to control the CPU count except via the command line or an environment variable.
When a -X or equivalent environment variable is encountered, override all return values from
Realize that no matter what, these new features don't solve anyone's existing problems, because they're having those problems on Python 3.12 and earlier, which will never see these features. They already need to concoct workarounds (which are still going to work in 3.13+). Adding an
Closed in favor of adding os.process_cpu_count() instead: PR gh-109907.
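For reference, the replacement API that landed instead, os.process_cpu_count() (new in Python 3.13 via gh-109907), can be probed with a fallback on older versions; the fallback below assumes a Unix platform with os.sched_getaffinity():

```python
import os

if hasattr(os, "process_cpu_count"):   # Python 3.13+
    n = os.process_cpu_count()
else:                                  # older Pythons, Unix only
    n = len(os.sched_getaffinity(0))

print(n)
```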
Implement cpu_count(affinity=True) with sched_getaffinity() on Unix
and GetProcessAffinityMask() on Windows.

Changes:

* Fix test_posix.test_sched_getaffinity(): restore the old CPU mask
  when the test completes!
* Doc: Specify that os.cpu_count() counts *logical* CPUs. Mention
  that Linux cgroups are ignored.
📚 Documentation preview 📚: https://cpython-previews--109652.org.readthedocs.build/