gh-109649: Add affinity parameter to os.cpu_count() #109652

vstinner · 2023-09-21T14:28:44Z

Implement cpu_count(affinity=True) with sched_getaffinity() on Unix
and GetProcessAffinityMask() on Windows.

Changes:

Fix test_posix.test_sched_getaffinity(): restore the old CPU mask
when the test completes!
Doc: Specify that os.cpu_count() counts logical CPUs and mention
that Linux cgroups are ignored.
_Py_popcount32() uses UINT32_C() for M1, M2 and M4 constants.
Add _Py_popcount64(). Add tests on _Py_popcount64().
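
The M1, M2 and M4 names suggest the classic SWAR (parallel bit-summing) population count. As a rough illustration only, and not the PR's C code, a 64-bit version of that technique can be sketched in Python. The `popcount64` name, the `H01` constant and the explicit 64-bit mask are assumptions made for this example.

```python
# Assumed constants for a 64-bit SWAR popcount (illustrative, not from the PR).
M1 = 0x5555555555555555   # alternating bits: 0101...
M2 = 0x3333333333333333   # alternating bit pairs: 0011...
M4 = 0x0F0F0F0F0F0F0F0F   # alternating nibbles: 00001111...
H01 = 0x0101010101010101  # multiplier that sums all bytes into the top byte
MASK64 = (1 << 64) - 1    # Python ints are unbounded, so emulate uint64_t


def popcount64(x: int) -> int:
    """Count the bits set in the low 64 bits of x."""
    x &= MASK64
    x = x - ((x >> 1) & M1)            # 2-bit partial sums
    x = (x & M2) + ((x >> 2) & M2)     # 4-bit partial sums
    x = (x + (x >> 4)) & M4            # 8-bit partial sums
    return ((x * H01) & MASK64) >> 56  # add the eight bytes together


print(popcount64(0b1011))
```

On Python 3.10 and later, `int.bit_count()` gives the same answer directly; the sketch only shows where the three masks come from.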

Issue: os.process_cpu_count(): get the number CPUs usable by the current process #109649

📚 Documentation preview 📚: https://cpython-previews--109652.org.readthedocs.build/

Doc/library/os.rst

corona10 · 2023-09-21T16:02:36Z

Doc/library/os.rst


-   Return the number of CPUs in the system. Returns ``None`` if undetermined.
+   Return the number of logical CPUs in the system. Returns ``None`` if


The documentation should state that it returns the number of logical CPUs for the process when usable is True,
because the current os.cpu_count() returns the CPU count of the system, not of the process.
These are different layers.

The doc says:

If affinity is true, return the number of logical CPUs the current process can use.

It's not clear enough? Do you want to propose a different phrasing?

That is reasonable.

I'd add "this may be less than the number of logical CPUs returned by affinity=False due to OS or container limitations imposed upon the process" to make it clearer why people should want to use the affinity=True argument.

PS thanks for making it keyword only!

I do wish this API never used the term "cpu"... everything these days is really a "logical_core" and what that even means depends a lot on underlying infrastructure and platform that Python may not be able to introspect. Way too late for that though. :)

My PR adds "logical CPU" to the doc. In previous bug reports, I saw some confusion between physical CPU core, CPU packages, CPU threads, and logical CPUs.

vstinner · 2023-09-21T16:48:26Z

I updated my PR:

Rename usable parameter to affinity.
Add Windows implementation using GetProcessAffinityMask().
Add _Py_popcount64().

vstinner · 2023-09-21T16:49:12Z

I don't think that my Windows implementation works with more than 64 CPUs :-(

https://learn.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-getprocessaffinitymask#return-value

It should be fixed before this PR can be merged.

vstinner · 2023-09-21T16:55:37Z

Manual test on Windows, affinity.py:

import os
with open("output", "w") as fp:
    try:
        print(os.cpu_count(affinity=True), file=fp)
    except Exception as exc:
        print(f"ERROR: {exc!r}", file=fp)

Without CPU affinity (all CPUs):

vstinner@WIN C:\victor\python\main>PCbuild\amd64\python_d.exe affinity.py                    
vstinner@WIN C:\victor\python\main>type output
2

With CPU affinity (limit to 1 CPU):

vstinner@WIN C:\victor\python\main>start /affinity 01 PCbuild\amd64\python_d.exe affinity.py
vstinner@WIN C:\victor\python\main>type output
1
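
`start /affinity` takes a hexadecimal bit mask in which bit N selects logical CPU N, so `01` pins the process to CPU 0. A small sketch of how such a mask maps to a set of CPUs follows. `cpus_in_mask` is a hypothetical helper, not part of the PR, and the mask values are only examples.

```python
def cpus_in_mask(mask: int) -> list[int]:
    """Return the indices of the bits set in an affinity mask."""
    cpus = []
    index = 0
    while mask:
        if mask & 1:
            cpus.append(index)  # bit `index` set: that CPU is allowed
        mask >>= 1
        index += 1
    return cpus


print(cpus_in_mask(0x01))       # the mask used in the manual test above
print(len(cpus_in_mask(0x03)))  # a mask that allows two CPUs
```

The mask returned by GetProcessAffinityMask() is pointer-sized, so on 64-bit Windows a single mask cannot describe more than 64 CPUs, which is the limitation raised earlier in this thread.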

vstinner · 2023-09-21T20:28:45Z

cc @taleinat

vstinner · 2023-09-21T21:32:27Z

I mark the PR as a draft to remind myself that either I should drop Windows support for now, or I should fix Windows support, before this PR can be considered ready to be reviewed (and then merged).

As I wrote previously, my current implementation is limited to 64 CPUs on Windows, which looks wrong. A Windows machine can have more than 64 CPUs: #67226. But I'm not sure if a process can be assigned to more than 64 CPUs. Well, I have to investigate :-)

gpshead · 2023-09-21T21:42:42Z

Doc/library/os.rst


+   Linux control groups, *cgroups*, are not taken in account to get the number


I feel like people wanting such an API may also want cgroups to also be considered when the real question being answered is really "How parallel am I usefully allowed to be?".

Does that need to be separated out into its own cgroups_cpuset=True flag so that people could query one or the other or both? The use cases I have in mind are all around the above question where I'd always want the combination aka min(logical_cpus, affinity_cores, cgroups_cpuset_cores).
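
To make the min(logical_cpus, affinity_cores, cgroups_cpuset_cores) idea concrete, here is a minimal sketch. It is not something this PR implements. The `parse_cpu_list` and `usable_parallelism` names are hypothetical, and the default path assumes a cgroup v2 hierarchy mounted at /sys/fs/cgroup; any source that cannot be read is simply skipped.

```python
import os


def parse_cpu_list(text: str) -> set[int]:
    """Parse a Linux CPU list such as "0-3,8" into a set of CPU indices."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def usable_parallelism(cpuset_path="/sys/fs/cgroup/cpuset.cpus.effective"):
    """Return the smallest CPU count that could be determined, or None."""
    counts = []
    system = os.cpu_count()
    if system:
        counts.append(system)
    if hasattr(os, "sched_getaffinity"):
        counts.append(len(os.sched_getaffinity(0)))
    try:
        # Assumed location of the effective cpuset (cgroup v2 only).
        with open(cpuset_path, encoding="ascii") as fp:
            cpuset = parse_cpu_list(fp.read())
    except (OSError, ValueError):
        cpuset = set()
    if cpuset:
        counts.append(len(cpuset))
    return min(counts) if counts else None


print(usable_parallelism())
```

This only looks at the cpuset; CPU bandwidth quotas set through cgroups are a separate limit and are not covered here.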

My PR just fixes the documentation to avoid any misunderstanding.

I feel like people wanting such an API may also want cgroups to also be considered

It is discussed in issue #80235. So far, nobody has proposed a PR to implement this.

Maybe this PR is a baby step forward :-)

Even if we decide to support cgroups in the future, I would propose not using a flag name that exposes an implementation detail. If some platform introduces something new, do we have to add a new flag for it?

Include/internal/pycore_bitutils.h

Lib/test/test_os.py

Implement cpu_count(affinity=True) with sched_getaffinity(). Changes: * Fix test_posix.test_sched_getaffinity(): restore the old CPU mask when the test completes! * Doc: Specify that os.cpu_count() counts *logical* CPUs. Mention that Linux cgroups are ignored.

vstinner · 2023-09-21T23:59:43Z

I updated my PR. It's now ready for review.

Changes:

Remove Windows implementation.
Test that os.cpu_count(affinity=True) <= os.cpu_count(affinity=False).
I reverted my minor test change when cpu_count() returns None.
The doc no longer announces that os.cpu_count(affinity=True) raises an exception on error, since it falls back on the os.cpu_count(affinity=False) code path if os.sched_getaffinity() is not available. It may be rephrased later when the feature is implemented for Windows.
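
The behavior described in this update can be approximated in pure Python. This is only a sketch of the semantics as stated above (keyword-only parameter, fallback when os.sched_getaffinity() is missing), not the C implementation from the PR. The module-level `cpu_count` wrapper is a stand-in for the proposed os.cpu_count(affinity=...).

```python
import os


def cpu_count(*, affinity=False):
    """Stand-in for the proposed os.cpu_count(affinity=...) parameter."""
    if affinity and hasattr(os, "sched_getaffinity"):
        # Logical CPUs the current process is allowed to run on.
        return len(os.sched_getaffinity(0))
    # System-wide logical CPU count; may be None if undetermined.
    return os.cpu_count()


print(cpu_count(), cpu_count(affinity=True))
```

On a platform without os.sched_getaffinity(), both calls return the same value, which matches the fallback described in the last item above.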

I mark the PR as a draft to remind myself that either I should drop Windows support for now, or I should fix Windows support, before this PR can be considered ready to be reviewed (and then merged).

I chose the easy way: remove the Windows implementation for now.

vstinner · 2023-09-22T11:04:14Z

Honestly, IMO cpu_count(affinity=True) is what most users expect by default: decide how many workers (threads or processes) should be spawned to maximize efficiency without killing performance. If a server has 100 CPUs but Python is limited by the admin to 2 CPUs, spawning 100 worker processes is likely to kill latency and may cause many timeout issues.

Problem: changing the default behavior is wrong for different reasons. For example, a program can simply query cpu_count() to display how many logical CPUs a machine has. It doesn't matter if this program is limited to 1 CPU or has access to all 100 CPUs of the server. It should always display 100.

The PYTHONCPUCOUNT env var and the -X cpu_count cmdline option sound complementary to my change:
#109595 (comment)

If we can use PYTHONCPUCOUNT=affinity / -X cpu_count=affinity, it would fit the first use case (spawn worker processes). It would break the second use case: such programs would have to modify their code to use cpu_count(affinity=False). But it's different: someone sets the env var knowing the consequences.

vstinner · 2023-09-26T13:54:07Z

@gpshead @corona10: So, are you ok with this change? Do you think that we can continue this approach later to maybe add cgroups parameter (if it makes sense and if it is needed)? This approach fits with issue gh-109595 design, no?

corona10 · 2023-09-26T14:29:33Z

So, are you ok with this change?

No objection; it will be needed by people who want cpu_count with the affinity option :)

gpshead · 2023-09-26T15:20:23Z

Same, let's do this. I'm even okay with an envvar + -X option to override the return value after our discussions (follow-on PR, I'd assume). And if Windows only supports a return value of up to 64 for the affinity feature (that came up in one of these discussions or PRs, IIRC?), just document the caveat while we get guidance from Windows experts on how to get a better answer there.

vstinner · 2023-09-26T15:51:17Z

I marked my PR as a draft again after a discussion with @corona10. I'm sorry about this back and forth.

It seems like there is a misunderstanding about "system CPU count" and "process CPU count". Depending on the use case, you may pick one or the other. The problem is that if you consider the -X cpu_count=4 option: which value should be overridden? System CPU count or process CPU count? It becomes very blurry.

So.

I created PR gh-109907 which adds a new os.process_cpu_count(). The os.cpu_count() stays unchanged: it sticks to its documentation, it returns the number of logical CPUs of the system.

process_cpu_count(): currently uses sched_getaffinity(), but later it may read cgroups and be affected by the -X cpu_count=value option.
cpu_count(): unchanged, number of system logical CPUs.
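
For code that has to run on Python versions with and without the new function, the split above can be approximated with a small compatibility helper. This is a hedged sketch: the local `process_cpu_count` wrapper and the fallback order are assumptions for illustration, not code from PR gh-109907.

```python
import os


def process_cpu_count():
    """Prefer os.process_cpu_count() when the running Python provides it."""
    native = getattr(os, "process_cpu_count", None)
    if native is not None:
        return native()
    if hasattr(os, "sched_getaffinity"):
        # Older Python on a platform exposing CPU affinity.
        return len(os.sched_getaffinity(0))
    # Last resort: the system-wide count, which may be None.
    return os.cpu_count()


# Size a worker pool from the process count, defaulting to 1 if unknown.
workers = process_cpu_count() or 1
print(workers)
```

A program that only wants to display the machine's size would keep calling os.cpu_count() directly, which is the second use case discussed earlier.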

I understood that @corona10 is unhappy about this proposal because he wants to override the CPU count of Python applications which currently use os.cpu_count(). I understand this use case, but there are other ways, like injecting a sitecustomize script that reads a new PYTHONCPUCOUNT variable and simply replaces the whole function:

Try sitecustomize.py:

import os
ncpu = os.environ.get('PYTHONCPUCOUNT', None)
if ncpu:
    try:
        ncpu = int(ncpu)
    except ValueError:
        print(f"WARNING: invalid PYTHONCPUCOUNT value: {ncpu!r}")
    else:
        def cpu_count():
            return ncpu
        cpu_count.__doc__ = os.cpu_count.__doc__
        os.cpu_count = cpu_count

Example:

vstinner@mona$ PYTHONPATH=$PWD ./python -c "import os; print(os.cpu_count())"
12
vstinner@mona$ PYTHONCPUCOUNT=4096 PYTHONPATH=$PWD ./python -c "import os; print(os.cpu_count())"
4096
vstinner@mona$ PYTHONCPUCOUNT=xxx PYTHONPATH=$PWD ./python -c "import os; print(os.cpu_count())"
WARNING: invalid PYTHONCPUCOUNT value: 'xxx'
12

Then put your sitecustomize.py somewhere in one of the sys.path directories. It also works with a magic .pth file I suppose.

corona10 · 2023-09-26T23:25:01Z

Then put your sitecustomize.py somewhere in one of the sys.path directories. It also works with a magic .pth file I suppose.

No, it doesn't solve the issue I reported. How can you inject into the sys.path of a customer's Docker image that is already built? Users will just pull the image from the container image registry where it is already stored. For a K8s admin, there is no way to control the CPU count except the command line or an environment variable.

gpshead · 2023-09-27T21:05:25Z

The problem is that if you consider -X cpu_count=4 option: which value should be overridden? System CPU count or process CPU count? It becomes very blurry.

When a -X or equivalent environment variable is encountered, override all return values from os.cpu_count(). That potential feature can be documented as the user explicitly asking our API to ignore whatever the OS APIs might have said.

gpshead · 2023-09-27T21:08:07Z

Realize that no matter what, these new features don't solve anyone's existing problems because they're having those problems on Python 3.12 and earlier, which will never see these features. They already need to concoct workarounds (which are still going to work in 3.13+).

Adding an affinity= parameter or adding a new os.process_cpu_count() API is useful as a feature regardless of any possible future plans for ways to override the return values here.

vstinner · 2023-09-30T21:35:22Z

Closed in favor of adding os.process_cpu_count() instead: PR gh-109907.

bedevere-app bot added the awaiting review label Sep 21, 2023

bedevere-app bot mentioned this pull request Sep 21, 2023

os.process_cpu_count(): get the number CPUs usable by the current process #109649

Closed

aisk reviewed Sep 21, 2023

Doc/library/os.rst

corona10 reviewed Sep 21, 2023

vstinner force-pushed the cpu_count_usable branch from 665dc28 to 6872c18 Compare September 21, 2023 16:43

vstinner changed the title ~~gh-109649: Add usable parameter to os.cpu_count()~~ gh-109649: Add affinity parameter to os.cpu_count() Sep 21, 2023

vstinner force-pushed the cpu_count_usable branch from 6872c18 to 38ba1c3 Compare September 21, 2023 16:47

vstinner force-pushed the cpu_count_usable branch from 38ba1c3 to ae114ac Compare September 21, 2023 16:53

vstinner marked this pull request as draft September 21, 2023 21:30

bedevere-app bot removed the awaiting review label Sep 21, 2023

gpshead reviewed Sep 21, 2023

Include/internal/pycore_bitutils.h

Lib/test/test_os.py

gpshead self-assigned this Sep 21, 2023

vstinner force-pushed the cpu_count_usable branch from ae114ac to 2762144 Compare September 21, 2023 23:54

vstinner marked this pull request as ready for review September 21, 2023 23:56

bedevere-app bot added the awaiting review label Sep 21, 2023

gpshead mentioned this pull request Sep 22, 2023

os.cpu_count() should return a count assigned to a container or that the process is restricted to #80235

Closed

vstinner mentioned this pull request Sep 22, 2023

Support python -Xcpu_count=<n> feature for container environment. #109595

Closed

vstinner marked this pull request as draft September 26, 2023 15:39

bedevere-app bot removed the awaiting review label Sep 26, 2023

vstinner mentioned this pull request Sep 26, 2023

gh-109649: Add os.process_cpu_count() function #109907

Merged

vstinner closed this Sep 30, 2023

vstinner deleted the cpu_count_usable branch September 30, 2023 21:35

