gh-109649: Add os.process_cpu_count() function #109907

vstinner · 2023-09-26T15:44:03Z

Fix test_posix.test_sched_getaffinity(): restore the old CPU mask when the test completes!
Doc: Specify that os.cpu_count() counts logical CPUs.

Issue: os.process_cpu_count(): get the number CPUs usable by the current process #109649

📚 Documentation preview 📚: https://cpython-previews--109907.org.readthedocs.build/

vstinner · 2023-09-26T15:53:17Z

See previous discussions:

vstinner · 2023-09-26T15:55:32Z

Adding a new process_cpu_count() makes it easier to extend the function later for new use cases, instead of modifying the existing os.cpu_count():

Maybe read Linux cgroups
Maybe add a pid parameter to get the CPU count of another process.
-X cpu_count=value option
Future new exciting stuff :-)

About the pid parameter: first, I would like to implement this function on Windows (get the process affinity), and then check whether it is possible to get the CPU affinity of another process on Windows, before considering adding a pid parameter.

vstinner · 2023-09-26T18:40:02Z

In psutil, this function is called Process.cpu_affinity(): https://psutil.readthedocs.io/en/latest/#psutil.Process.cpu_affinity

vstinner · 2023-09-29T14:19:30Z

I prefer adding a new os.process_cpu_count() function, instead of adding a process=True parameter to the existing os.cpu_count() function because:

I am considering adding an optional pid argument to os.process_cpu_count()
IMO the word "process" makes it explicit that the result is specific to the process, whereas cpu_count() is system-wide
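
To illustrate the distinction between the two functions, here is a minimal sketch. It assumes os.process_cpu_count() is only present on Python versions that include this PR, so it falls back to the affinity mask or to os.cpu_count() elsewhere. The helper name usable_cpus is hypothetical and not part of the PR.

```python
import os


def usable_cpus():
    """Return the number of CPUs usable by the caller, or None if unknown."""
    # Hypothetical helper: prefer the function added by this PR when present.
    process_cpu_count = getattr(os, "process_cpu_count", None)
    if process_cpu_count is not None:
        return process_cpu_count()
    # Older Python: derive the count from the affinity mask where one exists.
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    # No affinity support: the system-wide count is the best available answer.
    return os.cpu_count()


print(f"system-wide: {os.cpu_count()}, usable here: {usable_cpus()}")
```

On a machine where the process is pinned to a subset of CPUs (for example with taskset on Linux), the second number is expected to be smaller than the first.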

corona10 · 2023-09-29T15:19:15Z

I will take a look at it by tomorrow!

vstinner · 2023-09-29T15:26:35Z

cc @gpshead @indygreg

Modules/posixmodule.c

vstinner · 2023-09-29T23:06:11Z

@gpshead: I updated my PR to reimplement os.process_cpu_count() in Python.

Lib/os.py

vstinner · 2023-09-30T07:43:16Z

cpu_set = sched_getaffinity(0)
num_cpus = cpu_count()
return min(len(cpu_set), num_cpus) if cpu_set else num_cpus

Why do you want to return cpu_count() if it's smaller than len(sched_getaffinity(0))? It should not happen. If it happens, I would prefer not to work around bugs, but to fix the OS instead. For me, it should not happen.

sched_getaffinity() is basically a "mask" on the list of all available CPUs. It cannot be larger, unless cpu_count() ... counts CPUs differently. In that case, we should fix cpu_count() instead, no?
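
For comparison, here is a sketch of the two variants under discussion: the clamped form from the review suggestion and the plain affinity count argued for here. The function names are hypothetical, the None handling is an added assumption, and neither function is presented as the code that was eventually merged.

```python
import os


def affinity_count():
    """Plain variant: trust the affinity mask as-is."""
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    # Platforms without affinity support fall back to the system-wide count.
    return os.cpu_count()


def clamped_count():
    """Clamped variant: never report more CPUs than os.cpu_count()."""
    num_cpus = os.cpu_count()
    if not hasattr(os, "sched_getaffinity"):
        return num_cpus
    cpu_set = os.sched_getaffinity(0)
    if not cpu_set:
        return num_cpus
    if num_cpus is None:
        # Assumption: with no system-wide count, the mask is all we have.
        return len(cpu_set)
    return min(len(cpu_set), num_cpus)


print(f"{affinity_count()=} {clamped_count()=}")
```

If the mask really is a subset of the CPUs counted by os.cpu_count(), both functions return the same value, which is the argument for keeping the plain variant.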

When I added time.monotonic(), I added an assertion in debug mode to make sure that... the clock is monotonic (doesn't go backward). Surprise, surprise. The clock is not. On some virtual machines, I saw it jumping backwards sometimes. Honestly. I just removed the assertion. Python cannot and should not work around OS bugs, but just expose them. Maybe for the worst known OS bugs, we might detect them and issue a warning. But I'm not even sure if it's worth it.

I recall that Python detects broken poll() implementation on macOS. In that case, we go further: we remove the function at runtime! (at Python startup)

I prefer to not make assumptions about hypothetical bugs, but wait until we get real concrete bug reports, and then decide how to deal with them.

Anyway, thanks for thinking about all corner cases, it's a good thing!

corona10 · 2023-09-30T11:44:12Z

Why do you want to return cpu_count() if it's smaller than len(sched_getaffinity(0))? It should not happen. If it happens, I would prefer not to work around bugs, but to fix the OS instead. For me, it should not happen.

If we pass -Xcpu_count=limited_number, this allows a more efficient implementation: we only have to override os.cpu_count. (It will automatically override os.process_cpu_count.)

corona10

Overall LGTM, for the detail, I will delegate it to @gpshead

vstinner · 2023-09-30T12:48:23Z

If we pass -Xcpu_count=limited_number, this allows a more efficient implementation: we only have to override os.cpu_count. (It will automatically override os.process_cpu_count.)

I would prefer to discuss the implementation and the expected behavior separately.

Also, can we discuss -Xcpu_count later? This PR is about adding process_cpu_count(), nothing more.

gpshead · 2023-09-30T20:58:43Z

Doc/library/os.rst

@@ -5202,6 +5200,17 @@ Miscellaneous System Information
   .. availability:: Unix.


+.. function:: process_cpu_count()
+
+   Get the number of logical CPUs usable by the current process. Returns


It is probably more accurate to say "usable by the calling thread of the current process". I believe each thread can have its own affinity.

(For parallelism planning purposes, the main thread prior to spawning others is likely what people would be calling this from anyways)

Sadly, you're right. I updated the doc.
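
As an illustration of the parallelism-planning use case mentioned above, here is a minimal sketch that sizes a thread pool from the main thread before any worker is spawned. The names pick_worker_count, square, and run are hypothetical, and the fallback to os.cpu_count() is an assumption for Python versions without os.process_cpu_count().

```python
import os
from concurrent.futures import ThreadPoolExecutor


def pick_worker_count(default=1):
    """Return a worker count based on the CPUs usable by the calling thread."""
    # Assumption: fall back to os.cpu_count() where process_cpu_count is absent.
    counter = getattr(os, "process_cpu_count", os.cpu_count)
    # Both functions may return None, so fall back to a caller-chosen default.
    return counter() or default


def square(x):
    return x * x


def run():
    # Query the count in the main thread, before spawning any worker thread.
    workers = pick_worker_count()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(square, range(8)))


print(run())
```

Because affinity is per thread, calling pick_worker_count() from a worker whose mask was changed could give a different answer than calling it from the main thread.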

gpshead

One documentation improvement suggested but otherwise good.

vstinner · 2023-09-30T21:18:56Z

@gpshead is right: the number of usable CPUs is "per thread", not per process.

Example:

import os
import threading

def sched_remove_one_cpu():
    mask = os.sched_getaffinity(0)
    mask.pop()
    os.sched_setaffinity(0, mask)

def set_affinity():
    tid = threading.get_native_id()
    print(f"thread {tid}: remove a second CPU")
    sched_remove_one_cpu()
    print(f"thread {tid}: {os.process_cpu_count()=}")

tid = threading.get_native_id()

print(f"main thread {tid}: remove a CPU")
sched_remove_one_cpu()

print(f"main thread {tid} before: {os.cpu_count()=}")
print(f"main thread {tid} before: {os.process_cpu_count()=}")

print()
thread = threading.Thread(target=set_affinity)
thread.start()
thread.join()
print()

print(f"main thread {tid} after: {os.cpu_count()=}")
print(f"main thread {tid} after: {os.process_cpu_count()=}")

Output on Linux:

main thread 2059677: remove a CPU
main thread 2059677 before: os.cpu_count()=12
main thread 2059677 before: os.process_cpu_count()=11

thread 2059678: remove a second CPU
thread 2059678: os.process_cpu_count()=10

main thread 2059677 after: os.cpu_count()=12
main thread 2059677 after: os.process_cpu_count()=11

You can see that the main thread loses a CPU when sched_remove_one_cpu() is called: os.process_cpu_count() is affected, but os.cpu_count() is not affected.

When a thread removes a second CPU, it affects os.process_cpu_count() in that thread, but the second CPU removal does not affect the main thread.

A new thread inherits the mask of the thread that created it, but a change in a child thread does not affect the parent thread. Well, that's the expected behavior.
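
Since sched_setaffinity() changes the calling thread's mask until it is set again, code that narrows the mask temporarily (as a test might) can restore it afterwards. The following is a sketch of that idea, not the code of the test fix in this PR. The context manager name restricted_affinity is hypothetical, and on platforms without sched_setaffinity() it does nothing.

```python
import contextlib
import os


@contextlib.contextmanager
def restricted_affinity(cpus):
    """Temporarily restrict the calling thread to `cpus`, then restore."""
    if not hasattr(os, "sched_setaffinity"):
        # Assumption: without affinity support there is nothing to change.
        yield None
        return
    old_mask = os.sched_getaffinity(0)
    os.sched_setaffinity(0, cpus)
    try:
        # Hand back the mask actually in effect inside the block.
        yield os.sched_getaffinity(0)
    finally:
        # Always restore the original mask, even if the body raised.
        os.sched_setaffinity(0, old_mask)


if hasattr(os, "sched_getaffinity"):
    first_cpu = min(os.sched_getaffinity(0))
    with restricted_affinity({first_cpu}) as mask:
        print(f"inside: {mask}")
    print(f"after: {os.sched_getaffinity(0)}")
```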

Doc/library/os.rst

* Refactor os_sched_getaffinity_impl(): move variable definitions to their first assignment.
* Fix test_posix.test_sched_getaffinity(): restore the old CPU mask when the test completes!
* Doc: Specify that os.cpu_count() counts *logical* CPUs.
* Doc: Specify that os.sched_getaffinity(0) is related to the calling thread.

vstinner · 2023-09-30T21:30:00Z

I made a few further changes to clarify the differences between cpu_count(), process_cpu_count() and sched_getaffinity().

vstinner · 2023-09-30T21:48:46Z

Tests / Hypothesis tests on Ubuntu (pull_request) Failing after 6m

FAIL: test_find_periodic_pattern (test.test_userstring.UserStringTest.test_find_periodic_pattern) (p='', text='')

I created issue GH-110160 for this bug.

vstinner · 2023-09-30T22:12:40Z

Unrelated CI failures:

Windows x86: FAIL: test_kbhit (test.test_msvcrt.TestConsoleIO.test_kbhit): test_msvcrt: test_getwche() failed with timeout (10 min) on GHA Windows x64 #110147 (comment)
Windows x86: FAIL: test_timeout (test.test_multiprocessing_spawn.test_manager.WithManagerTestBarrier.test_timeout): test_multiprocessing_spawn: test_timeout() failed on GHA Windows x86 #110162
Hypothesis tests on Ubuntu: FAIL: test_find_periodic_pattern (test.test_userstring.UserStringTest.test_find_periodic_pattern) (p='', text=''): test_userstring: test_find_periodic_pattern() failed on GHA Hypothesis tests on Ubuntu #110160

vstinner · 2023-09-30T22:13:05Z

Merged, thanks for reviews @gpshead and @corona10!

bedevere-bot · 2023-09-30T22:18:55Z

⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️

Hi! The buildbot aarch64 RHEL8 3.x has failed when building commit c815210.

What do you need to do:

Don't panic.
Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
Go to the page of the buildbot that failed (https://buildbot.python.org/all/#builders/529/builds/5071) and take a look at the build logs.
Check if the failure is related to this commit (c815210) or if it is a false positive.
If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/all/#builders/529/builds/5071

Summary of the results of the build (if available):

==

Traceback logs:

Traceback (most recent call last):
  File "/home/buildbot/buildarea/3.x.cstratak-RHEL8-aarch64/build/Lib/threading.py", line 1066, in _bootstrap_inner
    self.run()
  File "/home/buildbot/buildarea/3.x.cstratak-RHEL8-aarch64/build/Lib/threading.py", line 1003, in run
    self._target(*self._args, **self._kwargs)
  File "/home/buildbot/buildarea/3.x.cstratak-RHEL8-aarch64/build/Lib/test/test_interpreters.py", line 484, in task
    interp = interpreters.create()
             ^^^^^^^^^^^^^^^^^^^^^
  File "/home/buildbot/buildarea/3.x.cstratak-RHEL8-aarch64/build/Lib/test/support/interpreters.py", line 25, in create
    id = _interpreters.create(isolated=isolated)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: interpreter creation failed

Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 1354, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1325, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 929, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 1004, in exec_module
  File "<frozen importlib._bootstrap_external>", line 1100, in get_code
  File "<frozen importlib._bootstrap_external>", line 1199, in get_data
TypeError: descriptor 'close' for '_io.BufferedReader' objects doesn't apply to a '_io.FileIO' object

gpshead · 2023-09-30T22:53:48Z

If you're inclined to do so, the documentation improvements to the existing APIs in this PR would be worthwhile backporting to the 3.12 branch so that they show up on the default /3/ docs on python.org soon.

vstinner · 2023-10-01T01:05:02Z

@gpshead:

if you're inclined to do so, the documentation improvements to the existing APIs in this PR would be worthwhile backporting to the 3.12 branch so that they show up on the default /3/ docs on python.org soon.

It makes sense. I wrote PR gh-110169 for Python 3.12 (and added "backport to 3.11" label).

bedevere-app bot added the awaiting review label Sep 26, 2023

bedevere-app bot mentioned this pull request Sep 21, 2023

os.process_cpu_count(): get the number CPUs usable by the current process #109649

Closed

vstinner mentioned this pull request Sep 26, 2023

gh-109649: Add affinity parameter to os.cpu_count() #109652

Closed

vstinner force-pushed the process_cpu_count branch from e3554fa to 69c8bf4 Compare September 26, 2023 15:52

vstinner marked this pull request as draft September 27, 2023 00:41

bedevere-app bot removed the awaiting review label Sep 27, 2023

vstinner force-pushed the process_cpu_count branch from 69c8bf4 to 7832caa Compare September 29, 2023 13:32

vstinner marked this pull request as ready for review September 29, 2023 13:33

bedevere-app bot added the awaiting review label Sep 29, 2023

vstinner mentioned this pull request Sep 29, 2023

Support python -Xcpu_count=<n> feature for container environment. #109595

Closed

corona10 self-requested a review September 29, 2023 15:18

gpshead reviewed Sep 29, 2023

Modules/posixmodule.c (outdated)

gpshead reviewed Sep 30, 2023

Lib/os.py

corona10 approved these changes Sep 30, 2023

bedevere-app bot added awaiting core review and removed awaiting review labels Sep 30, 2023

gpshead reviewed Sep 30, 2023

gpshead approved these changes Sep 30, 2023

gpshead reviewed Sep 30, 2023

Doc/library/os.rst (outdated)

vstinner force-pushed the process_cpu_count branch from 803f717 to 74ed3db Compare September 30, 2023 21:29

Update docstring as well

9018c69

vstinner merged commit c815210 into python:main Sep 30, 2023

vstinner deleted the process_cpu_count branch September 30, 2023 22:12

bedevere-app bot removed the awaiting core review label Sep 30, 2023

encukou mentioned this pull request Mar 28, 2024

[doc] add platform availabity information for os.sched_getaffinity #88047

Closed
