Open
Labels: question (Further information is requested)
Description
Hello! I have an idea to reduce memory latency on NUMA or CCX-based systems.
I've read the older issues here on this topic, which argue that work stealing matters more for performance than partitioning tasks by NUMA node. That makes sense. I'm not familiar with the details of Taskflow's work-stealing algorithm, but would it be possible to give hints about which threads work should be stolen from? When a worker has several local queues to choose from, such a hint would let it favour queues on the same NUMA node, or the queue of its hyperthreading sibling.
Combined with thread affinity settings, this could be beneficial for certain memory-bound applications.
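To make the idea more concrete, here is a rough sketch of what a locality-aware victim order could look like. It is standalone C++ and does not use or reflect Taskflow's actual internals: `WorkerTopology`, `victim_order`, and the three-tier ordering are hypothetical names and choices made up for illustration, and the topology values are assumed to be supplied by the user (for example, from whatever affinity settings were applied).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical per-worker topology hint (not part of any existing API).
struct WorkerTopology {
    int numa_node;      // NUMA node (or CCX) the worker thread is pinned to
    int physical_core;  // physical core id; hyperthreading siblings share it
};

// Returns the order in which worker `self` should try to steal from others:
//   tier 0: hyperthreading siblings (same node and same physical core)
//   tier 1: other workers on the same NUMA node
//   tier 2: workers on remote nodes
// Each tier is shuffled, so victim choice stays random within a locality level.
template <typename Rng>
std::vector<std::size_t> victim_order(const std::vector<WorkerTopology>& topo,
                                      std::size_t self, Rng& rng) {
    assert(self < topo.size());
    std::vector<std::size_t> tiers[3];
    for (std::size_t w = 0; w < topo.size(); ++w) {
        if (w == self) continue;  // a worker never steals from itself
        const bool same_node = topo[w].numa_node == topo[self].numa_node;
        const bool sibling =
            same_node && topo[w].physical_core == topo[self].physical_core;
        tiers[sibling ? 0 : (same_node ? 1 : 2)].push_back(w);
    }
    std::vector<std::size_t> order;
    order.reserve(topo.size() - 1);
    for (auto& tier : tiers) {
        std::shuffle(tier.begin(), tier.end(), rng);
        order.insert(order.end(), tier.begin(), tier.end());
    }
    return order;
}
```

A worker that runs out of local work would attempt a steal from each queue in this order and stop at the first success. Because remote-node queues are still in the list, just last, this only biases stealing towards nearby queues and does not partition work by node.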