Work stealing hints for heterogeneous memory architectures

Hello! I have an idea to reduce memory latency on NUMA or CCX-based systems.

I've read the older issues here on this topic, which argue that work stealing matters more for performance than partitioning tasks by NUMA node. That makes sense. I'm not familiar with the details of the TaskFlow work-stealing algorithm, but would it be possible to give hints about which threads work should be stolen from? Given a choice of local queues to steal from, such a hint would let the scheduler favour queues on the same NUMA node, or the queue of a hyperthreading sibling.
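To make the idea concrete, here is a minimal sketch of how such a hint could rank steal victims. It is not TaskFlow's API or its actual algorithm: `WorkerPlacement`, `steal_distance`, and `victim_order` are hypothetical names, and the placement of each worker is assumed to be known up front (for example from the affinity settings).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of where a worker thread runs.
// Not part of TaskFlow; assumed to be filled in by the application.
struct WorkerPlacement {
  int numa_node;  // NUMA node (or CCX) the worker is pinned to
  int core;       // physical core id; hyperthreading siblings share it
};

// Smaller value means a more attractive victim for the thief:
// 0 = hyperthreading sibling, 1 = same NUMA node, 2 = remote node.
inline int steal_distance(const WorkerPlacement& thief,
                          const WorkerPlacement& victim) {
  if (thief.numa_node == victim.numa_node && thief.core == victim.core) {
    return 0;
  }
  if (thief.numa_node == victim.numa_node) {
    return 1;
  }
  return 2;
}

// Returns the indices of all other workers, closest first.
// Workers at the same distance keep their original relative order.
inline std::vector<std::size_t> victim_order(
    const std::vector<WorkerPlacement>& workers, std::size_t thief) {
  assert(thief < workers.size());
  std::vector<std::size_t> order;
  order.reserve(workers.size() - 1);
  for (std::size_t w = 0; w < workers.size(); ++w) {
    if (w != thief) {
      order.push_back(w);
    }
  }
  std::stable_sort(order.begin(), order.end(),
                   [&](std::size_t a, std::size_t b) {
                     return steal_distance(workers[thief], workers[a]) <
                            steal_distance(workers[thief], workers[b]);
                   });
  return order;
}
```

A strict ordering like this is only the simplest form of the hint. A scheduler that picks victims at random could instead use the distance to bias the random choice, so that remote queues are still tried when the local ones are empty.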

Combined with thread affinity settings, this could be beneficial for certain memory-bound applications.
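For reference, this is roughly what I mean by thread affinity settings. It is a Linux-only sketch built on the glibc `sched_getaffinity` / `sched_setaffinity` calls; the helper names `first_allowed_cpu` and `pin_current_thread_to` are made up for illustration, and how this would be wired into TaskFlow's worker threads is left open.

```cpp
#include <sched.h>  // sched_getaffinity, sched_setaffinity, cpu_set_t (Linux/glibc)

// Returns the lowest-numbered CPU the calling thread is allowed to run on,
// or -1 if the affinity mask cannot be read.
inline int first_allowed_cpu() {
  cpu_set_t allowed;
  CPU_ZERO(&allowed);
  if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0) {
    return -1;
  }
  for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
    if (CPU_ISSET(cpu, &allowed)) {
      return cpu;
    }
  }
  return -1;
}

// Restricts the calling thread to a single CPU.
// Passing pid 0 to sched_setaffinity targets the calling thread.
// Returns true on success.
inline bool pin_current_thread_to(int cpu) {
  if (cpu < 0 || cpu >= CPU_SETSIZE) {
    return false;
  }
  cpu_set_t mask;
  CPU_ZERO(&mask);
  CPU_SET(cpu, &mask);
  return sched_setaffinity(0, sizeof(mask), &mask) == 0;
}
```

With every worker pinned this way, each worker's NUMA node and sibling relationship are fixed for the lifetime of the program, which is what would make a stealing hint meaningful.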


Work stealing hints for heterogeneous memory architectures #581
