Description
The workflow's DAG structure is maintained in DaskVine, whereas TaskVine iterates over the submitted tasks to check whether each one can be committed.
The data dependencies between tasks are fixed at submission time, but TaskVine does not use this information.
This leads to scheduling inefficiencies: tasks are repeatedly evaluated for committability even when none of their inputs have been materialized, which delays finding the tasks that are actually runnable.
For example, tasks with unmaterialized inputs should be enqueued in q->pending_tasks, and all other tasks in q->ready_tasks. A cache-update message for a file then allows its producer tasks to be pruned and its consumer tasks to be scheduled, while an unlink message moves the affected tasks from the ready queue back to the pending queue.
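The queue split described above can be modeled in a few lines. This is only a sketch of the idea, written in Python for brevity rather than in TaskVine's C. The class `DependencyQueues` and its methods `submit`, `on_cache_update`, and `on_unlink` are hypothetical names and not part of the TaskVine API. Pruning of producer tasks is left out to keep the model small.

```python
from collections import defaultdict, deque


class DependencyQueues:
    """Toy model of the proposed pending/ready split (all names hypothetical)."""

    def __init__(self):
        self.materialized = set()          # files currently cached somewhere
        self.inputs = {}                   # task id -> set of input file names
        self.consumers = defaultdict(set)  # file name -> ids of tasks reading it
        self.pending = set()               # stands in for q->pending_tasks
        self.ready = deque()               # stands in for q->ready_tasks

    def submit(self, task_id, inputs):
        """Place a new task in ready or pending based on its inputs."""
        self.inputs[task_id] = set(inputs)
        for name in inputs:
            self.consumers[name].add(task_id)
        if self.inputs[task_id] <= self.materialized:
            self.ready.append(task_id)
        else:
            self.pending.add(task_id)

    def on_cache_update(self, name):
        """A file became available: promote consumers whose inputs are all present."""
        self.materialized.add(name)
        # sorted() only makes the promotion order deterministic in this model
        for task_id in sorted(self.consumers.get(name, ())):
            if task_id in self.pending and self.inputs[task_id] <= self.materialized:
                self.pending.remove(task_id)
                self.ready.append(task_id)

    def on_unlink(self, name):
        """A file was removed: demote ready consumers back to pending."""
        self.materialized.discard(name)
        for task_id in self.consumers.get(name, ()):
            if task_id in self.ready:
                self.ready.remove(task_id)
                self.pending.add(task_id)
```

With this layout the scheduler only ever scans `ready`; a pending task is touched only when one of its own inputs changes state.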
In addition, graph operations on the C side are more efficient than in Python, which in principle allows more complex graph optimizations without Python overhead becoming the bottleneck.
For example, if the only worker holding a file is lost, we can easily compute the recovery cost by iterating over the file's upstream tasks. We can also merge a subgraph of tasks and commit it as a single task to reduce scheduling latency and improve data locality.
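As a rough illustration of the recovery-cost computation, the sketch below walks upstream from a lost file and sums the cost of every task that would have to be re-run. It is written in Python for brevity and is not TaskVine code. The function `recovery_cost` and the dictionaries `producer_of`, `task_inputs`, and `task_cost` are assumed for the example, as is the idea that each task carries a single scalar cost such as its measured runtime.

```python
def recovery_cost(lost_file, producer_of, task_inputs, task_cost, materialized):
    """Sum the cost of all tasks that must re-run to regenerate lost_file.

    producer_of:  file name -> id of the task that produces it
    task_inputs:  task id -> list of input file names
    task_cost:    task id -> cost of running that task once
    materialized: set of files that still exist on some worker
    """
    total = 0
    seen_tasks = set()
    stack = [lost_file]
    while stack:
        name = stack.pop()
        if name in materialized:
            continue  # still available, nothing to recompute
        task = producer_of.get(name)
        if task is None or task in seen_tasks:
            continue  # source file without a producer, or task already counted
        seen_tasks.add(task)
        total += task_cost[task]
        # any missing inputs of this task must be regenerated as well
        stack.extend(task_inputs[task])
    return total
```

The traversal stops at the first materialized ancestor on each path, so the cost reflects only the part of the graph that actually has to be recomputed.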
All graph operations would be handled in TaskVine, with DaskVine serving as an additional layer on top that uses the exposed APIs.