A fast C++ header-only library to help you easily develop parallel programs with complex task dependencies.
Cpp-Taskflow helps you quickly build parallel computation graphs using modern C++17. It is designed to be faster, more expressive, and easier to drop into existing code than libraries such as OpenMP Tasking and TBB FlowGraph.
The following example simple.cpp shows the basic API you need to use Cpp-Taskflow.
#include "taskflow.hpp" // the only include you need
int main(){
tf::Taskflow tf(std::thread::hardware_concurrency());
auto [A, B, C, D] = tf.silent_emplace(
[] () { std::cout << "TaskA\n"; }, // the taskflow graph
[] () { std::cout << "TaskB\n"; }, //
[] () { std::cout << "TaskC\n"; }, // +---+
[] () { std::cout << "TaskD\n"; } // +---->| B |-----+
); // | +---+ |
// +---+ +-v-+
A.precede(B); // B runs after A // | A | | D |
A.precede(C); // C runs after A // +---+ +-^-+
B.precede(D); // D runs after B // | +---+ |
C.precede(D); // D runs after C // +---->| C |-----+
// +---+
tf.wait_for_all(); // block until finished
return 0;
}
Compile and run the code with the following commands:
~$ g++ simple.cpp -std=c++1z -O2 -lpthread -o simple
~$ ./simple
TaskA
TaskC <-- concurrent with TaskB
TaskB <-- concurrent with TaskC
TaskD
Cpp-Taskflow has very expressive, neat, and clear methods to create complex dependency graphs. Most applications are developed through the following three steps.
To start a task dependency graph, create a taskflow object and specify the number of worker threads in the shared thread pool that carries out the tasks.
tf::Taskflow tf(std::max(1u, std::thread::hardware_concurrency()));
Create a task via the method emplace and get a pair of Task and future.
auto [A, F] = tf.emplace([](){ std::cout << "Task A\n"; return 1; });
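To make the task-and-future pairing concrete, here is a minimal sketch in standard C++ of how a callable can be split into a runnable task and a future that later holds its result. The helper name `make_task_and_future` is hypothetical and this is not how Cpp-Taskflow is implemented; it only illustrates the idea that the future becomes ready once the task has run.

```cpp
#include <functional>
#include <future>
#include <memory>
#include <utility>

// Hypothetical helper (not part of Cpp-Taskflow): wraps `fn` so that running
// the returned task stores fn's result in the returned future.
template <typename F>
auto make_task_and_future(F fn) {
  using R = decltype(fn());
  // packaged_task is move-only, so keep it in a shared_ptr to make the
  // wrapping task copyable.
  auto job = std::make_shared<std::packaged_task<R()>>(std::move(fn));
  std::future<R> result = job->get_future();
  std::function<void()> task = [job]() { (*job)(); };
  return std::make_pair(std::move(task), std::move(result));
}
```

With this sketch, `auto [task, result] = make_task_and_future([] { return 1; });` yields a future that stays pending until `task()` is called, after which `result.get()` returns the value.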
Or create a task via the method silent_emplace if you don't need a future to retrieve the result.
auto [A] = tf.silent_emplace([](){ std::cout << "Task A\n"; });
Both methods are variadic templates and accept any number of callables, so you can create multiple tasks at one time.
auto [A, B, C, D] = tf.silent_emplace(
[] () { std::cout << "Task A\n"; },
[] () { std::cout << "Task B\n"; },
[] () { std::cout << "Task C\n"; },
[] () { std::cout << "Task D\n"; }
);
Once tasks are created in the pool, you need to specify task dependencies in a Directed Acyclic Graph (DAG) fashion.
The class Task supports different methods for you to describe task dependencies.
Precede: Adding a preceding link forces one task to run before another.
A.precede(B); // A runs before B.
Broadcast: Adding a broadcast link forces one task to run ahead of other(s).
A.broadcast(B, C, D); // A runs before B, C, and D.
Gather: Adding a gathering link forces one task to run after other(s).
A.gather(B, C, D); // A runs after B, C, and D.
Linearize: Linearizing a task sequence adds a preceding link to each adjacent pair.
tf.linearize(A, B, C, D); // A runs before B, B runs before C, and C runs before D.
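All four link types reduce to the same "runs before" relation. The sketch below records that relation as plain edges between task names to show which edges each method implies. The `EdgeList` type and the free functions are hypothetical stand-ins written for illustration, not Cpp-Taskflow's own classes.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a task graph: each pair (u, v) means
// "u runs before v".
using EdgeList = std::vector<std::pair<std::string, std::string>>;

// precede: a single edge from `from` to `to`.
void precede(EdgeList& g, const std::string& from, const std::string& to) {
  g.emplace_back(from, to);
}

// broadcast: `from` runs before every task in `targets`.
void broadcast(EdgeList& g, const std::string& from,
               const std::vector<std::string>& targets) {
  for (const auto& t : targets) precede(g, from, t);
}

// gather: `to` runs after every task in `sources`.
void gather(EdgeList& g, const std::string& to,
            const std::vector<std::string>& sources) {
  for (const auto& s : sources) precede(g, s, to);
}

// linearize: each task runs before the next one in `chain`.
void linearize(EdgeList& g, const std::vector<std::string>& chain) {
  for (std::size_t i = 0; i + 1 < chain.size(); ++i) {
    precede(g, chain[i], chain[i + 1]);
  }
}
```

In this model, linearizing four tasks adds three edges (A to B, B to C, C to D), while gathering three tasks into A adds three edges that all end at A.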
There are three methods to carry out a task dependency graph: dispatch, silent_dispatch, and wait_for_all.
auto future = tf.dispatch(); // non-blocking, returns with a future immediately.
tf.silent_dispatch(); // non-blocking, no return
Calling wait_for_all will block until all tasks complete.
tf.wait_for_all();
Concurrent programs are notoriously difficult to debug. We suggest (1) naming tasks and dumping the graph, and (2) starting with a single thread before moving to multiple threads. Currently, Cpp-Taskflow supports dumping the graph in GraphViz format.
// debug.cpp
tf::Taskflow tf(0); // force the master thread to execute all tasks
auto A = tf.silent_emplace([] () { /* ... */ }).name("A");
auto B = tf.silent_emplace([] () { /* ... */ }).name("B");
auto C = tf.silent_emplace([] () { /* ... */ }).name("C");
auto D = tf.silent_emplace([] () { /* ... */ }).name("D");
auto E = tf.silent_emplace([] () { /* ... */ }).name("E");
A.broadcast(B, C, E);
C.precede(D);
B.broadcast(D, E);
std::cout << tf.dump();
Run the program and inspect whether dependencies are expressed in the right way.
~$ ./debug
digraph Taskflow {
"A" -> "B"
"A" -> "C"
"A" -> "E"
"B" -> "D"
"B" -> "E"
"C" -> "D"
}
A number of free GraphViz tools are available online to visualize your Taskflow graph.
Taskflow with five tasks and six dependencies, generated by Viz.js.
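For readers unfamiliar with the GraphViz format, the following sketch shows how an edge list can be turned into a `digraph` description like the dump above. The function `to_dot` is a hypothetical helper, and the exact whitespace it emits is an assumption; Cpp-Taskflow's own `dump` may format its output differently.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: renders "runs before" edges as a GraphViz digraph.
// Each pair (u, v) becomes one line of the form "u" -> "v".
std::string to_dot(
    const std::vector<std::pair<std::string, std::string>>& edges) {
  std::ostringstream os;
  os << "digraph Taskflow {\n";
  for (const auto& e : edges) {
    os << "  \"" << e.first << "\" -> \"" << e.second << "\"\n";
  }
  os << "}\n";
  return os.str();
}
```

The resulting string can be pasted into any GraphViz viewer, or saved to a file and rendered with the `dot` command-line tool.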
The class tf::Taskflow is the main place to create taskflow graphs and carry out task dependencies.
The table below summarizes its commonly used methods.
Method | Argument | Return | Description |
---|---|---|---|
Taskflow | none | none | construct a taskflow with the worker count equal to max hardware concurrency |
Taskflow | size | none | construct a taskflow with a given number of workers |
emplace | callables | tasks, futures | insert nodes to execute the given callables; results can be retrieved from the returned futures |
silent_emplace | callables | tasks | insert nodes to execute the given callables |
placeholder | none | task | insert a node without any work; work can be assigned later |
linearize | task list | none | create a linear dependency in the given task list |
parallel_for | beg, end, callable, group | task pair | apply the callable in parallel and group-by-group to the result of dereferencing every iterator in the range |
parallel_for | container, callable, group | task pair | apply the callable in parallel and group-by-group to each element in the container |
dispatch | none | future | dispatch the current graph and return a shared future to block on completion |
silent_dispatch | none | none | dispatch the current graph |
wait_for_all | none | none | dispatch the current graph and block until all graphs including previously dispatched ones finish |
num_nodes | none | size | return the number of nodes in the current graph |
num_workers | none | size | return the number of working threads in the pool |
num_topologies | none | size | return the number of dispatched graphs |
dump | none | string | dump the current graph to a string of GraphViz format |
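The `group` argument of `parallel_for` in the table above controls how many elements are handled together. As a rough illustration of that idea, the sketch below splits a range into chunks of `group` elements and runs each chunk as one asynchronous job using only the standard library. The name `parallel_for_sketch` is hypothetical, and unlike the library method, which returns a task pair to wire into a graph, this sketch runs eagerly and blocks until every chunk has finished.

```cpp
#include <algorithm>
#include <cstddef>
#include <future>
#include <iterator>
#include <vector>

// Hypothetical sketch: applies `fn` to every element in [beg, end).
// Elements are handed out in chunks of `group`; each chunk is one async job.
template <typename It, typename F>
void parallel_for_sketch(It beg, It end, F fn, std::size_t group = 1) {
  using Diff = typename std::iterator_traits<It>::difference_type;
  if (group == 0) group = 1;  // guard against an empty chunk size
  std::vector<std::future<void>> chunks;
  while (beg != end) {
    const auto remaining = static_cast<std::size_t>(std::distance(beg, end));
    const auto step = static_cast<Diff>(std::min(group, remaining));
    It chunk_end = std::next(beg, step);
    chunks.push_back(std::async(std::launch::async, [beg, chunk_end, fn]() {
      std::for_each(beg, chunk_end, fn);
    }));
    beg = chunk_end;
  }
  // Block until every chunk has finished (and rethrow any exception).
  for (auto& c : chunks) c.get();
}
```

A larger `group` means fewer, heavier jobs and less scheduling overhead; a smaller `group` balances uneven workloads better. Because each element belongs to exactly one chunk, a callable that only touches its own element needs no extra synchronization.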
Each tf::Taskflow::Task object is a lightweight handle for you to create dependencies in its associated graph.
The table below summarizes its methods.
Method | Argument | Return | Description |
---|---|---|---|
name | string | self | assign a human-readable name to the task |
work | callable | self | assign a callable object as the work of the task |
precede | task | self | enable this task to run before the given task |
broadcast | task list | self | enable this task to run before the given tasks |
gather | task list | self | enable this task to run after the given tasks |
num_dependents | none | size | return the number of dependents (inputs) of this task |
num_successors | none | size | return the number of successors (outputs) of this task |
While Cpp-Taskflow lets you express very complex task dependency graphs that may contain thousands of task nodes and links, there are a few common pitfalls and mistakes to be aware of.
- Having a cycle in a graph may result in running forever.
- Trying to modify a dispatched task can result in undefined behavior.
- Touching a taskflow from multiple threads is not safe.
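The first pitfall can be caught before anything runs. The sketch below checks an edge list for cycles with Kahn's topological-sort algorithm; it is a standalone illustration, not a Cpp-Taskflow feature. The function name `has_cycle` and the convention of numbering tasks from 0 to `num_tasks - 1` are assumptions made for the example, and every edge is assumed to reference a valid task index.

```cpp
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical check: returns true if the "runs before" edges among tasks
// 0..num_tasks-1 contain a cycle. Assumes all indices are < num_tasks.
bool has_cycle(std::size_t num_tasks,
               const std::vector<std::pair<std::size_t, std::size_t>>& edges) {
  std::vector<std::vector<std::size_t>> successors(num_tasks);
  std::vector<std::size_t> num_dependents(num_tasks, 0);
  for (const auto& e : edges) {
    successors[e.first].push_back(e.second);
    ++num_dependents[e.second];
  }

  // Start from tasks that have no unfinished dependents.
  std::queue<std::size_t> ready;
  for (std::size_t i = 0; i < num_tasks; ++i) {
    if (num_dependents[i] == 0) ready.push(i);
  }

  // "Run" tasks in topological order, releasing successors as we go.
  std::size_t finished = 0;
  while (!ready.empty()) {
    const std::size_t u = ready.front();
    ready.pop();
    ++finished;
    for (std::size_t v : successors[u]) {
      if (--num_dependents[v] == 0) ready.push(v);
    }
  }

  // Tasks on a cycle never become ready, so they are never counted.
  return finished != num_tasks;
}
```

For the diamond graph from the first example (A before B and C, both before D), the check reports no cycle; adding an edge from D back to A makes it report one.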
Cpp-Taskflow is known to work on most Linux distributions and OSX. Please let me know if you find any issues on a particular platform.
To use Cpp-Taskflow, you only need a C++17 compiler:
- GNU C++ Compiler G++ v7.2 with -std=c++1z
- Clang 5.0 C++ Compiler with -std=c++17
Cpp-Taskflow uses CMake to build examples and unit tests. We recommend using an out-of-source build.
~$ cmake --version # must be 3.9 or higher
~$ mkdir build
~$ cd build
~$ cmake ../
~$ make
Cpp-Taskflow uses Doctest for unit tests.
~$ ./unittest/taskflow
Alternatively, you can use CMake's testing framework to run the unit tests.
~$ cd build
~$ make test
The folder example/ contains several examples and is a great place to learn to use Cpp-Taskflow.
Example | Description |
---|---|
simple.cpp | use basic task building blocks to create a trivial taskflow graph |
matrix.cpp | create two sets of matrices and multiply each individually in parallel |
parallel_for.cpp | parallelize a for loop with unbalanced workload |
- Report bugs/issues by submitting a Github issue.
- Submit contributions using pull requests.
- Live chat and ask questions on Gitter.
Cpp-Taskflow is being actively developed and contributed to by the following people:
- Tsung-Wei Huang created the Cpp-Taskflow project and implemented the core routines.
- Chun-Xun Lin co-created the Cpp-Taskflow project and implemented the core routines.
- Martin Wong supported the Cpp-Taskflow project through NSF and DARPA funding.
- Nan Xiao fixed a compilation error in the unit tests on the Arch platform.
See also the list of contributors who participated in this project. Please let me know if I forgot someone!
Cpp-Taskflow is being used in both industry and academic projects to scale up existing workloads that incorporate complex task dependencies. A proprietary research report has shown over 10x improvement by switching to Cpp-Taskflow.
- OpenTimer: A High-performance Timing Analysis Tool for VLSI Systems.
- DtCraft: A General-purpose Distributed Programming System.
Please let me know if I forgot your project!
Cpp-Taskflow is licensed under the MIT License:
Copyright © 2018 Tsung-Wei Huang, Chun-Xun Lin, Martin Wong.
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.