Thanks to visit codestin.com
Credit goes to files.inria.fr

StarPU Handbook
1. Introduction

Foreword

This manual documents the version 1.4.99 of StarPU. Its contents was last updated on 2026-05-28.

Copyright © 2009-2025 University of Bordeaux, CNRS (LaBRI UMR 5800), Inria


Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled “GNU Free Documentation License”.

1.1 How to cite this document

Please use the following bibtex entry to cite StarPU documentation.

@Manual{StarPU_Handbook,
title = {StarPU Handbook},
note = {Available at \url{https://files.inria.fr/starpu/doc/html/}}
}

1.2 Motivation

The use of specialized hardware, such as accelerators or coprocessors, offers an interesting approach to overcoming the physical limitations faced by processor architects. As a result, many machines are now equipped with one or more accelerators (e.g. a GPU) in addition to the usual processor(s). While considerable efforts has been devoted to offloading computation to such accelerators, very little attention has been paid to portability concerns, and to the ability of heterogeneous accelerators and processors to interact with each other.

StarPU is a runtime system that provides support for heterogeneous multicore architectures. It not only provides a unified view of the computational resources (i.e. CPUs and accelerators simultaneously), but also efficiently maps and executes tasks on a heterogeneous machine while transparently handling low-level issues such as data transfers in a portable manner.

1.3 StarPU in a Nutshell

StarPU is a software tool that enables programmers to take advantage of the computational power of both CPUs and GPUs without having to painstakingly adapt their programs for specific target machines and processing units.

At the heart of StarPU is a runtime support library that handles the scheduling of tasks delivered by applications on heterogeneous CPU/GPU systems. In addition, StarPU provides programming language support through an OpenCL front-end (SOCL OpenCL Extensions).

StarPU's runtime mechanism and programming language extensions are based on a task-based programming model. In this model, applications submit computational tasks, with CPU and/or GPU implementations. StarPU efficiently schedules these tasks and manages the associated data transfers across available CPUs and GPUs. The data that a task operates on is automatically exchanged between accelerators and the main memory, freeing programmers from the intricacies of scheduling and the technical details associated to these transfers.

StarPU is characterized by its adaptability in efficiently scheduling tasks using established algorithms from the literature (Task Scheduling Policies). In addition, it provides the flexibility for scheduling experts, such as compiler or computational library developers, to implement custom scheduling policies in a way that is easily portable (How To Define A New Scheduling Policy).

The rest of this section describes the main concepts used in StarPU.

A 26 minutes video available on the StarPU website (https://starpu.gitlabpages.inria.fr/) introduces these concepts.

In addition, a series of tutorials can be found at https://starpu.gitlabpages.inria.fr/tutorials/.

One of the tutorials is available within a docker image https://starpu.gitlabpages.inria.fr/tutorials/docker/

1.3.1 Codelet and Tasks

One of the key data structures in StarPU is the codelet. A codelet defines a computational kernel that can potentially be implemented on different architectures, including CPUs, CUDA devices, or OpenCL devices.

Another key data structure is the task. Running a StarPU task involves applying a codelet to a dataset using one of the architectures on which the codelet is implemented. Therefore, a task describes the codelet it uses, the data it accesses, and how the data is accessed during the computation (read and/or write). StarPU tasks are asynchronous, i.e submitting a task to StarPU is a non-blocking operation. The task structure can also specify a callback function that will be called when StarPU succesfully completes the task. Additionally, it contains optional fields that the application can use to provide hints to the scheduler, such as priority levels.

By default, task dependencies are inferred from data dependencies (sequential coherency) within StarPU. However, the application has the ability to disable sequential coherency for specific data, and dependencies can also be specifically defined. A task can be uniquely identified by a 64-bit number, chosen by the application, called a tag. Task dependencies can be enforced by callback functions, by submitting other tasks, or by specifying dependencies between tags (which may correspond to tasks that have yet to be submitted).

1.3.2 StarPU Data Management Library

Because StarPU dynamically schedules tasks at runtime, the need for data transfers between different processing units is automatically managed in a just-in-time manner. This automated approach reduces the burden on application programmers to explicitly handle data transfers. Furthemore, to minimize unnecessary transfers, StarPU retains data at the location of its last use, even if changes have been made there. In addition, StarPU allows multiple instances of the same data to coexist on different processing units as long as the data remains unchanegd.

1.4 Application Taskification

We will shortly explain here the concept of "taskifying" an application.

Before transitioning to StarPU, you need to transform your application as follows:

  • Refactor functions into "pure" functions that only use data from their parameters.
  • Create a central main function responsible for calling these pure functions.

Once this restructuring is complete, integrating StarPU or any similar task-based library becomes straightforward. You simply replace function calls with task submissions and take advantage of the library's capabilities.

Chapter A Stencil Application shows how to easily convert an existing application to use StarPU.

1.5 Research Papers

Research papers about StarPU can be found at https://starpu.gitlabpages.inria.fr/publications/.

A good overview is available in the research report at http://hal.archives-ouvertes.fr/inria-00467677.