
GENODE

Operating System Framework 24.05

Foundations
Norman Feske
Contents

1. Introduction 9
1.1. Operating-system framework . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.2. Licensing and commercial support . . . . . . . . . . . . . . . . . . . . . . 16
1.3. About this document . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

I. Foundations 18

2. Getting started 19
2.1. Obtaining the source code . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2. Source-tree structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.3. Using the build system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.4. A simple system scenario . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.5. Hello world . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.5.1. Using a custom source-code repository . . . . . . . . . . . . . . . 29
2.5.2. Source code and build description . . . . . . . . . . . . . . . . . . 29
2.5.3. Building the component . . . . . . . . . . . . . . . . . . . . . . . . 30
2.5.4. Defining a system scenario . . . . . . . . . . . . . . . . . . . . . . 31
2.5.5. Responding to external events . . . . . . . . . . . . . . . . . . . . 33
2.6. Next steps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

3. Architecture 38
3.1. Capability-based security . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.1.1. Capability spaces, object identities, and RPC objects . . . . . . . . 40
3.1.2. Delegation of authority and ownership . . . . . . . . . . . . . . . 41
3.1.3. Capability invocation . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.1.4. Capability delegation through capability invocation . . . . . . . . 45
3.2. Recursive system structure . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.2.1. Component ownership . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.2.2. Tree of components . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.2.3. Services and sessions . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.2.4. Client-server relationship . . . . . . . . . . . . . . . . . . . . . . . 51
3.3. Resource trading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.3.1. Resource assignment . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.3.2. Trading memory between clients and servers . . . . . . . . . . . . 59
3.3.3. Component-local heap partitioning . . . . . . . . . . . . . . . . . 61
3.3.4. Dynamic resource balancing . . . . . . . . . . . . . . . . . . . . . 63
3.4. Core - the root of the component tree . . . . . . . . . . . . . . . . . . . . . 65
3.4.1. Dataspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
3.4.2. Region maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

3.4.3. Access to boot modules (ROM) . . . . . . . . . . . . . . . . . . . . 66
3.4.4. Protection domains (PD) . . . . . . . . . . . . . . . . . . . . . . . . 67
3.4.5. Region-map management (RM) . . . . . . . . . . . . . . . . . . . . 68
3.4.6. Processing-time allocation (CPU) . . . . . . . . . . . . . . . . . . . 69
3.4.7. Access to device resources (IO_MEM, IO_PORT, IRQ) . . . . . . . 69
3.4.8. Logging (LOG) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.4.9. Event tracing (TRACE) . . . . . . . . . . . . . . . . . . . . . . . . . 71
3.5. Component creation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
3.5.1. Obtaining the child’s ROM and PD sessions . . . . . . . . . . . . 72
3.5.2. Constructing the child’s address space . . . . . . . . . . . . . . . . 73
3.5.3. Creating the initial thread . . . . . . . . . . . . . . . . . . . . . . . 75
3.6. Inter-component communication . . . . . . . . . . . . . . . . . . . . . . . 77
3.6.1. Synchronous remote procedure calls (RPC) . . . . . . . . . . . . . 78
3.6.2. Asynchronous notifications . . . . . . . . . . . . . . . . . . . . . . 86
3.6.3. Shared memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
3.6.4. Asynchronous state propagation . . . . . . . . . . . . . . . . . . . 91
3.6.5. Synchronous bulk transfer . . . . . . . . . . . . . . . . . . . . . . . 91
3.6.6. Asynchronous bulk transfer - packet streams . . . . . . . . . . . . 93

4. Components 96
4.1. Device drivers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
4.1.1. Platform driver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
4.1.2. Interrupt handling . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
4.1.3. Direct memory access (DMA) transactions . . . . . . . . . . . . . 100
4.2. Protocol stacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
4.3. Resource multiplexers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
4.4. Runtime environments and applications . . . . . . . . . . . . . . . . . . . 107
4.5. Common session interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . 109
4.5.1. Read-only memory (ROM) . . . . . . . . . . . . . . . . . . . . . . 109
4.5.2. Report . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
4.5.3. Terminal and UART . . . . . . . . . . . . . . . . . . . . . . . . . . 111
4.5.4. Event . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
4.5.5. Capture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
4.5.6. GUI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
4.5.7. Platform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
4.5.8. Pin state and control . . . . . . . . . . . . . . . . . . . . . . . . . . 115
4.5.9. Block . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
4.5.10. Timer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
4.5.11. NIC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
4.5.12. Uplink . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
4.5.13. Audio recording and playing . . . . . . . . . . . . . . . . . . . . . 117
4.5.14. File system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118

4.6. Component configuration . . . . . . . . . . . . . . . . . . . . . . . . . 120
4.6.1. Configuration format . . . . . . . . . . . . . . . . . . . . . . . . . . 120
4.6.2. Server-side policy selection . . . . . . . . . . . . . . . . . . . . . . 120
4.6.3. Dynamic component reconfiguration at runtime . . . . . . . . . . 121
4.7. Component composition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
4.7.1. Sandboxing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
4.7.2. Component-level and OS-level virtualization . . . . . . . . . . . . 124
4.7.3. Interposing individual services . . . . . . . . . . . . . . . . . . . . 127
4.7.4. Ceding the parenthood . . . . . . . . . . . . . . . . . . . . . . . . 129
4.7.5. Publishing and subscribing . . . . . . . . . . . . . . . . . . . . . . 130
4.7.6. Feedback control system . . . . . . . . . . . . . . . . . . . . . . . . 132

5. Development 135
5.1. Source-code repositories . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
5.2. Integration of 3rd-party software . . . . . . . . . . . . . . . . . . . . . . . 138
5.3. Build system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
5.3.1. Build directories . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
5.3.2. Target descriptions . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
5.3.3. Library descriptions . . . . . . . . . . . . . . . . . . . . . . . . . . 142
5.3.4. Platform specifications . . . . . . . . . . . . . . . . . . . . . . . . . 142
5.3.5. Building tools to be executed on the host platform . . . . . . . . . 143
5.3.6. Building 3rd-party software . . . . . . . . . . . . . . . . . . . . . . 144
5.4. System integration and automated testing . . . . . . . . . . . . . . . . . . 145
5.4.1. Run tool . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
5.4.2. Run-tool configuration examples . . . . . . . . . . . . . . . . . . . 146
5.4.3. Meaningful default behaviour . . . . . . . . . . . . . . . . . . . . 148
5.4.4. Run scripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
5.4.5. The run mechanism explained . . . . . . . . . . . . . . . . . . . . 150
5.4.6. Using run scripts to implement integration tests . . . . . . . . . . 151
5.4.7. Automated testing across base platforms . . . . . . . . . . . . . . 151
5.5. Package management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
5.5.1. Nomenclature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
5.5.2. Depot structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
5.5.3. Depot management . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
5.5.4. Automated extraction of archives from the source tree . . . . . . 159
5.5.5. Convenience front-end to the extract, build tools . . . . . . . . . . 160
5.5.6. Accessing depot content from run scripts . . . . . . . . . . . . . . 161
5.6. Static code analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
5.7. Git flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
5.7.1. Master and staging . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
5.7.2. Development practice . . . . . . . . . . . . . . . . . . . . . . . . . 166

6. System configuration 169
6.1. Nested configuration concept . . . . . . . . . . . . . . . . . . . . . . . . . 171
6.2. The init component . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
6.2.1. Session routing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
6.2.2. Resource assignment . . . . . . . . . . . . . . . . . . . . . . . . . . 178
6.2.3. Multiple instantiation of a single ELF binary . . . . . . . . . . . . 179
6.2.4. Session-label rewriting . . . . . . . . . . . . . . . . . . . . . . . . . 179
6.2.5. Nested configuration . . . . . . . . . . . . . . . . . . . . . . . . . . 180
6.2.6. Configuring components from distinct ROM modules . . . . . . 182
6.2.7. Assigning subsystems to CPUs . . . . . . . . . . . . . . . . . . . . 182
6.2.8. Priority support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
6.2.9. Propagation of exit events . . . . . . . . . . . . . . . . . . . . . . . 184
6.2.10. State reporting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
6.2.11. Init verbosity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
6.2.12. Service forwarding . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
6.2.13. Component health monitoring . . . . . . . . . . . . . . . . . . . . 186

7. Under the hood 188
7.1. Component-local startup code and linker scripts . . . . . . . . . . . . . . 189
7.1.1. Linker scripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
7.1.2. Startup code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
7.2. C++ runtime . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
7.2.1. Rationale behind using exceptions . . . . . . . . . . . . . . . . . . 194
7.2.2. Bare-metal C++ runtime . . . . . . . . . . . . . . . . . . . . . . . . 196
7.3. Interaction of core with the underlying kernel . . . . . . . . . . . . . . . . 197
7.3.1. System-image assembly . . . . . . . . . . . . . . . . . . . . . . . . 197
7.3.2. Bootstrapping and allocator setup . . . . . . . . . . . . . . . . . . 198
7.3.3. Kernel-object creation . . . . . . . . . . . . . . . . . . . . . . . . . 199
7.3.4. Page-fault handling . . . . . . . . . . . . . . . . . . . . . . . . . . 200
7.4. Asynchronous notification mechanism . . . . . . . . . . . . . . . . . . . . 202
7.5. Parent-child interaction in detail . . . . . . . . . . . . . . . . . . . . . . . 205
7.6. Dynamic linker . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
7.6.1. Building dynamically-linked programs . . . . . . . . . . . . . . . 207
7.6.2. Startup of dynamically-linked programs . . . . . . . . . . . . . . 207
7.6.3. Address-space management . . . . . . . . . . . . . . . . . . . . . 208
7.7. Execution on bare hardware (base-hw) . . . . . . . . . . . . . . . . . . . . 209
7.7.1. Bootstrapping of base-hw . . . . . . . . . . . . . . . . . . . . . . . 209
7.7.2. Kernel entry and exit . . . . . . . . . . . . . . . . . . . . . . . . . . 210
7.7.3. Interrupt handling and preemptive multi-threading . . . . . . . . 210
7.7.4. Split kernel interface . . . . . . . . . . . . . . . . . . . . . . . . . . 211
7.7.5. Public part of the kernel interface . . . . . . . . . . . . . . . . . . . 211
7.7.6. Core-private part of the kernel interface . . . . . . . . . . . . . . . 212

7.7.7. Scheduler of the base-hw kernel . . . . . . . . . . . . . . . . . . . 213
7.7.8. Sparsely populated core address space . . . . . . . . . . . . . . . 214
7.7.9. Multi-processor support of base-hw . . . . . . . . . . . . . . . . . 214
7.7.10. Asynchronous notifications on base-hw . . . . . . . . . . . . . . . 215
7.8. Execution on the NOVA microhypervisor (base-nova) . . . . . . . . . . . 216
7.8.1. Integration of NOVA with Genode . . . . . . . . . . . . . . . . . . 216
7.8.2. Bootstrapping of a NOVA-based system . . . . . . . . . . . . . . . 216
7.8.3. Log output on modern PC hardware . . . . . . . . . . . . . . . . . 217
7.8.4. Relation of NOVA’s kernel objects to Genode’s core services . . . 218
7.8.5. Page-fault handling on NOVA . . . . . . . . . . . . . . . . . . . . 219
7.8.6. Asynchronous notifications on NOVA . . . . . . . . . . . . . . . . 220
7.8.7. IOMMU support . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
7.8.8. Genode-specific modifications of the NOVA kernel . . . . . . . . 222
7.8.9. Known limitations of NOVA . . . . . . . . . . . . . . . . . . . . . 226

II. Reference 227

8. API 228
8.1. API primitives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
8.1.1. Capability types . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
8.1.2. Sessions and connections . . . . . . . . . . . . . . . . . . . . . . . 232
8.1.3. Dataspace interface . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
8.2. Component execution environment . . . . . . . . . . . . . . . . . . . . . 238
8.2.1. Interface to the component’s environment . . . . . . . . . . . . . 238
8.2.2. Parent interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
8.3. Entrypoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
8.4. Region-map interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
8.5. Session interfaces of the base API . . . . . . . . . . . . . . . . . . . . . . . 259
8.5.1. PD session interface . . . . . . . . . . . . . . . . . . . . . . . . . . 259
8.5.2. ROM session interface . . . . . . . . . . . . . . . . . . . . . . . . . 271
8.5.3. RM session interface . . . . . . . . . . . . . . . . . . . . . . . . . . 276
8.5.4. CPU session interface . . . . . . . . . . . . . . . . . . . . . . . . . 279
8.5.5. IO_MEM session interface . . . . . . . . . . . . . . . . . . . . . . . 291
8.5.6. IO_PORT session interface . . . . . . . . . . . . . . . . . . . . . . 295
8.5.7. IRQ session interface . . . . . . . . . . . . . . . . . . . . . . . . . . 299
8.5.8. LOG session interface . . . . . . . . . . . . . . . . . . . . . . . . . 300
8.6. OS-level session interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . 302
8.6.1. Report session interface . . . . . . . . . . . . . . . . . . . . . . . . 302
8.6.2. Terminal and UART session interfaces . . . . . . . . . . . . . . . . 307
8.6.3. Event session interface . . . . . . . . . . . . . . . . . . . . . . . . . 311
8.6.4. Capture session interface . . . . . . . . . . . . . . . . . . . . . . . 312

8.6.5. GUI session interface . . . . . . . . . . . . . . . . . . . . . . . . . . 315
8.6.6. Capture session interface . . . . . . . . . . . . . . . . . . . . . . . 323
8.6.7. Platform session interface . . . . . . . . . . . . . . . . . . . . . . . 326
8.6.8. Block session interface . . . . . . . . . . . . . . . . . . . . . . . . . 329
8.6.9. Timer session interface . . . . . . . . . . . . . . . . . . . . . . . . . 334
8.6.10. NIC and uplink session interfaces . . . . . . . . . . . . . . . . . . 339
8.6.11. Record and play session interfaces . . . . . . . . . . . . . . . . . . 344
8.6.12. File-system session interface . . . . . . . . . . . . . . . . . . . . . 349
8.6.13. Pin state and control session interfaces . . . . . . . . . . . . . . . 364
8.7. Fundamental types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367
8.7.1. Integer types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367
8.7.2. Exception types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 370
8.7.3. Exception-less error handling . . . . . . . . . . . . . . . . . . . . . 370
8.7.4. C++ supplements . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374
8.8. Data structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 375
8.8.1. List and registry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
8.8.2. Fifo queue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 381
8.8.3. AVL tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 384
8.8.4. Dictionary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 388
8.8.5. ID space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 388
8.8.6. Bit array . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 389
8.9. Object lifetime management . . . . . . . . . . . . . . . . . . . . . . . . . . 392
8.9.1. Thread-safe weak pointers . . . . . . . . . . . . . . . . . . . . . . . 392
8.9.2. Late and repeated object construction . . . . . . . . . . . . . . . . 399
8.10. Physical memory allocation . . . . . . . . . . . . . . . . . . . . . . . . . . 403
8.11. Component-local allocators . . . . . . . . . . . . . . . . . . . . . . . . . . 406
8.11.1. Slab allocator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412
8.11.2. AVL-tree-based best-fit allocator . . . . . . . . . . . . . . . . . . . 416
8.11.3. Heap and sliced heap . . . . . . . . . . . . . . . . . . . . . . . . . 421
8.11.4. Bit allocator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 425
8.12. String processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 428
8.12.1. Basic string operations . . . . . . . . . . . . . . . . . . . . . . . . . 428
8.12.2. Tokenizing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 442
8.12.3. Diagnostic output . . . . . . . . . . . . . . . . . . . . . . . . . . . . 444
8.12.4. Obtaining backtraces . . . . . . . . . . . . . . . . . . . . . . . . . . 452
8.12.5. Unicode handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . 454
8.13. Multi-threading and synchronization . . . . . . . . . . . . . . . . . . . . . 456
8.13.1. Threads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 456
8.13.2. Inter-thread synchronization . . . . . . . . . . . . . . . . . . . . . 462
8.14. Signalling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 466

8.15. Remote procedure calls . . . . . . . . . . . . . . . . . . . . . . . . . . 470
8.15.1. RPC mechanism . . . . . . . . . . . . . . . . . . . . . . . . . . . . 470
8.15.2. Transferable argument types . . . . . . . . . . . . . . . . . . . . . 476
8.15.3. Throwing C++ exceptions across RPC boundaries . . . . . . . . . 477
8.15.4. RPC interface inheritance . . . . . . . . . . . . . . . . . . . . . . . 477
8.15.5. Casting capability types . . . . . . . . . . . . . . . . . . . . . . . . 478
8.15.6. Non-virtual RPC interface functions . . . . . . . . . . . . . . . . . 478
8.15.7. Limitations of the RPC mechanism . . . . . . . . . . . . . . . . . . 478
8.15.8. Root interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 479
8.15.9. Server-side policy handling . . . . . . . . . . . . . . . . . . . . . . 484
8.15.10. Packet stream . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 486
8.16. XML processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 493
8.16.1. XML parsing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 493
8.16.2. XML generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 505
8.16.3. XML-based data models . . . . . . . . . . . . . . . . . . . . . . . . 509
8.17. Component management . . . . . . . . . . . . . . . . . . . . . . . . . . . 513
8.17.1. Shared objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 513
8.17.2. Child management . . . . . . . . . . . . . . . . . . . . . . . . . . . 515
8.17.3. Composition of subsystems . . . . . . . . . . . . . . . . . . . . . . 525
8.18. Utilities for user-level device drivers . . . . . . . . . . . . . . . . . . . . . 528
8.18.1. Register declarations . . . . . . . . . . . . . . . . . . . . . . . . . . 528
8.18.2. Memory-mapped I/O . . . . . . . . . . . . . . . . . . . . . . . . . 530

This work is licensed under the Creative Commons Attribution +
ShareAlike License (CC-BY-SA). To view a copy of the license, visit
http://creativecommons.org/licenses/by-sa/4.0/legalcode

1. Introduction

We are surrounded by operating systems. Each device where multiple software func-
tions are consolidated on a single CPU employs some sort of operating system that
multiplexes the physical CPU for the different functions. In our age when even mun-
dane household items get connected to the internet, it becomes increasingly hard to
find devices where this is not the case.
Our lives and our society depend on an increasing number of such devices. We have
to trust them to fulfill their advertised functionality and to not perform actions that are
against our interests. But are those devices trustworthy? In most cases, nobody knows
that for sure. Even the device vendors are unable to guarantee the absence of vulnera-
bilities or hidden functions. This is not by malice. The employed commodity software
stacks are simply too complex to reason about them. Software is universally known to
be not perfect. So we have seemingly come to accept the common practice where ven-
dors provide a stream of software and firmware updates that fix vulnerabilities once
they become publicly known. Building moderately complex systems that are free from
such issues appears to be unrealistic. Why is that?

Universal truths The past decades have provided us with enough empirical evidence
about the need to be pragmatic about operating-system software. For example, high-
assurance systems are known to be expensive and struggle to scale. Consequently,
under cost pressure, we can live without high assurance. Security is considered as
important. But at the point where the user gets bothered by it, we have to be willing
to compromise. Most users would agree that guaranteed quality of service is desirable.
But to attain good utilization of cheap hardware, we have to sacrifice such guarantees.
Those universal truths have formed our expectations of commodity operating system
software.

Assurance vs. Scalability

In markets where vendors are held liable for the correctness of their products, phys-
ical separation provides the highest assurance for the independence and protection of
different functions from each other. For example, cars contain dozens of electronic con-
trol units (ECU) that can be individually evaluated and certified. However, cost con-
siderations call for the consolidation of multiple functions on a single ECU. At this
point, separation kernels are considered to partition the hardware resources into iso-
lated compartments. Because the isolation is only as strong as the correctness of the
isolation kernel, such kernels must undergo a thorough evaluation. In the face of being
liable, an oversight during the evaluation may have disastrous consequences for the
vendor. Each line of code to be evaluated is an expense. Hence, separation kernels are
minimized to the lowest possible complexity - down to only a few thousand lines of code.

The low complexity of separation kernels comes at the cost of being inflexible. Be-
cause the hardware resources are partitioned at system-integration time, dynamic
workloads are hard to accommodate. The rigidity of the approach stands in the way
whenever the number of partitions, the assignment of resources to partitions, and the
software running in the partitions have to be changed at runtime.
Even though the high level of assurance as provided by separation kernels is gener-
ally desirable, flexibility and the support for dynamic workloads is even more so. For
this reason, commodity general-purpose OSes find their way into all kinds of devices
except into those where vendors are held liable for the correctness of their products.
The former include not only household appliances, network gear, consumer electron-
ics, mobile devices, and certain comfort functions in vehicles but also the IT equipment
of governments, smart-city appliances, and surveillance systems. To innovate quickly,
vendors accept to make their products reliant on highly complex OS foundations. The
trusted computing base (TCB) of all commodity general-purpose operating systems is
measured in millions of lines of code. It comprises all the software components that
must be trusted to not violate the interests of the user. This includes the kernel, the
software executed at the system start, all background services with system privileges,
and the actual application software. In contrast to separation kernels, any attempt to
assess the correct functioning of the involved code is shallow at best. The trustworthi-
ness of such a system remains uncertain to vendors and users alike. The uncertainty
that comes with the staggering TCB complexity becomes a problem when such systems
get connected to the internet: Is my internet router under control of a bot net? Is my
mobile phone remotely manipulated to wiretap me? Is my TV spying on me when
switched off? Are my sensitive documents stored on my computer prone to leakage?
Faithfully, we hope the answers to those questions to be no. But because it is impossible
to reason about the trusted computing base of the employed operating systems, there
are no answers.
Apparently, the lack of assurance must be the price to pay for the accommodation of
feature-rich dynamic workloads.

Security vs. Ease of use

The ease of use of software systems is often perceived as diametrical to security.
There are countless mundane examples: Remembering passwords of sufficient strength
is annoying. Even more so is picking a dedicated password for each different pur-
pose. Hence, users tend to become lax about choosing and updating passwords. An-
other example is OpenPGP. Because setting it up for secure email communication is
perceived as complicated, business-sensitive information is routinely exchanged unen-
crypted. Yet another example is the lack of adoption of the security frameworks such as
SELinux. Even though they are readily available on commodity OS distributions, com-
prehending and defining security policies is considered as a black art, which is better
left to experts.
How should an operating system strike the balance between being unusably secure
and user-friendly insecure?

Utilization Accountability

Current-generation general-purpose OSes are designed to utilize physical resources
like memory, network bandwidth, computation time, and power in the best way pos-
sible. The common approach to maximize utilization is the over-provisioning of re-
sources to processes. The OS kernel pretends the availability of an unlimited amount of
resources to each process in the hope that processes will attempt to allocate and utilize
as many resources as possible. Its holistic view on all processes and physical resources
puts the kernel in the ideal position to balance resources between processes. For ex-
ample, if physical memory becomes scarce, the kernel is able to uphold the illusion of
unlimited memory by temporarily swapping the memory content of inactive processes
to disk.
However, the optimization for high utilization comes at the price of indeterminism
and effectively makes modern commodity OSes defenseless against denial-of-service
attacks driven by applications. For example, because the network load is not accounted
to individual network-using applications, a misbehaving network-heavy application is
able to degrade the performance of other network applications. As another example,
any GUI application is able to indirectly cause a huge memory consumption at the GUI
server by creating an unbounded number of windows. If the system eventually runs out of
memory, the kernel will identify the GUI server as the offender.
With the help of complex heuristics like process-behaviour-aware schedulers, the
kernel tries hard to uphold the illusion of unlimited resources when under pressure.
But since the physical resources are ultimately limited, this abstraction is destined to
break sooner or later. If it breaks, the consequences may be fatal: In an out-of-memory
situation, the last resort of the kernel is to rampage and kill arbitrary processes.
Can an operating system achieve high resource utilization while still being depend-
able?

Clean-slate approach Surprisingly, by disregarding the practical considerations of
existing commodity operating systems, the contradictions outlined above can be re-
solved by a combination of the following key techniques:

Microkernels as a middle ground between separation kernels and monolithic kernels
are able to accommodate dynamic workloads without unreasonably inflating the
trusted computing base.

Figure 1: Application-specific trusted computing base

Capability-based security supposedly makes security easy to use by providing an
intuitive way to manage authority without the need for an all-encompassing and
complex global system policy.

Kernelization of software components aids the deconstruction of complex software
into low-complexity security-sensitive parts and high-complexity parts. The lat-
ter no longer need to be considered as part of the trusted computing base.

Virtualization can bridge the gap between applications that expect current-generation
OSes and a new operating-system design.

The management of budgets within hierarchical organizations shows how limited
resources can be utilized and still be properly accounted for.

None of those techniques is new by any means. However, they have never been used
in combination to compose a general-purpose operating system. This is where Genode comes
into the picture.

Application-specific trusted computing base A Genode system is structured as a
tree of components where each component (except for the root of the tree) is owned
by its parent. The notion of ownership means both responsibility and control. Being
responsible for its children, the parent has to explicitly provide the resources needed
by its children out of its own resources. It is also responsible to acquaint children with
one another and the outside world. In return, the parent retains ultimate control over

each of its children. As the owner of a child, it has ultimate power over the child’s envi-
ronment, the child’s view of the system, and the lifetime of the child. Each child can, in
turn, have children, which yields a recursive system structure. Figure 1 illustrates the
idea.
At the root of the tree, there is a low-complexity microkernel that is always part of
the TCB. The kernel is solely responsible to provide protection domains, threads of
execution, and the controlled communication between protection domains. All other
system functions such as device drivers, network stacks, file systems, runtime environ-
ments, virtual machines, security functions, and resource multiplexers are realized as
components within the tree.
The rigid organizational structure enables the system designer to tailor the trusted
computing base for each component individually. For example, by hosting a crypto-
graphic function near the root of the tree, the function is exposed only to the micro-
kernel but not to complex drivers and protocol stacks that may exist in other branches
of the tree. Figure 1 illustrates the TCB of one leaf node. The TCB of the yellow compo-
nent comprises the chain of parents and grandparents because it is directly or indirectly
owned by them. Furthermore, the TCB comprises a service used by the component. But
the right branch of the tree is unrelated to the component and can thereby be disre-
garded from the yellow component’s TCB.

Trading and tracking of physical resources Unlike traditional operating systems,
Genode does not abstract from physical resources. Instead, each component has a bud-
get of physical resources assigned by its parent. The budget allows the component to
use the resources within the budget or to assign parts of its budget to its children. The
usage and assignment of budgets is a deliberate decision by each component rather
than a global policy of the OS kernel. Components are able to trade resource bud-
gets along the branches of the tree. This way, components can offer services to other
components without consuming their own resources. The dynamic trading of resource
budgets between components allows for a high resource utilization without the over-
provisioning of resources. Consequently, the system behavior remains deterministic at
all times.


1.1. Operating-system framework

The Genode OS framework is the implementation of the Genode architecture. It is
a tool kit for building highly secure special-purpose operating systems. It scales from
embedded systems with as little as 4 MB of memory to highly dynamic general-purpose
workloads.
The system is based on a recursive structure. Each program is executed in a dedicated
sandbox and gets granted only those access rights and resources that are required to
fulfill its specific purpose. Programs can create and manage sub-sandboxes out of their
own resources, thereby forming hierarchies where policies can be applied at each level.
The framework provides mechanisms to let programs communicate with each other
and trade their resources, but only in strictly-defined manners. Thanks to this rigid
regime, the attack surface of security-critical functions can be reduced by orders of
magnitude compared to contemporary operating systems.
The framework aligns the construction principles of microkernels with Unix philos-
ophy. In line with Unix philosophy, Genode is a collection of small building blocks,
out of which sophisticated systems can be composed. But unlike Unix, those building
blocks include not only applications but also all classical OS functionalities including
kernels, device drivers, file systems, and protocol stacks.

CPU architectures
Genode supports the x86 (32 and 64 bit), ARM (32 and 64 bit), and RISC-V (64
bit) CPU architectures. On x86, modern architectural features such as IOMMUs
and hardware virtualization can be utilized. On ARM, Genode is able to take
advantage of TrustZone and virtualization technology.

Kernels
Genode can be deployed on a variety of different kernels including most mem-
bers of the L4 family (NOVA, seL4, Fiasco.OC, OKL4 v2.1, L4ka::Pistachio, L4/Fi-
asco). Furthermore, it can be used on top of the Linux kernel to attain rapid
development-test cycles. Additionally, the framework is accompanied by a custom
microkernel that has been specifically developed for
Genode and thereby further reduces the complexity of the trusted computing base
compared to other kernels.

Virtualization
Genode supports virtualization at different levels:
• On NOVA, faithful virtualization via VirtualBox allows the execution of un-
modified guest operating systems as Genode subsystems. Alternatively, the
Seoul virtual machine monitor can be used to run unmodified Linux-based
guest OSes.

• On ARM, Genode can be used as TrustZone monitor, or as a virtual machine
monitor that facilitates ARM’s virtualization extensions.

Building blocks
There exist hundreds of ready-to-use components such as
• Device drivers for most common PC peripherals including networking, stor-
age, display, USB, PS/2, Intel wireless, and audio output.
• Device drivers for a variety of ARM-based SoCs, in particular the NXP i.MX
family.
• A GUI stack including a low-complexity GUI server, window management,
and widget toolkits such as Qt5.
• Networking components such as TCP/IP stacks and packet-level network
services.
• Applications based on the POSIX interface, including GNU coreutils, bash,
GCC, binutils, and findutils.


1.2. Licensing and commercial support

Genode is commercially supported by the German company Genode Labs GmbH,
which offers trainings, development work under contract, developer support, and
commercial licensing:

Genode Labs website
https://www.genode-labs.com

The framework is available under two flavours of licenses: an open-source license and
commercial licensing. The primary license used for the distribution of the Genode OS
framework is the GNU Affero General Public License Version 3 (AGPLv3). In short, the
AGPLv3 grants everybody the rights to

• Use the Genode OS framework without paying any license fee,

• Freely distribute the software,

• Modify the source code and distribute modified versions of the software.

In return, the AGPLv3 requires any modifications and derived work to be published
under the same or a compatible license. For the full license text, refer to

GNU Affero General Public License Version 3
https://genode.org/about/LICENSE

Note that the official license text accompanies the AGPLv3 with an additional clause
that clarifies our consent to link Genode with all commonly established Open-Source
licenses.
For applications that require more permissive licensing conditions than granted by
the AGPLv3, Genode Labs offers the option to commercially license the technology
upon request. Please write to [email protected].


1.3. About this document

This document is split into two parts. Whereas the first part contains the textual de-
scription of the architectural and practical foundations, the second part serves as a ref-
erence of the framework’s programming interface. This allows the first part to stay
largely clear from implementation details. Cross-references between both parts are
used to connect the conceptual level with the implementation level.
Chapter 2 provides engineering-minded readers with a practical jump start to explore
the code and experiment with it. These practical steps are good to get a first impression
and will hopefully provide the motivation to engage with the core part of the book,
which are Chapters 3 and 4.
Chapter 3 introduces Genode’s high-level architecture by presenting the concept of
capability-based security, the resource-trading mechanism, the root of the component
tree, and the ways how components can collaborate without mutually trusting each
other. Chapter 4 narrows the view on different types of components, namely device
drivers, protocol stacks, resource multiplexers, runtime environments, and applica-
tions. The remaining part of the chapter focuses on the composition of components.
Chapter 5 substantiates Chapter 2 with all information needed to develop meaning-
ful components. It covers the integration of 3rd-party software, the build system, the
tool kit for automated testing, and the Git work flow of the regular Genode developers.
Chapter 6 addresses the system integration. After presenting Genode’s holistic con-
figuration concept, it details the usage of the init component, which bootstraps the static
part of each Genode system.
Chapter 7 closes the first part with a look behind the scenes. It provides the de-
tails and the rationales behind technical decisions, explains the startup procedure of
components, shows how Genode’s concepts are mapped to kernel mechanisms, and
documents known limitations.
The second part of the document gives an overview of the framework’s C++ pro-
gramming interface. The content is partially derived from the actual source code and
supplemented with additional background information.

Acknowledgements and feedback This document greatly benefited from the feed-
back of the community at the Genode mailing list, the wonderful team at Genode Labs,
the thorough review by Adrian-Ken Rueegsegger and Reto Buerki, and several anony-
mous reviewers. Thanks to everyone who contributed to the effort, be it in the form of
reviews, comments, moral support, or through projects commissioned to Genode Labs.
That said, feedback from you as the reader of the document is always welcome. If you
identify points you would like to see improved or if you spot grammatical errors, please
do not hesitate to contact the author by writing to [email protected] or to
post your feedback to the mailing list https://genode.org/community/mailing-lists.

Part I.
Foundations

2. Getting started

Genode can be approached from two different angles: as an operating-system architec-
ture or as a practical tool kit. This chapter assists you with exploring Genode as the
latter. After introducing the recommended development environment, it guides you
through the steps needed to obtain the source code (Section 2.1), to use the tool chain
(Section 2.3), to test-drive system scenarios (Section 2.4), and to create your first custom
component from scratch (Section 2.5).

Recommended development environment Genode is regularly used and devel-
oped on GNU/Linux. It is recommended to use the latest long-term support (LTS)
version of Ubuntu. Make sure that your installation satisfies the following require-
ments:

• GNU Make version 3.81 (or newer) needed by the build system,

• libsdl2-dev, libdrm-dev, and libgbm-dev needed to run interactive system scenarios
directly on Linux,

• tclsh and expect needed by test-automation and work-flow tools,

• xmllint for validating configurations,

• qemu, xorriso, sgdisk, and e2tools needed for running system scenarios on non-
Linux platforms via the Qemu emulator.

For using the entire collection of ported 3rd-party software, the following packages
should be installed additionally: byacc, autoconf2.64, autogen, bison, flex, g++, git, gperf,
libxml2-utils, subversion, and xsltproc.

Seeking help The best way to get assistance while exploring Genode is to consult
the mailing list, which is the primary communication medium of regular users and
developers alike. Please feel welcome to join in!

Mailing Lists
https://genode.org/community/mailing-lists

If you encounter a new bug, ambiguous documentation, or a missing feature, please
consider opening a corresponding issue at the issue tracker:

Issue tracker
https://github.com/genodelabs/genode/issues


2.1. Obtaining the source code

The centerpiece of Genode is the source code found within the official Git repository:

Source code at GitHub
https://github.com/genodelabs/genode

To obtain the source code, clone the Git repository:

git clone https://github.com/genodelabs/genode.git

After cloning, you can find the source code within the genode/ directory. In the fol-
lowing, we refer to this directory as <genode-dir>.
Git checks out the most recent Genode master branch commit, so let’s now switch to
the version used in this manual:

cd <genode-dir>
git checkout -b 24.05 24.05


2.2. Source-tree structure

Top-level directory At the root of the directory tree, there is the following content:

doc/ Documentation in plain text format, including the release notes of all versions.
Practical hint: The comprehensive release notes conserve most of the hands-on
documentation aggregated over the lifetime of the project. When curious about a
certain topic, it is often worthwhile to “grep” for the topic within the release notes
to get a starting point for investigation.
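
For example, a search across the release notes could look like the following
(the search term is merely illustrative):

grep -ril "tracing" <genode-dir>/doc/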

tool/ Tools and scripts to support the build system, various boot loaders, the tool
chain, and the management of 3rd-party source code. Please find more infor-
mation in the README file contained in the subdirectory.

repos/ The so-called source-code repositories, which contain the actual source code
of the framework components. The source code is not organized within a single
source tree but multiple trees. Each tree is called a source-code repository and has
the same principle structure. At build time, a set of source-code repositories can
be selected to be incorporated into the build process. Thereby, the source-code
repositories provide a coarse-grained modularization of the framework.

Repositories overview The <genode-dir>/repos/ directory contains the following
source-code repositories.

base/
The fundamental framework interfaces as well as the platform-agnostic parts of
the core component (Section 3.4).

base-<platform>/ Platform-specific supplements of the base/ repository where
<platform> corresponds to one of the following:
linux
Linux kernel (both x86_32 and x86_64).
nova
NOVA microhypervisor. More information about the NOVA platform is pro-
vided by Section 7.8.
hw
The hw platform allows the execution of Genode on bare hardware without
the need for a third-party kernel. The kernel functionality is included in the
core component. It supports the 32-bit ARM, 64-bit ARM, 64-bit x86, and
64-bit RISC-V CPU architectures. More information about the hw platform
can be found in Section 7.7.


sel4
The seL4 microkernel combines the L4-kernel philosophy with formal veri-
fication. The support for this kernel is experimental.
foc
Fiasco.OC is a modernized version of the L4/Fiasco microkernel with a com-
pletely revised kernel interface fostering capability-based security.
okl4
OKL4 kernel originally developed at Open-Kernel-Labs.
pistachio
L4ka::Pistachio kernel developed at University of Karlsruhe.
fiasco
L4/Fiasco kernel originally developed at Technische Universität Dresden.

os/
OS components such as the init component, device drivers, and basic system ser-
vices.

demo/
Various services and applications used for demonstration purposes, for example
the graphical application launcher and the tutorial browser described in Section
2.4 can be found here.

hello_tutorial/
Tutorial for creating a simple client-server scenario. This repository includes doc-
umentation and the complete source code.

libports/
Ports of popular open-source libraries, most importantly the C library. Among the
3rd-party libraries are the standard C++ library, Qt5, FreeType, ncurses, libUSB,
curl, lwip, and Mesa.

dde_linux/
Device-driver environment for executing Linux kernel subsystems as user-level
components. Besides hosting ports of generic Linux kernel subsystems such as
the TCP/IP stack, it is the basis for many board-specific drivers hosted in other
repositories.

dde_ipxe/
Device-driver environment for executing network drivers of the iPXE project.


dde_bsd/
Device-driver environment for audio drivers ported from OpenBSD.

dde_rump/
Port of rump kernels, which are used to execute subsystems of the NetBSD ker-
nel as user-level components. The repository contains a server that uses a rump
kernel to provide various NetBSD file systems.

pc/
Device drivers for x86 PC hardware. The pc repository depends on the dde_linux
repository because drivers such as the USB stack, the Intel wireless stack, or the
Intel graphics driver are based on the framework infrastructure of dde_linux/.

ports/
Ports of 3rd-party applications.

gems/
Components that use both native Genode interfaces as well as features of other
high-level repositories, in particular shared libraries provided by libports/.

In addition to the repositories hosted in Genode’s main source tree, there exist a num-
ber of external repositories that extend the framework with optional features such as
additional components and board support for various hardware platforms.

Additional repositories maintained by Genode Labs
https://github.com/orgs/genodelabs/repositories


2.3. Using the build system

Genode relies on a custom tool chain, which can be downloaded at the following web-
site:

Tool chain
https://genode.org/download/tool-chain

Build directory The build system never touches the source tree but generates object
files, libraries, and programs in a dedicated build directory. We do not have a build
directory yet. For a quick start, let us create one using the following command:

cd <genode-dir>
./tool/create_builddir x86_64

To follow the subsequent steps of test driving the Linux version of Genode, the specified plat-
form argument should match your host OS installation. If you are using a 32-bit installation,
specify x86_32 instead of x86_64.
The command creates a new build directory at build/x86_64.

Build configuration Before using the build directory, it is recommended to revisit
and possibly adjust the build configuration, which is located in the etc/ subdirectory
of the build directory, e. g., build/x86_64/etc/. The build.conf file contains global build
parameters, in particular the selection of source-code repositories to be incorporated,
the kernel to use (KERNEL), and the targeted board (BOARD). It is also a suitable place for
adding global build options. For example, for enabling GNU make to use 4 CPU cores,
use the following line in the build.conf file:

MAKE += -j4

You may also consider speeding up your workflow by enabling the use of the com-
piler cache (ccache) using the following line:

CCACHE := yes

Building components The recipe for building a component has the form of a tar-
get.mk file within the src/ directory of one of the source-code repositories. For example,
the target.mk file of the init component is located at <genode-dir>/repos/os/src/init/tar-
get.mk. To build the component, execute the following command from within the build
directory:


make init

The argument “init” refers to the path relative to the src/ subdirectory. The build sys-
tem determines and builds all targets found under this path in all source-code repos-
itories. When the build is finished, the resulting executable binary can be found in
a subdirectory that matches the target’s path. Additionally, the build system installs
a symbolic link in the bin/ subdirectory that points to the executable binary. It also in-
stalls symbolic links to the debug version of the executable binary along with its symbol
information at the bin/debug/ subdirectory.
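
For orientation, a target.mk file is a small makefile snippet that declares the
target name, its source files, and the libraries it uses. A minimal sketch (the
hello name and source file are hypothetical; Section 5.3.2 describes the format
in detail):

TARGET = hello
SRC_CC = main.cc
LIBS   = base
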
If the specified path contains multiple target.mk files in different subdirectories, the
build system builds all of them. For example, the following command builds all targets
found within one of the <repo>/src/drivers/ subdirectories:

make drivers

Static libraries are implicitly built whenever needed by a dependent target. Shared
libraries can be built by specifying lib/<name> as target where <name> corresponds
to the name of the library. For example, the following command builds the vfs library
from the library description found at repos/os/lib/mk/vfs.mk. The result can be found
within the build directory at var/libcache/vfs/.

make lib/vfs
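
Library-description files are makefile snippets as well. As a rough sketch of a
shared-library description (the names are hypothetical; Section 5.3.3 covers the
actual format):

SHARED_LIB = yes
SRC_CC     = lib.cc
LIBS       = base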

Furthermore, it is possible to specify multiple targets at once. The following com-
mand builds the init component, the nitpicker GUI server component, and the vfs
library:

make init server/nitpicker lib/vfs


2.4. A simple system scenario

The build directory offers much more than an environment for building components.
It supports the automation of system-integration work flows, which typically include
the following steps:

1. Building a set of components,

2. Configuring the static part of a system scenario,

3. Assembling a boot directory with all ingredients needed by the scenario,

4. Creating a boot image that can be loaded onto the target platform,

5. Booting the target platform with the boot image,

6. Validating the behavior of the scenario.

The recipe for such a sequence of steps can be expressed in the form of a so-called run
script. Each run script represents a system scenario and entails all information required
to reproduce the scenario. Run scripts can reside within the run/ subdirectory of any
source-code repository.
Genode comes with a ready-to-use run script showcasing a simple graphical demo
scenario. The run script is located at <genode-dir>/repos/os/run/demo.run. It leverages
Genode’s package-management tools for assembling a static system image. The pack-
age management is explained in detail in Section 5.5. For now, we can skip the details
by instructing the build system to automatically create packages and update their ver-
sion numbers for us. Uncomment the following line in the build/x86_64/build.conf file:

RUN_OPT += --depot-auto-update

In contrast to the building of individual components as described in the previous
section, the integration of a complete system scenario requires us to select a particular
OS kernel to use. The following command instructs the build system to build, integrate,
and start the “run/demo” scenario on the Linux kernel:

make run/demo KERNEL=linux BOARD=linux

The command prompts the build system to look up a run script called demo.run in
all repositories listed in etc/build.conf. It will eventually find the run script within the
os/ repository. After completing the build of all components needed, the command
will then automatically start the scenario. Because the build directory was created for
the x86_64 platform and we specified “linux” as KERNEL, the scenario will be executed


directly on the host system where each Genode component resides in a distinct Linux
process. To explore the scenario, follow the instructions given by the graphical tutorial
browser.
The terminal where the make run/demo command was issued displays the log out-
put of the Genode system. To cancel the execution, hit control-c in the terminal.

Targeting a microkernel Whereas the ability to run system scenarios on top of Linux
allows for the convenient and rapid development of components and protocols, Gen-
ode is primarily designed for the use of microkernels. The choice of the microkernel
to use is up to the user of the framework and may depend on various factors like the
feature set, the supported hardware architectures, the license, or the development com-
munity. To execute the demo scenario directly on the NOVA microhypervisor, the fol-
lowing preparatory steps are needed:

1. Download the 3rd-party source code of the NOVA microhypervisor

<genode-dir>/tool/ports/prepare_port nova

The prepare_port tool downloads the source code of NOVA to a subdirectory at
<genode-dir>/contrib/nova-<hash>/ where <hash> uniquely refers to the prepared
version of NOVA.

2. On real hardware, the scenario needs a framebuffer driver. The VESA driver relies
on a 3rd-party x86-emulation library in order to execute the VESA BIOS code.
Download the 3rd-party source code of the x86emu library:

<genode-dir>/tool/ports/prepare_port x86emu

The source code will be downloaded to <genode-dir>/contrib/x86emu-<hash>/.

3. To handle USB devices, a device-driver environment for executing a Linux kernel
subsystem as a user-level component is used. Download the 3rd-party source
code of the dde_linux repository:

<genode-dir>/tool/ports/prepare_port linux jitterentropy

The source code will be downloaded to <genode-dir>/contrib/linux-<hash>/ and
<genode-dir>/contrib/jitterentropy-<hash>/, respectively.

4. To boot the scenario as an operating system on a PC, a boot loader is needed. The
build process produces a bootable disk or ISO image that includes the GRUB2
boot loader as well as a working boot-loader configuration. Download the boot
loader as ingredient for the image-creation step.


<genode-dir>/tool/ports/prepare_port grub2

5. Since NOVA supports the x86_64 architecture of our build directory, we can keep
using the existing build directory that we just used for Linux. However, apart
from enabling the parallelization of the build process as mentioned in Section 2.3,
we need to incorporate the libports, dde_linux, and pc source-code repositories into
the build process by uncommenting the corresponding lines in the configuration.
Otherwise the build system would fail to build the VESA and USB HID drivers,
whose code resides within the libports/, dde_linux/, and pc/ repositories.

With those preparations in place, issue the execution of the demo run script from within
the build directory:

make run/demo KERNEL=nova BOARD=pc

This time, an instance of Qemu will be started to execute the demo scenario. The
Qemu command-line arguments appear in the log output. Depending on the Qemu
version used, one may need to tweak some of those arguments, for example by re-
moving the -display sdl option in your etc/build.conf file.
As suggested by the arguments, the scenario is supplied to Qemu as an ISO image re-
siding at var/run/demo.iso. The ISO image can not only be used with Qemu but also with
a real machine. For example, creating a bootable USB stick with the system scenario is
as simple as writing the ISO image onto a USB stick:

sudo dd if=var/run/demo.iso of=/dev/<usb-device> bs=8M conv=fsync

Note that <usb-device> refers to the device node of a USB stick. It can be determined
using the dmesg command after plugging in the USB stick. For booting from the USB
stick, you may need to adjust the BIOS settings of the test machine accordingly.


2.5. Hello world

This section introduces the steps needed to create and execute a simple custom compo-
nent that prints a hello-world message.

2.5.1. Using a custom source-code repository

In principle, it would be possible to add a new component to one of the existing
source-code repositories found at <genode-dir>/repos/. However, unless the component
is meant to be incorporated into upstream development of the Genode project, it is gen-
erally recommended to keep custom code separate from Genode’s code base. This eases
future updates to new versions of Genode and allows you to pick a revision-control
system of your choice.
The new repository must appear within the <genode-dir>/repos/ directory. This can
be achieved by either hosting it as a subdirectory or by creating a symbolic link that
points to an arbitrary location of your choice. For now, let us host a new source-code
repository called “lab” directly within the repos/ directory.

cd <genode-dir>
mkdir repos/lab

The lab repository will contain the source code and build rules for a single component
as well as a run script for executing the component within Genode. Component source
code resides in a src/ subdirectory. By convention, the src/ directory contains further
subdirectories for hosting different types of components, in particular server (services
and protocol stacks), drivers (hardware-device drivers), and app (applications). For the
hello-world component, an appropriate location would be src/app/hello/:

mkdir -p repos/lab/src/app/hello

2.5.2. Source code and build description

The hello/ directory contains both the source code and the build description of the com-
ponent. The main part of each component typically resides in a file called main.cc.
Hence, for a hello-world program, we have to create the repos/lab/src/app/hello/main.cc
file with the following content:


#include <base/component.h>
#include <base/log.h>

void Component::construct(Genode::Env &)
{
Genode::log("Hello world");
}

The base/component.h header contains the interface each component must implement.
The construct function is called by the component’s execution environment to initial-
ize the component. The interface to the execution environment is passed as argument.
This interface allows the application code to interact with the outside world. The sim-
ple example above merely produces a log message. The log function is defined in the
base/log.h header.
The component does not exit after the construct function returns. Instead, it be-
comes ready to respond to requests or signals originating from other components. The
example above does not interact with other components though. Hence, it will just
keep waiting infinitely.
Please note that there exists a recommended coding style for genuine Genode com-
ponents. If you consider submitting your work to the upstream development of the
project, please pay attention to these common guidelines.

Coding-style guidelines
https://genode.org/documentation/developer-resources/coding_style

The source file main.cc is accompanied by a build-description file called target.mk. It
contains the declarations for the source files, the libraries used by the component, and
the name of the component. Create the file repos/lab/src/app/hello/target.mk with the fol-
lowing content:

TARGET = hello
SRC_CC = main.cc
LIBS += base

2.5.3. Building the component

With the build-description file in place, it is time to build the new component, for
example from within the x86_64 build directory as created in Section 2.4. To aid the
build system to find the component, we have to extend the build configuration <build-
dir>/etc/build.conf by appending the following line:


REPOSITORIES += $(GENODE_DIR)/repos/lab

With this line in place, the build system will consider our custom source-code repository.
To build the component, issue the following command:

make app/hello

This step compiles the main.cc file and links the executable ELF binary called “hello”.
The result can be found in the <build-dir>/app/hello/ subdirectory.

2.5.4. Defining a system scenario

For testing the component, we need to define a system scenario that incorporates the
component. As mentioned in Section 2.4, such a description has the form of a run
script. To equip the lab repository with a run script, we first need to create a lab/run/
subdirectory:

mkdir <genode-dir>/repos/lab/run

Within this directory, we create the file <genode-dir>/repos/lab/run/hello.run with the
following content:

build { core init lib/ld app/hello }

create_boot_directory
install_config {
<config>
<parent-provides>
<service name="LOG"/>
<service name="PD"/>
<service name="CPU"/>
<service name="ROM"/>
</parent-provides>
<default-route>
<any-service> <parent/> </any-service>
</default-route>
<default caps="100"/>
<start name="hello">
<resource name="RAM" quantum="10M"/>
</start>
</config>
}
build_boot_image [build_artifacts]
append qemu_args "-nographic -m 64"
run_genode_until {Hello world.*\n} 10


This run script performs the following steps:

1. It builds the components core, init, the dynamic linker lib/ld, and app/hello.

2. It creates a fresh boot directory at <build-dir>/var/run/hello. This directory contains
all files that will end up in the final boot image.

3. It creates a configuration for the init component. The configuration starts the hello
component as the only child of init. Session requests originating from the hello
component will always be directed towards the parent of init, which is core. The
<default> node declares that each component may consume up to 100 capabili-
ties.

4. It assembles a boot image with the executable ELF binaries produced by the build
step. The binaries are picked up from the <build-dir>/bin/ subdirectory.

5. It instructs Qemu (if used) to disable the graphical output.

6. It triggers the execution of the system scenario and watches the log output for
the given regular expression. The execution ends when the log output appears or
after a timeout of 10 seconds.

The run script can be executed from within the build directory via the command:

make run/hello KERNEL=linux BOARD=linux

After the boot output of the used kernel, the scenario will produce the following
output:

[init -> hello] Hello world

Run script execution successful.

The label within the brackets at the start of each line identifies the component
from which the message originated. The final line is printed by the run tool after it
successfully matched the log output against the regular expression specified to the
run_genode_until command.
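
Assuming the preparatory steps for NOVA described in Section 2.4 are in place, the
same run script can, in principle, be executed on a microkernel as well by selecting
the corresponding kernel and board:

make run/hello KERNEL=nova BOARD=pc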


2.5.5. Responding to external events

Most non-trivial components respond to external events such as user input, timer
events, device interrupts, the arrival of new data, or RPC requests issued by other
components.
The following example presents the typical skeleton of such a component. The
construct function merely creates an object representing the application as a static
local variable. The actual component code lives inside the Main class.

#include <base/component.h>
#include <base/log.h>
#include <timer_session/connection.h>

namespace Hello { struct Main; }

struct Hello::Main
{
Genode::Env &_env;

Timer::Connection _timer { _env };

void _handle_timeout()
{
Genode::log("woke up at ", _timer.elapsed_ms(), " ms");
}

Genode::Signal_handler<Main> _timeout_handler {
_env.ep(), *this, &Main::_handle_timeout };

Main(Genode::Env &env) : _env(env)
{
_timer.sigh(_timeout_handler);
_timer.trigger_periodic(1000*1000);
Genode::log("component constructed");
}
};

void Component::construct(Genode::Env &env)
{
static Hello::Main main(env);
}

First, note the Hello namespace. As a good practice, component code typically lives
in a namespace. The component-specific namespace may incorporate other namespaces
- in particular the Genode namespace - without polluting the global scope.


The constructor of the Main object takes the Genode environment as argument and
stores it as the reference member variable _env. The member variable is prefixed with
an underscore to highlight the fact that it is private to the Main class. In principle, Main
could be a class with _env being part of the private section, but as Main is the top-
level class of the component that is not accessed by any other parts of the program, we
use a struct for brevity while still maintaining the convention to prefix private mem-
bers with an underscore character. When spotting the use of such a prefixed variable
in the code, we immediately see that it is part of the code’s object context, not being an
argument or a local variable.
By aggregating a Timer::Connection as a member variable, the Main object requests
a session to a timer service at construction time. As this session request requires an in-
teraction with the outside world, the _env needs to be passed to the _timer constructor.
In order to respond to events from the timer, the Main class hosts a _timeout_handler
object. Its constructor arguments refer to the object and a method to be executed when-
ever an event occurs. The timeout handler object is registered at the _timer as the
recipient of timeout events via the sigh method. Finally, the timer is instructed to
trigger timeout events at a rate of 1 second.
The following remarks are worth noting:

• The programming style emphasizes what the component is rather than what the
component does.

• The component does not perform any dynamic memory allocation.

• When called, the _handle_timeout method has its context (the Main object) read-
ily available, which makes the application of internal state changes as response to
external events very natural.

• Neither the construct function nor the Main::_handle_timeout method blocks
for external events.

• The component does not receive any indication about the number of occurred
events, just the fact that at least one event occurred. The _handle_timeout code
explicitly requests the current time from the timer driver via the synchronous RPC
call elapsed_ms. A variant that illustrates this point is sketched below.
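
As an illustration of the last remark, the following hypothetical variant of the Main
class keeps component-local state across activations by counting the received timeout
signals. This is merely a sketch: the _count member and the log message are illustrative
additions, whereas the timer and signal-handler APIs are used exactly as in the skeleton
above.

#include <base/component.h>
#include <base/log.h>
#include <timer_session/connection.h>

namespace Hello { struct Main; }

struct Hello::Main
{
    Genode::Env &_env;

    Timer::Connection _timer { _env };

    unsigned _count = 0;  /* state updated in response to external events */

    void _handle_timeout()
    {
        _count++;
        Genode::log("timeout ", _count, " at ", _timer.elapsed_ms(), " ms");
    }

    Genode::Signal_handler<Main> _timeout_handler {
        _env.ep(), *this, &Main::_handle_timeout };

    Main(Genode::Env &env) : _env(env)
    {
        _timer.sigh(_timeout_handler);
        _timer.trigger_periodic(1000*1000);
    }
};

void Component::construct(Genode::Env &env) { static Hello::Main main(env); }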

To execute the new version of the component, we need to slightly modify the run script.


build { core init lib/ld timer app/hello }

create_boot_directory
install_config {
<config>
<parent-provides>
<service name="LOG"/>
<service name="PD"/>
<service name="CPU"/>
<service name="ROM"/>
</parent-provides>
<default-route>
<any-service> <parent/> <any-child/> </any-service>
</default-route>
<default caps="100"/>
<start name="timer">
<resource name="RAM" quantum="1M"/>
<provides> <service name="Timer"/> </provides>
</start>
<start name="hello">
<resource name="RAM" quantum="10M"/>
</start>
</config>
}
build_boot_image [build_artifacts]
append qemu_args "-nographic -m 64"
run_genode_until forever

The modifications are as follows:

• Since the hello component now relies on a timer service, we need to build and
integrate a timer driver into the scenario by extending the build step accordingly.

• We instruct init to spawn the timer driver as an additional component by adding
a <start> node to init’s configuration. Within this node, we declare that the
component provides a service of type “Timer”.

• To enable the hello component to open a “Timer” session at the timer driver, the
default route is modified to consider any children as servers whenever the re-
quested service is not provided by the parent.

• This time, we let the scenario run forever so that we can watch the messages
printed at periodic intervals.

When starting the run script, we can observe the periodic activation of the component
in the log output:


[init] child "timer" announces service "Timer"
[init -> hello] component constructed
[init -> hello] woke up at 12 ms
[init -> hello] woke up at 1008 ms
[init -> hello] woke up at 2005 ms
...


2.6. Next steps

There are several possible ways to continue your exploration.

1. To form a mental model of how Genode works, give Chapter 3 a read. In par-
ticular, Sections 3.1, 3.2, 3.3, and 3.6 deserve your attention. Section 4.7 nicely
complements this theoretical material with a number of illustrative examples.

2. If you are eager to see Genode’s potential unfolded, try out the Genode-based
operating system called Sculpt OS, which is available as a downloadable system
image along with thorough documentation.
Sculpt OS
https://genode.org/download/sculpt

3. Genode comprises a feature set of hundreds of ready-to-use components, which
are fun to discover and to combine. Most components are not only accompanied
with documentation in the form of README files local to the respective source
code, but come with at least one ready-to-use run script or test package, which
illustrates the integration of the component in executable form. After familiar-
izing yourself with the concept of run scripts by reading Section 5.4, review the
run/ and recipes/pkg/ subdirectories of the various source-code repositories, for
example those at repos/os/run/ or repos/gems/run/.

4. To follow the work of the Genode community, learn about current lines of work,
plans, and experiences, head to the Genodians.org community blog.
Genodians.org
https://genodians.org

3. Architecture

Contemporary operating systems are immensely complex to accommodate a large
variety of applications on an ever diversifying spectrum of hardware platforms. Among
the functionalities provided by a commodity operating system are device drivers, pro-
tocol stacks such as file systems and network protocols, the management of hardware
resources, as well as the provisioning of security functions. The latter category is meant
for protecting the confidentiality and integrity of information and the lifelines of critical
functionality. For assessing the effectiveness of such a security function, two questions
must be considered. First, what is the potential attack surface of the function? The an-
swer to this question yields an assessment about the likelihood of a breach. Naturally,
if there is a large number of potential attack vectors, the security function is at high risk.
The second question is: What is the reach of a defect? If the compromised function has
unlimited access to all information processed on the system, the privacy of all users
may be affected. If the function is able to permanently install software, the system may
become prone to back doors.
Today’s widely deployed operating systems do not isolate security-critical functions
from the rest of the operating system. On the contrary, they are co-located with most other
operating-system functionality in a single high-complexity kernel. Thereby, those func-
tions are exposed to the other parts of the operating system. The likelihood of a security
breach is as high as the likelihood of bugs in an overly complex kernel. In other words:
It is certain. Moreover, once an in-kernel function has been compromised, the defect
has unlimited reach throughout the system.
The Genode architecture was designed to give more assuring answers to the two
questions stated. Each piece of functionality should be exposed to only those parts of
the system on which it ultimately depends. But it remains hidden from all unrelated
parts. This minimizes the attack surface on individual security functions and thereby
reduces the likelihood for a security breach. In the event that one part of the system
gets compromised, the scope of the defect is limited to the particular fragment and
its dependent parts. Unrelated functionalities remain unaffected. To realize this idea,
Genode composes the system out of many components that interact with each other.
Each component serves a specific role and uses well-defined interfaces to interact with
its peers. For example, a network driver accesses a physical network card and provides
a bidirectional stream of network packets to another component, which, in turn, may
process the packets using a TCP/IP stack and a network application. Even though the
network driver and the TCP/IP stack cooperate when processing network packets, they
live in separate protection domains. So a defective component can neither observe
nor corrupt the internal state of another.
Such a component-based architecture, however, raises a number of questions, which
are addressed throughout this chapter. Section 3.1 explains how components can co-
operate without inherently trusting each other. Section 3.2 answers the questions of
who defines the relationship between components and how components become ac-

quainted with each other. An operating system ultimately acts on physical hardware
resources such as memory, CPUs, and peripheral devices. Section 3.4 describes how
such resources are made available to components. Section 3.5 answers the question of
how a new component comes to life. The variety of relationships between components
and their respective interfaces call for different communication primitives. Section 3.6
introduces Genode’s inter-component communication mechanisms in detail.


3.1. Capability-based security

This section introduces the nomenclature and the general model of Genode’s capability-
based security concept. The Genode OS framework is not tied to one kernel but sup-
ports a variety of kernels as base platforms. On each of those base platforms, Genode
uses different kernel mechanisms to implement the general model as closely as pos-
sible. Note however that not all kernels satisfy the requirements that are needed to
implement the model securely. For assessing the security of a Genode-based system,
the respective platform-specific implementation must be considered. Sections 7.7 and
7.8 provide details for selected kernels.

3.1.1. Capability spaces, object identities, and RPC objects

Each component lives inside a protection domain that provides an isolated execution
environment.


Genode provides an object-oriented way of letting components interact with each
other. Analogously to object-oriented programming languages, which have the notion
of objects and pointers to objects, Genode introduces the notion of RPC objects and
capabilities to RPC objects.
An RPC object provides a remote-procedure call (RPC) interface. Similar to a regular
object, an RPC object can be constructed and accessed from within the same program.
But in contrast to a regular object, it can also be called from the outside of the compo-
nent. What a pointer is to a regular object, a capability is to an RPC object. It is a token
that unambiguously refers to an RPC object. In the following, we represent an RPC
object as follows.

[Figure: an RPC object inside a protection domain, with a circle attached to it]

The circle represents the capability associated with the RPC object. Like a pointer
to an object, which can be used to call a function of the pointed-to object, a capability
can be used to call functions of its corresponding RPC object. However, there are two
important differences between a capability and a pointer. First, in contrast to a pointer
that can be created out of thin air (e. g., by casting an arbitrary number to a pointer),
a capability cannot be created without an RPC object. At the creation time of an RPC
object, Genode creates a so-called object identity that represents the RPC object in the
kernel. Figure 2 illustrates the relationship of an RPC object and its object identity.


Figure 2: Relationship between an RPC object and its corresponding object identity.

For each protection domain, the kernel maintains a so-called capability space, which
is a name space that is local to the protection domain. At the creation time of an RPC
object, the kernel creates a corresponding object identity and lets a slot in the protection
domain’s capability space refer to the RPC object’s identity. From the component’s
point of view, the RPC object A has the name 3. When interacting with the kernel, the
component can use this number to refer to the RPC object A.
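
To make these notions concrete, the following sketch shows how an RPC interface
and a corresponding RPC object are expressed in code. It follows the pattern of
Genode's hello tutorial (repos/hello_tutorial/); the Hello names and the say_hello
function are illustrative.

#include <session/session.h>
#include <base/rpc.h>
#include <base/rpc_server.h>
#include <base/log.h>

namespace Hello { struct Session; struct Session_component; }

/* RPC interface, callable from other protection domains */
struct Hello::Session : Genode::Session
{
    static const char *service_name() { return "Hello"; }

    virtual void say_hello() = 0;

    GENODE_RPC(Rpc_say_hello, void, say_hello);
    GENODE_RPC_INTERFACE(Rpc_say_hello);
};

/* RPC object implementing the interface */
struct Hello::Session_component : Genode::Rpc_object<Session>
{
    void say_hello() override { Genode::log("hello!"); }
};

Associating a Session_component instance with an entrypoint - a step discussed in
Section 3.1.3 - creates the object identity and yields the capability that refers to the
object.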

3.1.2. Delegation of authority and ownership

The second difference between a pointer and a capability is that a capability can be
passed to different components without losing its meaning. The transfer of a capabil-
ity from one protection domain to another delegates the authority to use the capability
to the receiving protection domain. This operation is called delegation and can be per-
formed only by the kernel. Note that the originator of the delegation does not diminish
its authority by delegating a capability. It merely shares its authority with the receiving
protection domain. There is no superficial notion of access rights associated with a ca-
pability. The possession of a capability ultimately enables a protection domain to use it
and to delegate it further. A capability should hence be understood as an access right.
Figure 3 shows the delegation of the RPC object’s capability to a second protection do-
main and a further delegation of the capability from the second to a third protection
domain. Whenever the kernel delegates a capability from one to another protection
domain, it inserts a reference to the RPC object’s identity into a free slot of the target’s
capability space. Within protection domain 2 shown in Figure 3, the RPC object can


Figure 3: The transitive delegation of a capability from one protection domain to others.

be referred to by the number 5. Within protection domain 3, the same RPC object is
known as 2. Note that the capability delegation does not hand over the ownership of
the object identity to the target protection domain. The ownership is always retained
by the protection domain that created the RPC object.
Only the owner of an RPC object is able to destroy it along with the corresponding
object identity. Upon destruction of an object identity, the kernel removes all references
to the vanishing object identity from all capability spaces. This effectively renders the
RPC object inaccessible for all protection domains. Once the object identity for an RPC
object is gone, the owner can destruct the actual RPC object.

3.1.3. Capability invocation

Capabilities enable components to call methods of RPC objects provided by different
protection domains. A component that uses an RPC object plays the role of a client
whereas a component that owns the RPC object acts in the role of a server. The inter-
play between client and server is very similar to a situation where a program calls a
local function. The caller deposits the function arguments at a place where the callee
will be able to pick them up and then passes control to the callee. When the callee takes
over control, it obtains the function arguments, executes the function, copies the results
to a place where the caller can pick them up, and finally hands back the control to the
caller. In contrast to a program-local function call, however, client and server are differ-
ent threads in their respective protection domains. The thread at the server side is called


Figure 4: The RPC objects A and B are associated with the server’s entrypoint. A client has a capability for A but not for B. For brevity, the kernel-protected object identities are not depicted. Instead, the dashed line between the capabilities shows that both capabilities refer to the same object identity.

entrypoint denoting the fact that it becomes active only when a call from a client enters
the protection domain or when an asynchronous notification comes in. Each compo-
nent has at least one initial entrypoint, which is created as part of the component’s
execution environment.
[Figure: a protection domain containing an entrypoint (EP) with a wiggly arrow]

The wiggly arrow denotes that the entrypoint is a thread. Besides being a thread that
waits for incoming requests, the entrypoint is responsible for maintaining the associ-
ation between RPC objects and their corresponding capabilities. The previous figures
illustrated this association with the link between the RPC object and its capability. In
order to become callable from the outside, an RPC object must be associated with a
concrete entrypoint. This operation results in the creation of the object’s identity and
the corresponding capability. During the lifetime of the object identity, the entrypoint
maintains the association between the RPC object and its capability in a data struc-
ture called object pool, which allows for looking up the matching RPC object for a given
capability. Figure 4 shows a scenario where two RPC objects are associated with one
entrypoint in the protection domain of a server. The capability for the RPC object A has
been delegated to a client.
If a protection domain is in possession of a capability, each thread executed within
this protection domain can issue a call to a member function of the RPC object that is
referred to by the capability. Because this is not a normal function call but the invocation
of an object located in a different protection domain, this operation has to be provided
by the kernel. Figure 5 illustrates the interaction of the client, the kernel, and the server.
The kernel operation takes the client-local name of the invoked capability, the opcode
of the called function, and the function arguments as parameters. Upon entering the
kernel, the client’s thread is blocked until it receives a response. The operation of the


Figure 5: Control flow between client and server when the client calls a method of an RPC object.

kernel is represented by the dotted line. The kernel uses the supplied local name as
an index into the client’s capability space to look up the object identity, to which the
capability refers. Given the object identity, the kernel is able to determine the protection
domain and the corresponding entrypoint that is associated with the object identity
and wakes up the entrypoint’s thread with information about the incoming request.
Among this information is the server-local name of the capability that was invoked.
Note that the kernel has translated the client-local name to the corresponding server-
local name. The capability name spaces of client and server are entirely different. The
entrypoint uses this number as a key into its object pool to find the locally implemented
RPC object A that belongs to the invoked capability. It then performs a method call
of the so-called dispatch function on the RPC object. The dispatch function maps the
supplied function opcode to the matching member function and calls this function with
the request arguments.
The member function may produce function results. Once the RPC object’s mem-
ber function returns, the entrypoint thread passes the function results to the kernel by
performing the kernel’s reply operation. At this point, the server’s entrypoint becomes
ready for the next request. The kernel, in turn, passes the function results as return
values of the original call operation to the client and wakes up the client thread.
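
In code, the client side of such an invocation is typically a thin wrapper around the
capability, shown here as a continuation of the illustrative Hello interface from Section
3.1.1. The call<Rpc_say_hello>() expression performs the call operation described
above and blocks until the entrypoint replies. The sketch follows the pattern of
Genode's hello tutorial; the names are illustrative.

#include <base/rpc_client.h>

namespace Hello { struct Session_client; }

struct Hello::Session_client : Genode::Rpc_client<Session>
{
    Session_client(Genode::Capability<Session> cap)
    : Genode::Rpc_client<Session>(cap) { }

    void say_hello() override
    {
        /* blocks the caller until the server's entrypoint replies */
        call<Rpc_say_hello>();
    }
};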


Figure 6: Procedure of delegating a capability specified as RPC argument from a client to a server.

3.1.4. Capability delegation through capability invocation

Section 3.1.2 explained that capabilities can be delegated from one protection domain
to another via a kernel operation. But it left open the question of how this procedure
works. The answer is the use of capabilities as RPC message payload. Similar to how
a caller of a regular function can pass a pointer as an argument, a client can pass a ca-
pability as an argument to an RPC call. In fact, passing capabilities as RPC arguments
or results is synonymous with delegating authority between components. If the kernel
encounters a capability as an argument of a call operation, it performs the steps illus-
trated in Figure 6. The local names are denoted as cap, e. g., cap_arg is the local name of
the object identity at the client side, and cap_translated is the local name of the same object
identity at the server side.

1. The kernel looks up the object identity in the capability space of the client. This
lookup may fail if the client specified a number of an empty slot of its capability
space. Only if the lookup succeeds is the kernel able to obtain the object identity
referred to by the argument. Note that under no circumstances can the client refer
to object identities, for which it has no authority because it can merely specify
the object identities reachable through its capability space. For all non-empty


slots of its capability space, the protection domain was authorized to use their
referenced object identities by means of prior delegations. If the lookup fails,
the translation results in an invalid capability passed to the server.

2. Given the object identity of the argument, the kernel searches the server’s capa-
bility space for a slot that refers to the object identity. Note that the term “search”
does not necessarily refer to an expensive linear search. The efficiency of the op-
eration largely depends on the kernel implementation.

3. If the server already possesses a capability to the object identity, the kernel trans-
lates the argument to the server-local name when passing it as part of the request
to the server. If the server does not yet possess a capability to the argument, the
kernel installs a new entry into the server’s capability space. The new entry refers
to the object identity of the argument. At this point, the authority over the object
identity has been delegated from the client to the server.

4. The kernel passes the translated or just-created local name of the argument as part
of the request to the server.

Even though the above description covered the delegation of a single capability speci-
fied as argument, it is possible to delegate more than one capability with a single RPC
call. Analogously to how capabilities can be delegated from a client to a server as argu-
ments of an RPC call, capabilities can be delegated in the other direction as part of the
reply of an RPC call. The procedure in the kernel is the same in both cases.
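
At the programming-language level, delegation is expressed simply by using
capability types as RPC arguments or results. The following fragment shows a
hypothetical variant of the illustrative Hello interface from Section 3.1.1 with one
function for each direction of delegation; the function names are made up for the
example.

#include <session/session.h>
#include <base/rpc.h>
#include <dataspace/capability.h>

namespace Hello { struct Session; }

struct Hello::Session : Genode::Session
{
    static const char *service_name() { return "Hello"; }

    /* a capability argument delegates authority from client to server */
    virtual void offer(Genode::Dataspace_capability ds) = 0;

    /* a capability result delegates authority from server to client */
    virtual Genode::Dataspace_capability dataspace() = 0;

    GENODE_RPC(Rpc_offer, void, offer, Genode::Dataspace_capability);
    GENODE_RPC(Rpc_dataspace, Genode::Dataspace_capability, dataspace);
    GENODE_RPC_INTERFACE(Rpc_offer, Rpc_dataspace);
};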


Figure 7: Initial relationship between a parent and a newly created child.

3.2. Recursive system structure

The previous section introduced capability delegation as the fundamental mechanism
to share authority over RPC objects between protection domains. But in the given ex-
amples, the client was already in possession of a capability to the server’s RPC object.
This raises the question of how clients get acquainted with servers.

3.2.1. Component ownership

In a Genode system, each component (except for the very first component called core)
has a parent, which owns the component. The ownership relation between a parent and
a child is two-fold.


On the one hand, ownership stands for responsibility. Each component requires phys-
ical resources such as the memory or in-kernel data structures that represent the com-
ponent in the kernel. The parent is responsible for providing a budget of those physical
resources to the child at the child’s creation time but also during the child’s entire life-
time. As the parent has to assign a fraction of its own physical resources to its children,
it is the parent’s natural interest to maintain the balance of the physical resources split
between itself and each of its children. Besides being the provider of resources, the par-
ent defines all aspects of the child’s execution and serves as the child’s primary point
of contact for seeking acquaintances with other components.


On the other hand, ownership stands for control. Because the parent has created its
children out of its own resources, it is in the position to exercise ultimate power over its
children. This includes the decision to destruct a child at any time in order to regain the
resources that were assigned to the child. But it is also in control over the relationships
of the child with other components known to the parent.
Each new component is created as an empty protection domain. It is up to the par-
ent to populate the protection domain with code and data, and to create a thread that
executes the code within the protection domain. At creation time, the parent installs
a single capability called parent capability into the new protection domain. The parent
capability enables the child to perform RPC calls to the parent. The child is unaware of
anything else that exists in the Genode system. It does not even know its own identity
nor the identity of its parent. All it can do is issue calls to its parent using the parent
capability. Figure 7 depicts the situation right after the creation of a child component.
A thread in the parent component created a new protection domain and a thread resid-
ing in the protection domain. It also installed the parent capability referring to an RPC
object provided by the parent. To provide the RPC object, the parent has to maintain an
entrypoint. For brevity, entrypoints are not depicted in this and the following figures.
Section 3.5 covers the procedure of creating a component in detail.
The ownership relation between parent and child implies that each component has
to inherently trust its parent. From a child’s perspective, its parent is as powerful as the
kernel. Whereas the child has to trust its parent, a parent does not necessarily need to
trust its children.

3.2.2. Tree of components

The parent-child relationship is not limited to a single level. Child components are free
to use their resources to create further children, thereby forming a tree of components.
Figure 8 shows an example scenario. The init component creates subsystems accord-
ing to its configuration. In the example, it created two children, namely a GUI and a
launcher. The latter allows the user to interactively create further subsystems. In the
example, launcher was used to start an application.
At each position in the tree, the parent-child interface is the same. The position of a
component within the tree is just a matter of composition. For example, by a mere con-
figuration change of init, the application could be started directly by the init component
and would thereby not be subjected to the launcher.

3.2.3. Services and sessions

The primary purpose of the parent interface is the establishment of communication
channels between components. Any component can inform its parent about a service
that it provides. In order to provide a service, a component needs to create an RPC
object implementing the so-called root interface. The root interface offers functions for


Figure 8: Example of a tree of components. The red arrow represents the ownership relation.

creating and destroying sessions of the service. Figure 9 shows a scenario where the
GUI component announces its service to the init component. The announce function
takes the service name and the capability for the service’s root interface as arguments.
Thereby, the root capability is delegated from the GUI to init.
It is up to the parent to decide what to do with the announced information. The par-
ent may ignore the announcement or remember that the child “GUI” provides a service
“GUI”. A component can announce any number of services via subsequent announce
calls.
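
In code, announcing a service involves an RPC object that implements the root
interface and a single call to the parent. The following sketch is modeled after
Genode's hello tutorial and continues the illustrative Session_component from Section
3.1.1; details such as the Root_component constructor arguments may vary between
framework versions.

#include <base/component.h>
#include <base/heap.h>
#include <root/component.h>

namespace Hello { struct Root; }

struct Hello::Root : Genode::Root_component<Session_component>
{
    /* called for each session request routed to this root interface */
    Session_component *_create_session(const char *) override
    {
        return new (md_alloc()) Session_component();
    }

    Root(Genode::Entrypoint &ep, Genode::Allocator &alloc)
    : Genode::Root_component<Session_component>(ep, alloc) { }
};

void Component::construct(Genode::Env &env)
{
    static Genode::Sliced_heap sliced_heap { env.ram(), env.rm() };
    static Hello::Root root { env.ep(), sliced_heap };

    /* delegate the capability for the root interface to the parent */
    env.parent().announce(env.ep().manage(root));
}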
The counterpart of the service announcement is the creation of a session by a client
by issuing a session request to its parent. Figure 10 shows the scenario where the ap-
plication requests a “GUI” session. Along with the session call, the client specifies the
type of the service and a number of session arguments. The session arguments enable
the client to inform the server about various properties of the desired session. In the
example, the client informs the server that the client’s window should be labeled with
the name “browser”. As a result of the session request, the client expects to obtain a ca-
pability to an RPC object that implements the session interface of the requested service.
Such a capability is called session capability.
When the parent receives a session request from a child, it is free to take a policy
decision on how to respond to the request. This decision is closely related to the man-
agement of resources described in Section 3.3.2. There are the following options.

Parent denies the service The parent may deny the request and thereby prevent the
child from using a particular service.

Parent provides the service The parent could decide to implement the requested
service by itself by handing out a session capability for a locally implemented
RPC object to the child.

Server is another child If the parent has received an announcement of the service
from another child, it may decide to direct the session request to the other child.


Figure 9: The GUI component announces its service to its parent using the parent interface.

Figure 10: The application requests a GUI session using the parent interface.


Forward to grandparent The parent may decide to request a session in the name of
its child from its own parent.

Figure 10 illustrates the latter option where the launcher responds to the application’s
session request by issuing a session request to its parent, the init component. Note
that by requesting a session in the name of its child, the launcher is able to modify the
session arguments according to its policy. In the example, the launcher imposes the
use of a different label on the session. When init receives the session request from the
launcher, it is up to init to take a policy decision with the same principle options. In
fact, each component that sits in between the client and the server along the branches
of the ownership tree can impose its policy onto sessions. The routing of the session
request and the final session arguments as received by the server are the result of the
successive application of all policies along the route.
Because the GUI announced its “GUI” service beforehand, init is in possession of the
root capability, which enables it to create and destroy GUI sessions. It decides to re-
spond to the launcher’s session request by triggering the GUI-session creation at the
GUI component’s root interface. The GUI component responds to this request with
the creation of a new GUI session and attaches the received session arguments to the
new session. The accumulated session policy is thereby tied to the session’s RPC ob-
ject. The RPC object is accompanied with its corresponding session capability, which
is delegated along the entire call chain up to the originator of the session request (Sec-
tion 3.1.2). Once the application’s session request returns, the application can interact
directly with the GUI session using the session capability.
The differentiation between session creation and session use aligns two seemingly
conflicting goals with each other, namely efficiency and the application of the security
policies by potentially many components. All components on the route between client
and server are involved in the creation of the session and can thereby impose their
policies on the session. Once established, the direct communication channel between
client and server via the session capability allows for the efficient interaction between
the two components. For the actual use of the session, the intermediate components are
not on the performance-critical path.

3.2.4. Client-server relationship

Whereas the role of a component as a child is dictated by the strict ownership relation
that implies that the child has to ultimately trust its parent, the role of a component as
client or server is more diverse.
In its role of a client that obtained a session capability as result of a session request
from its parent, a component is unaware of the real identity of the server. It is unable
to judge the trustworthiness of the server. However, it obtained the session from its
parent, which the client ultimately trusts. Whichever session capability was handed
out by the parent, the client is not in the position to question the parent’s decision.


Figure 11: Session creation at the server.

Even though the integrity of the session capability can be taken for granted, the client
does not need to trust the server in the same way as it trusts its parent. By invoking
the capability, the client is in full control over the information it reveals to the server
in the form of RPC arguments. The confidentiality and integrity of its internal state is
protected. Furthermore, the invocation of a capability cannot have side effects on the
client’s protection domain other than the retrieval of RPC results. So the integrity of
the client’s internal state is protected. However, when invoking a capability, the client
hands over the flow of execution to the server. The client is blocked until the server
responds to the request. A misbehaving server may never respond and thereby block
the client infinitely. Therefore, with respect to the liveliness of the client, the client has
to trust the server. To empathize with the role of a component as a client, a capability
invocation can be compared to the call of a function of an opaque 3rd-party library.
When calling such a library function, the caller can never be certain to regain control.
It just expects that a function returns at some point. However, in contrast to a call of a
library function, a capability invocation does not put the integrity and confidentiality
of the client’s internal state at risk.

Servers do not trust their clients When exercising the role of a server, a component
should generally not trust its clients. On the contrary, from the server’s perspective,
clients should be expected to misbehave. This has two practical implications. First, a
server is responsible for validating the arguments of incoming RPC requests. Second, a


server should never make itself dependent on the good will of its clients. For example,
a server should generally not invoke a capability obtained from one of its clients. A ma-
licious client could have delegated a capability to a non-responding RPC object, which
may block the server forever when invoked and thereby make the server unavailable
for all clients. As another example, the server must always be in control over the physi-
cal memory resources used for a shared-memory interface between itself and its clients.
Otherwise, if a client was in control over the used memory, it could revoke the memory
from the server at any time, possibly triggering a fault at the server. The establishment
of shared memory is described in detail in Section 3.6.3. Similarly to the role as client,
the internal state of a server is protected from its clients with respect to integrity and
confidentiality. In contrast to a client, however, the liveness of a server is protected as
well. A server never needs to wait for any response from a client. By responding to an
RPC request, the server immediately becomes ready to accept the next RPC request
without any prior handshake with the client of the first request.

Ownership and lifetime of a session The object identity of a session RPC object
and additional RPC objects that may have been created via the session is owned by the
server. So the server is in control over the lifetime of those RPC objects. The client is not
in the immediate position to dictate the server when to close a session because it has
no power over the server. Instead, the procedure of closing a session follows the same
chain of commands as involved in the session creation. The common parent of client
and server plays the role of a broker, which is trusted by both parties. From the client’s
perspective, closing a session is a request to its parent. The client has to accept that the
response to such a request is up to the policy of the parent. The closing of a session can
alternatively be initiated by all nodes of the component tree that were involved in the
session creation.
From the perspective of a server that is implemented by a child, the request to close
a session originates from its parent, which, as the owner of the server, represents an
authority that must be ultimately obeyed. If the server complies, the object identity
of the session’s RPC object vanishes. Since the kernel invalidates capabilities once their
associated RPC object is destroyed, all capabilities referring to the RPC object - however
delegated - are implicitly revoked as a side effect. Still, a server may ignore the session-
close request. In this case, the parent of a server might take steps to enforce its will by
destructing the server altogether.

Trustworthiness of servers Servers that are shared by clients of different security


levels must be designed and implemented with special care. Besides the correct re-
sponse to session-close requests, another consideration is the adherence to the security
policy as configured by the parent. The mere fact that a server is a child of its parent
does not imply that the parent won’t need to trust it in some respects.


In cases where it is not viable to trust the server (e. g., because the server is based on
ported software that is too complex for thorough evaluation), certain security prop-
erties such as the effectiveness of closing sessions could be enforced by a small (and
thereby trustworthy) intermediate server that sits in-between the real server and the
client. This intermediate server would then effectively wrap the server’s session inter-
face.


3.3. Resource trading

As introduced in Section 3.2.1, child components are created out of the resources of
their respective parent components. This section describes the underlying mechanism.
It first introduces the concept of PD sessions as resource accounts in Section 3.3.1. Sec-
tion 3.3.2 explains how PD sessions are used to trade resources between components.
The resource-trading mechanism ultimately allows servers to become resilient against
client-driven resource-exhaustion attacks. However, such servers need to take special
precautions that are explained in Section 3.3.3. Section 3.3.4 presents a mechanism for
the dynamic balancing of resources among cooperative components.

3.3.1. Resource assignment

In general, it is the operating system’s job to manage the physical resources of the ma-
chine in a way that enables multiple applications to utilize them in a safe and efficient
manner. The physical resources are foremost the physical memory, the processing time
of the CPUs, and devices.

The traditional approach to resource management Traditional operating systems
usually provide abstractions of physical resources to applications running on top of the
operating system. For example, instead of exposing the real interface of a device to an
application, a Unix kernel provides a representation of the device as a pseudo file in the
virtual file system. An application interacts with the device indirectly by operating on
the respective pseudo file via a device-class-specific API (ioctl operations). As another
example, a traditional OS kernel provides each application with an arbitrary amount of
virtual memory, which may be much larger than the available physical memory. The
application’s virtual memory is backed with physical memory only when the appli-
cation actually uses the memory. The pretension of unlimited memory by the kernel
relieves application developers from considering memory as a limited resource. On
the other hand, this convenient abstraction creates problems that are extremely hard or
even impossible to solve by the OS kernel.

• The amount of physical memory that is at the disposal for backing virtual mem-
ory is limited. Traditional OS kernels employ strategies to uphold the illusion of
unlimited memory by swapping memory pages to disk. However, the swap space
on disk is ultimately limited, too. At one point, when the physical resources are
exhausted, the pretension of unlimited memory becomes a leaky abstraction and
forces the kernel to take extreme decisions such as killing arbitrary processes to
free up physical memory.

• Multiple applications including critical applications as well as potentially misbe-
having applications share one pool of physical resources. In the presence of a


misbehaving application that exhausts the physical memory, all applications are
equally put at risk.

• By granting each application the legitimate ability to consume as much
memory as the application desires, applications cannot be held accountable for
their consumption of physical memory. The kernel cannot distinguish a misbe-
having from a well-behaving memory-demanding application.

There are several approaches to relieve those problems. For example, OS kernels that
are optimized for resource utilization may employ heuristics that take the application
behavior into account for parametrizing page-swapping strategies. Another example
is the provisioning of a facility for pinned memory to applications. Such memory is
guaranteed to be backed by physical memory. But such a facility bears the risk of al-
lowing any application to exhaust physical memory directly. Hence, further heuris-
tics are needed to limit the amount of pinned memory an application may use. Those
countermeasures and heuristics, while making the OS kernel more complex, are mere
attempts to fight symptoms; they are unable to solve the actual problems caused by the
lack of accounting. The behavior of such systems remains largely nondeterministic.
As a further consequence of the abstraction from physical resources, the kernel has
to incorporate functionality to support the abstraction. For example, for swapping memory
pages to disk, the kernel has to depend on an in-kernel disk driver. For each application,
whether or not it ever touches the disk, the in-kernel disk driver is part of its trusted
computing base.

PD sessions and balances Genode does not abstract from physical resources. In-
stead, it solely arbitrates the access to such resources and provides means to delegate
the authority over resources between components. Low-level physical resources are
represented as services provided by the core component at the root of the component
tree. The core component is described in detail in Section 3.4. The following description
focuses on memory as the most prominent low-level resource managed by the operat-
ing system. Processing time is subject to the kernel’s scheduling policy whereas the
management of the higher-level resources such as disk space is left to the respective
servers that provide those resources.
Physical memory is handed out and accounted by the PD service of core. The best
way to describe the idea is to draw an analogy between the PD service and a bank. Each
PD session corresponds to a bank account. Initially, when opening a new account, there
is no balance. However, by having the authority over an existing bank account with a
balance, one can transfer funds from the existing account to the new account. Naturally,
such a transaction will decrease the balance of the originating account. Internally at the
bank, the transfer does not involve any physical bank notes. The transaction is merely a
change of balances of both bank accounts involved. A bank customer with the author-
ity over a given bank account can use the value stored on the bank account to purchase


Child

Init
 2 3

transf er(amount, 3)
Core
PD session PD session

Figure 12: Init assigns a portion of its memory to a child. In addition to its own PD session (2),
init has created a second PD session (3) designated for its child.

physical goods while withdrawing the costs from the account. Such a withdrawal will
naturally decrease the balance on the account. If the account is depleted, the bank de-
nies the purchase attempt. Analogously to purchasing physical goods by withdrawing
balances from a bank account, physical memory can be allocated from a PD session.
The balance of the PD session is the PD session’s quota. A piece of allocated physical
memory is represented by a so-called dataspace (see Section 3.4.1 for more details). A
RAM dataspace is a container of physical memory that can be used for storing data.

Subdivision of budgets Similar to a person with a bank account, each component of
a Genode system has a session at core’s PD service. At boot time, the core component
creates an initial PD session with the balance set to the amount of available physical
memory. This PD session is designated for the init component, which is the first and
only child of core. On request by init, core delegates the capability for this initial PD
session to the init component.
For each child component spawned by the init component, init creates a new PD
session at core. Figure 12 exemplifies this step for one child. As the result of the
session creation, init obtains the capability for the new PD session. Because it has the
authority over both its own and the child’s designated PD session, it can transfer a cer-
tain amount of RAM quota from its own account to the child’s account by invoking
its own PD-session capability and specifying the beneficiary’s PD-session capability as
argument. Core responds to the request by atomically adjusting the quotas of both
PD sessions by the specified amount. In the case of init, the amount depends on init’s
configuration. Thereby, init explicitly splits its own RAM budget among its child com-
ponents. Each child created by init can obtain the capability for its own PD session from
init via the parent interface and thereby gains the authority over the memory budget
that was assigned to it. Note, however, that no child has the authority over init’s PD
session nor the PD sessions of any siblings. The mechanism for distributing a given
budget among multiple children works recursively. The children of init can follow the
same procedure to further subdivide their budgets for spawning grandchildren.

Figure 12: Init assigns a portion of its memory to a child. In addition to its own PD session (2),
init has created a second PD session (3) designated for its child.
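
Expressed in the framework’s C++ API, such a quota transfer is a plain RPC at core’s
PD service. The following parent-side fragment is a minimal sketch, assuming the
parent already created the child’s PD session and holds its capability as child_pd_cap;
the exact signatures may differ between framework versions.

    #include <base/env.h>
    #include <pd_session/client.h>

    void fund_child(Genode::Env &env,
                    Genode::Capability<Genode::Pd_session> child_pd_cap)
    {
        Genode::Pd_session_client child_pd(child_pd_cap);

        /* register the parent's own PD session as reference account */
        child_pd.ref_account(env.pd_session_cap());

        /* move 1 MiB of RAM quota from the parent to the child */
        env.pd().transfer_quota(child_pd_cap, Genode::Ram_quota{1024*1024});
    }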

Protection against resource stealing A parent that created a child subsystem out
of its own memory resources expects to regain the spent resources when destructing
the subsystem. For this reason, it must not be possible for a child to transfer funds
to another branch of the component tree without the consent of the parent. Figure 13
illustrates an example scenario that violates this expectation. The client and server com-
ponents conspire to steal memory from the child. The client was created by the child
and received a portion of the child’s memory budget. The client requested a session
for a service that was eventually routed to the server. The client-server relationship
allows the client to delegate capabilities to the server. Therefore, it is able to delegate
its own PD session capability to the server. The server, now in possession of the client’s
and its own PD session capabilities, can transfer memory from the client’s to its own
PD session. After this transaction, the child has no way to regain its memory resources
because it has no authority over the server’s PD session.

Figure 13: Memory-stealing attempt
To prevent such resource-stealing scenarios, Genode restricts the quota transfer be-
tween arbitrary PD sessions. Each PD session must have a reference PD session, which
can be defined only once. Transfers are permitted only between a PD session and its
reference PD session. When creating the PD session of a child component, the parent
registers its own PD session as the child’s reference PD session. This way, the parent
becomes able to transfer budgets between its own and the child’s PD session.

PD session destruction When a PD session is closed, core destroys all dataspaces
that were allocated from the PD session and transfers the PD session’s final budget to
the corresponding reference PD session.

3.3.2. Trading memory between clients and servers

An initial assignment of memory to a child is not always practical because the memory
demand of a given component may be unknown at its construction time. For example,
the memory needed by a GUI server over its lifetime is not known a priori but depends
on the number of its clients, the number of windows on screen, or the amount of pixel
data that must be kept at the server. In many cases, the memory usage of a server depends on
the behavior of its clients. In traditional operating systems, system services like a GUI
server would allocate memory on behalf of its clients. Even though an allocation is
induced by a client, the server performs the allocation. The OS kernel remains unaware
of the fact that the server solely needs the allocated memory for serving its client. In the
presence of a misbehaving client that issues an unbounded number of requests to the server
where each request triggers a server-side allocation (for example the creation of a new
window), the kernel will observe the server as a resource hog. Under resource pressure,
it will likely select the server to be punished. Each server that performs allocations on
behalf of its clients is prone to this kind of attack. Genode solves this problem by letting
clients pay for server-side allocations. Client and server may be arbitrary nodes in the
component tree.

Session quotas As described in the previous section, at the creation time of a child,
the parent assigns a part of its own memory quota to the new child. Since the parent
retains the PD-session capabilities of all its children, it can issue further quota trans-
fers back and forth between the children’s PD sessions and its own PD session, which
represents the reference account for all children. When a child requests a session at the
parent interface, it can attach a fraction of its quota to the new session by specifying an
amount of memory to be donated to the server as a session argument. This amount is
called session quota. The session quota can be used by the server during the lifetime of
the session. It is returned to the client when the session is closed.
When receiving a session request, the parent has to distinguish three different cases
depending on its session-routing decision as described in Section 3.2.3.

Parent provides the service If the parent provides the requested service by itself, it
first checks whether the session quota meets its need for providing the service. If
so, it transfers the session quota from the requesting child’s PD session to its own
PD session. This step may fail if the child offered a session quota larger than the
available quota in the child’s PD session.

Server is another child If the parent decides to route the session request to another
child, it transfers the session quota from the client’s PD session to the server’s
PD session. Because the PD sessions are not related to each other as both have
the parent’s PD session as reference account, this transfer from the client to the
server consists of two steps. First, the parent transfers the session quota to its
own PD session. If this step succeeded, it transfers the session quota from its own
PD session to the server’s PD session. The parent keeps track of the session quota
for each session so that the quota transfers can be reverted later when closing the
session. Not before the transfer of the session quota to the server’s PD session suc-
ceeded, the parent issues the actual session request at the server’s root interface
along with the information about the transferred session quota.

Forward to grandparent The parent may decide to forward the session request to its
own parent. In this case, the parent requests a session on behalf of its child. The
grandparent neither knows nor cares about the actual origin of the request and
will simply decrease the memory quota of the parent. For this reason, the parent
transfers the session quota from the requesting child to its own PD session before
issuing the session request at the grandparent.

Quota transfers may fail if there is not enough budget on the originating account. In
this case, the parent aborts the session creation and reflects the lack of resources as an
error to the originator of the session request.
This procedure works recursively. Once the server receives the session request along
with the information about the provided session quota, it can use this information to
decide whether or not to provide the session under these resource conditions. It can
also use the information to tailor the quality of the service according to the provided
session quota. For example, a larger session quota might enable the server to use larger
caches or communication buffers for the client’s session.
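
Concretely, the session quota travels as part of the session-argument string that
accompanies each session request and reaches the server via the root interface. The
following argument string is purely illustrative; the ram_quota and cap_quota keys
denote the donated RAM and capability budgets:

    label="terminal", ram_quota=8K, cap_quota=4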

Session upgrades During the lifetime of a session, the initial session quota may turn
out to be too scarce. Usually, the server returns such a scarcity condition as an error of
operations that imply server-side allocations. The client may handle such a condition
by upgrading the session quota of an existing session by issuing an upgrade request to
its parent along with the targeted session capability and the additional session quota.
The upgrade works analogously to the session creation. The server will receive the
information about the upgrade via the root interface of the service.
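
At the client side, handling such a condition commonly takes the shape of a retry
pattern around the failing operation. The following fragment is a minimal sketch; the
gui connection object and the donated amount are illustrative assumptions, and the
helper names may differ between framework versions.

    #include <util/retry.h>

    /* retry the allocating operation, donating more quota on failure */
    Genode::retry<Genode::Out_of_ram>(
        [&] { gui.create_view(); },          /* may throw Out_of_ram  */
        [&] { gui.upgrade_ram(8*1024); });   /* upgrade session quota */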

Closing sessions If a child issues a session-close request to its parent, the parent
determines the corresponding server, which, depending on the route of the original
session request, may be locally implemented, provided by another child, or provided
by the grandparent. Once the server receives the session-close request, it is responsi-
ble for releasing all resources that were allocated from the session quota. The release
of resources should revert all allocations the server has performed on behalf of its client.
Stressing the analogy with the bank account, the server has to sell the physical goods
(i. e., RAM dataspaces) it purchased from the client’s session quota to restore the bal-
ance on its PD session. After the server has reverted all session-specific allocations, the
server’s PD session is expected to have at least as much available budget as the session
quota of the to-be-closed session. As a result, the session quota can be transferred back
to the client.
However, a misbehaving server may fail to release those resources by malice or be-
cause of a bug. For example, the server may be unable to free a dataspace because it
mistakenly used the dataspace for another client’s data. Another example would be a
memory leak in the server. Such misbehavior is detected on the attempt to withdraw
the session quota from the server’s PD session. If the server’s available RAM quota
after closing a session remains lower than the session quota, the server apparently pec-
ulated memory. If the misbehaving server was locally provided by the parent, it has the
full authority to not hand back the session quota to its child. If the misbehaving service
was provided by the grandparent, the parent (and its whole subsystem) has to subor-
dinate. If, however, the server was provided by another child and the child refuses to
release resources, the parent’s attempt to withdraw the session quota from the server’s
PD session will fail. It is up to the policy of the parent to handle such a failure either by
punishing the server (e. g., killing the component) or by granting more of its own quota.
Generally, misbehavior is against the server’s own interests. A server’s best interest is
to obey the parent’s close request to avoid intervention.

3.3.3. Component-local heap partitioning

Components that perform memory allocations on behalf of untrusted parties must
take special precautions for the component-local memory management. There are two
prominent examples for such components. As discussed in Section 3.3.2, a server may
be used by multiple clients that must not interfere with each other. Therefore, server-
side memory allocations on behalf of a particular client must strictly be accounted to
the client’s session quota. Second, a parent with multiple children may need to allocate
memory to perform the book keeping for the individual children, for example, main-
taining the information about their open sessions and their session quotas. The parent
should account those child-specific allocations to the respective children. In both cases,
it is not sufficient to merely keep track of the amount of memory consumed on behalf
of each untrusted party but the actual allocations must be performed on independent
backing stores.
Figure 14 shows a scenario where a server performs anonymous memory allocations
on behalf of two sessions. The memory is allocated from the server’s heap. Whereas allo-
cations from the heap are of byte granularity, the heap’s backing store consists of several
dataspaces. Those dataspaces are allocated from the server’s PD session as needed but
at a much larger granularity. As depicted in the figure, allocations from both sessions
end up in the same dataspaces. This becomes a problem once one session is closed. As
described in the previous section, the server’s parent expects the server to release all
resources that were allocated from the corresponding session quota. However, even if
the server reverts all heap allocations that belong to the to-be-closed session, the server
could still not release the underlying backing store because all dataspaces are still oc-
cupied with memory objects of another session. Therefore, the server becomes unable
to comply with the parent’s expectation.

Figure 14: A server allocates anonymous memory on behalf of multiple clients from a single
heap.
The solution of this problem is illustrated in Figure 15. For each session, the server
maintains a separate heap partition. Each memory allocation on behalf of a client is
performed from the session-specific heap partition rather than from a global heap. This
way, memory objects of different sessions populate disjoint dataspaces. When clos-
ing a session, the server reverts all memory allocations from the session’s heap. After
freeing the session’s memory objects, the heap partition becomes empty and can be
destroyed. By destroying the heap partition, the underlying dataspaces that were used
as the backing store can be properly released.

Figure 15: A server performs memory allocations from session-specific heap partitions.
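
In terms of the C++ API, a heap partition is merely a distinct allocator instance per
session. The following sketch assumes that _ram refers to a RAM allocator that is
bound to the session quota, for example, a Genode::Constrained_ram_allocator set up
by the server for each session.

    #include <base/heap.h>

    struct Session_component
    {
        Genode::Ram_allocator &_ram;   /* draws from the session quota */
        Genode::Region_map    &_rm;    /* component's address space */

        /* session-private heap partition; destructing it at session-
           close time releases all underlying dataspaces at once */
        Genode::Heap _heap { _ram, _rm };

        Session_component(Genode::Ram_allocator &ram, Genode::Region_map &rm)
        : _ram(ram), _rm(rm) { }
    };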

3.3.4. Dynamic resource balancing

As described in Section 6.2.2, parent components explicitly assign physical resource
budgets to their children. Once assigned, the budget is at the disposal of the respective
child subsystem until the subsystem gets destroyed by the parent.
However, not all components have well-defined resource demands. For example, a
block cache should utilize as much memory as possible unless the memory is needed
by another component. The assignment of a fixed amount of memory to such a block
cache cannot accommodate changes of workloads over the potentially long lifetime of
the component. If dimensioned too small, there may be a lot of slack memory remain-
ing unutilized. If dimensioned too large, the block cache would prevent other and
possibly more important components from using the memory. A better alternative is to en-
able a component to adapt its resource use to the resource constraints of its parent. The
parent interface supports this alternative with a protocol for the dynamic balancing of
resources.
The resource-balancing protocol uses a combination of synchronous remote proce-
dure calls and asynchronous notifications. Both mechanisms are described in Section
3.6. The child uses remote procedure calls to talk to its parent whereas the parent uses
asynchronous notifications to signal state changes to the child. The protocol consists of
two parts, which are complementary.

Resource requests By issuing a resource request to its parent, a child applies for
an upgrade of its resources. The request takes the amount of desired resources as ar-
gument. A child would issue such a request if it detects scarceness of resources. A
resource request returns immediately regardless of whether additional resources have
been granted or not. The child may proceed working under the low resource conditions
or it may block and wait for a resource-available signal from its parent. The parent
may respond to this request in different ways. It may just ignore the request, possibly
stalling the child. Alternatively, it may immediately transfer additional quota to the
child’s PD session. Or it may take further actions to free up resources to accommodate
the child. Those actions may involve long-running operations such as the destruction
of subsystems or the further propagation of resource requests towards the root of the
component tree. Once the parent has freed up enough resources to accommodate the
child’s request, it transfers the new resources to the child’s PD session and notifies the
child by sending a resource-available signal.


Yield requests The second part of the protocol enables the parent to express its wish
for regaining resources. The parent notifies the child about this condition by sending
a yield signal to the child. On the reception of such a signal, the child picks up the
so-called yield request at the parent using a remote procedure call. The yield request
contains the amount of resources the parent wishes to regain. It is up to the child to
comply with a yield request or not. Some subsystems have meaningful ways to re-
spond to yield requests. For example, an in-memory block cache could write back the
cached information and release the memory consumed by the cache. Once the child has
succeeded in freeing up resources, it reports to the parent by issuing a so-called yield
response via a remote procedure call to the parent. The parent may respond to a yield
response by withdrawing resources from the child’s PD session.
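
The child-local part of the protocol may be sketched as follows. The _release_cache
helper stands in for component-specific logic, and the parent-interface calls are
assumptions that may differ in detail between framework versions.

    #include <base/component.h>
    #include <base/signal.h>

    struct Main
    {
        Genode::Env &_env;

        Genode::Signal_handler<Main> _yield_handler {
            _env.ep(), *this, &Main::_handle_yield };

        /* placeholder for freeing cached data */
        void _release_cache(Genode::Parent::Resource_args const &) { }

        void _handle_yield()
        {
            /* amount of resources the parent wishes to regain */
            Genode::Parent::Resource_args args = _env.parent().yield_request();

            _release_cache(args);

            /* report back once resources were freed up */
            _env.parent().yield_response();
        }

        Main(Genode::Env &env) : _env(env)
        {
            _env.parent().yield_sigh(_yield_handler);
        }
    };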


3.4. Core - the root of the component tree

Core is the first user-level component, which is directly created by the kernel. It thereby
represents the root of the component tree. It has access to the raw physical resources
such as memory, CPUs, memory-mapped devices, interrupts, I/O ports, and boot mod-
ules. Core exposes those low-level resources as services so that they can be used by
other components. For example, physical memory is made available as so-called RAM
dataspaces allocated from core’s PD service, interrupts are represented by the IRQ ser-
vice, and CPUs are represented by the CPU service. In order to access a resource, a
component has to establish a session to the corresponding service. Thereby the access
to physical resources is subjected to the routing of session requests as explained in Sec-
tion 3.2.3. Moreover, the resource-trading concept described in Section 3.3.2 applies to
core services in the same way as for any other service.
In addition to making hardware resources available as services, core provides all
prerequisites to bootstrap the component tree. These prerequisites comprise services
for creating protection domains, for managing address-space layouts, and for creating
object identities.
Core is almost free from policy. There are no configuration options. The only policy
of core is the startup of the init component, to which core grants all available resources.
Init, in turn, uses those resources to spawn further components according to its config-
uration.
Section 3.4.1 introduces dataspaces as containers of memory or memory-like re-
sources. Dataspaces form the foundation for most of the core services described in the
subsequent sections. The section is followed by the introduction of each individual
service provided by core. In the following, a component that has established a session
to such a service is called client. For example, a component that obtained a session to
core’s CPU service is a CPU client.

3.4.1. Dataspaces
A dataspace is an RPC object (interface specification in Section 8.1.3) that resides in
core and represents a contiguous physical
address-space region with an arbitrary size. Its base address and size are subjected to
the granularity of physical pages as dictated by the memory-management unit (MMU)
hardware. Typically the granularity is 4 KiB.
Dataspaces are created and managed via core’s services. Because each dataspace is
a distinct RPC object, the authority over the contained physical address range is repre-
sented by a capability and can thereby be delegated between components. Each com-
ponent in possession of a dataspace capability can make the dataspace content visible
in its local address space. Hence, by the means of delegating dataspace capabilities,
components can establish shared memory.

On Genode, only core deals with physical memory pages. All other components use
dataspaces as a uniform abstraction for memory, memory-mapped I/O regions, and
ROM modules.
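
As an illustration, the following sketch allocates a RAM dataspace from the
component’s own PD session and makes its content visible locally, using the
Attached_ram_dataspace convenience wrapper.

    #include <base/attached_ram_dataspace.h>

    void example(Genode::Env &env)
    {
        /* allocate 4 KiB of RAM and attach it to the local address space */
        Genode::Attached_ram_dataspace ds(env.ram(), env.rm(), 4096);

        /* access the dataspace content */
        char *p = ds.local_addr<char>();
        p[0] = 42;

        /* ds.cap() yields the dataspace capability, which may be
           delegated to other components to establish shared memory */
    }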

3.4.2. Region maps


A region map represents the layout of a virtual address space (interface specification
in Section 8.4). The size of the virtual
address space is defined at its creation time. Region maps are created implicitly as part
of a PD session (Section 3.4.4) or manually via the RM service (Section 3.4.5).

Populating an address space The concept behind region maps is a generalization of
the MMU’s page-table mechanism. Analogously to how a page table is populated with
physical page frames, a region map is populated with dataspaces. Under the hood, core
uses the MMU’s page-table mechanism as a cache for region maps. The exact way of
how MMU translations are installed depends on the underlying kernel and is opaque
to Genode components. On most base platforms, memory mappings are established in
a lazy fashion by core’s page-fault resolution mechanism described in Section 7.3.4.
A region-map client in possession of a dataspace capability is able to attach the datas-
pace to the region map. Thereby the content of the dataspace becomes visible within
the region map’s virtual address space. When attaching a dataspace to a region map,
core selects an appropriate virtual address range that is not yet populated with datas-
paces. Alternatively, the client can specify a designated virtual address. It also has the
option to attach a mere window of the dataspace to the region map. Furthermore, the
client can specify whether the content of the dataspace should be executable or not.
The counterpart of the attach operation is the detach operation, which enables the
region-map client to remove dataspaces from the region map by specifying a virtual ad-
dress. Under the hood, this operation flushes the MMU mappings of the corresponding
virtual address range so that the dataspace content becomes invisible.
Note that a single dataspace may be attached to any number of region maps. A
dataspace may also be attached multiple times to one region map. In this case, each
attach operation populates a distinct region of the virtual address space.
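
The attach and detach operations may be sketched as follows, here via the RAII
wrapper Attached_dataspace, which detaches the dataspace when going out of scope.

    #include <base/attached_dataspace.h>
    #include <base/log.h>

    void inspect(Genode::Env &env, Genode::Dataspace_capability ds_cap)
    {
        /* attach the dataspace to the component's address space */
        Genode::Attached_dataspace ds(env.rm(), ds_cap);

        char const *content = ds.local_addr<char const>();
        Genode::log("first byte: ", content[0]);

    } /* leaving the scope detaches the dataspace again */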

3.4.3. Access to boot modules (ROM)

During the initial bootstrap phase of the machine, a boot loader loads the kernel’s bi-
nary and additional chunks of data called boot modules into the physical memory. After
those preparations, the boot loader passes control to the kernel. Examples of boot mod-
ules are the ELF images of the core component, the init component, the components
created by init, and the configuration of the init component. Core makes each boot
module available as a ROM session (interface specification in Section 8.5.2). Because
boot modules are read-only memory,
they are generally called ROM modules. On session construction, the client specifies
the name of the ROM module as session argument. Once created, the ROM session
allows its client to obtain a ROM dataspace capability. Using this capability, the client
can make the ROM module visible within its local address space. The ROM session
interface is described in more detail in Section 4.5.1.
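
For illustration, the following sketch requests the ROM module named "config" and
prints its size. The Attached_rom_dataspace wrapper combines the session creation
with the local attachment of the ROM dataspace.

    #include <base/attached_rom_dataspace.h>
    #include <base/log.h>

    void read_config(Genode::Env &env)
    {
        Genode::Attached_rom_dataspace config(env, "config");

        /* the module content is now locally accessible */
        Genode::log("config is ", config.size(), " bytes");
    }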

3.4.4. Protection domains (PD)

A protection domain (PD) corresponds to a unit of protection within the Genode sys-
tem. Typically, there is a one-to-one relationship between a component and a PD ses-
sion (interface specification in Section 8.5.1). Each PD consists of a virtual memory
address space, a capability space (Section
3.1.1), and a budget of physical memory and capabilities. Core’s PD service also plays
the role of a broker for asynchronous notifications on kernels that lack the semantics of
Genode’s signalling API.

Physical memory and capability allocation Each PD session contains quota-bounded
allocators for physical memory and capabilities. At session-creation time, its quota is
zero. To make an allocator functional, it must first receive quota from another already
existing PD session, which is called the reference account. Once the reference account is
defined, quota can be transferred back and forth between the reference account and the
new PD session.
Provided that the PD session is equipped with sufficient quota, the PD client can
allocate RAM dataspaces from the PD session. The size of each RAM dataspace is
defined by the client at the time of allocation. The location of the dataspace in physical
memory is defined by core. Each RAM dataspace is physically contiguous and can
thereby be used as DMA buffer by a user-level device driver. In order to set up DMA
transactions, such a device driver can request the physical address of a RAM dataspace
by invoking the dataspace capability.
Closing a PD session destroys all dataspaces allocated from the PD session and re-
stores the original quota. This implies that these dataspaces disappear in all compo-
nents. The quota of a closed PD session is transferred to the reference account.

Virtual memory and capability space At the hardware-level, the CPU isolates dif-
ferent virtual memory address spaces via a memory-management unit. Each domain is
represented by a different page directory, or an address-space ID (ASID). Genode pro-
vides an abstraction from the underlying hardware mechanism in the form of region
maps as introduced in Section 3.4.2. Each PD is readily equipped with three region
maps. The address space represents the layout of the PD’s virtual memory address space,
the stack area represents the portion of the PD’s virtual address space where stacks are
located, and the linker area is designated for dynamically linked shared objects. The
stack area and linker area are attached to the address space at component-initialization
time.
The capability space is provided as a kernel mechanism. Note that not all kernels pro-
vide equally good mechanisms to implement Genode’s capability model as described
in Section 3.1. On kernels with support for kernel-protected object capabilities, the PD
session interface allows components to create and manage kernel-protected capabili-
ties. Initially, the PD’s capability space is empty. However, the PD client can install a
single capability - the parent capability - using the assign-parent operation at the creation
time of the PD.

3.4.5. Region-map management (RM)

As explained in Section 3.4.4, each PD session is equipped with three region maps by
default. The RM service allows components to create additional region maps manually.
Such manually created region maps are also referred to as managed dataspaces. A man-
aged dataspace is not backed by a range of physical addresses but its content is defined
by its underlying region map. This makes region maps a generalization of nested page
tables. A region-map client can obtain a dataspace capability for a given region map
and use this dataspace capability in the same way as any other dataspace capability,
i. e., attaching it to its local address space, or delegating it to other components.
Managed dataspaces are used in two ways. First, they allow for the manual man-
agement of portions of a component’s virtual address space. For example, the so-called
stack area of a protection domain is a dedicated virtual-address range preserved for
stacks. Between the stacks, the virtual address space must remain empty so that stack
overflows won’t silently corrupt data. This is achieved by using a dedicated region map
that represents the complete stack area. This region map is attached as a dataspace to
the component’s virtual address space. When creating a new thread along with its cor-
responding stack, the thread’s stack is not directly attached to the component’s address
space but to the stack area’s region map. Another example is the virtual-address range
managed by a dynamic linker to load shared libraries into.
The second use of managed dataspaces is the provision of on-demand-populated
dataspaces. A server may hand out dataspace capabilities that are backed by region
maps to its clients. Once the client has attached such a dataspace to its address space
and touches its content, the client triggers a page fault. Core responds to this page
fault by blocking the client thread and delivering a notification to the server that created
the managed dataspace along with the information about the fault address within the
region map. The server can resolve this condition by attaching a dataspace with real
backing store at the fault address, which prompts core to resume the execution of the
faulted thread.
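
Creating a managed dataspace may be sketched as follows; the 1 MiB size is
illustrative, and the exact signatures may differ between framework versions.

    #include <rm_session/connection.h>
    #include <region_map/client.h>

    void example(Genode::Env &env)
    {
        Genode::Rm_connection rm(env);

        /* create a region map covering 1 MiB of virtual addresses */
        Genode::Capability<Genode::Region_map> rm_cap = rm.create(1024*1024);
        Genode::Region_map_client region_map(rm_cap);

        /* the region map can be handed out as a dataspace ... */
        Genode::Dataspace_capability managed_ds = region_map.dataspace();

        /* ... and populated on demand by attaching real dataspaces
           in response to incoming fault notifications */
        (void)managed_ds;
    }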


3.4.6. Processing-time allocation (CPU)


A CPU session (interface specification in Section 8.5.4) is an allocator for processing time
that allows for the creation, the control, and the destruction of threads of execution. At
session-construction time, the affin-
ity of a CPU session with CPU cores can be defined via session arguments.
Once created, the session can be used to create, control, and kill threads. Each thread
created via a CPU session is represented by a thread capability. The thread capability
is used for subsequent thread-control operations. The most prominent thread-control
operation is the start of the thread, which takes the thread’s initial stack pointer and
instruction pointer as arguments.
During the lifetime of a thread, the CPU client can retrieve and manipulate the state
of the thread. This includes the register state as well as the execution state (whether the
thread is paused or running). Those operations are primarily designated for realizing
user-level debuggers.
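
Components rarely use CPU sessions directly but rely on the framework’s Thread
abstraction, which wraps the thread-creation and start operations. A minimal sketch:

    #include <base/thread.h>

    struct Worker : Genode::Thread
    {
        Worker(Genode::Env &env)
        : Genode::Thread(env, "worker", 8*1024 /* stack size */) { }

        void entry() override
        {
            /* code executed in the context of the new thread */
        }
    };

    /* usage: Worker worker(env); worker.start(); */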

3.4.7. Access to device resources (IO_MEM, IO_PORT, IRQ)

Core’s IO_MEM, IO_PORT, and IRQ services enable the realization of user-level device
drivers as Genode components.

Memory mapped I/O (IO_MEM) An IO_MEM session (interface specification in Section 8.5.5)
provides a dataspace rep-
resentation for a non-memory part of the physical address space such as memory-
mapped I/O regions or BIOS areas. In contrast to a memory block that is used for
storing information, of which the physical location in memory is of no concern, a non-
memory object has special semantics attached to its location within the physical ad-
dress space. Its location is either fixed (by standard) or can be determined at runtime,
for example by scanning the PCI bus for PCI resources. If the physical location of such
a non-memory object is known, an IO_MEM session can be created by specifying the
physical base address, the size, and the write-combining policy of the memory-mapped
resource as session arguments. Once an IO_MEM session is created, the IO_MEM client
can request a dataspace containing the specified physical address range.
Core hands out each physical address range only once. Session requests for ranges
that intersect with physical memory are denied. Even though the granularity of mem-
ory protection is limited by the MMU page size, the IO_MEM service accepts the spec-
ification of the physical base address and size at the granularity of bytes. The rationale
behind this contradiction is the unfortunate existence of platforms that host memory-
mapped resources of unrelated devices on the same physical page. When driving such
devices from different components, each of those components requires access to its
corresponding device. So the same physical page must be handed out to multiple com-
ponents. Of course, those components must be trusted to not touch any portion of the
page that is unrelated to their respective devices.

Port I/O (IO_PORT) For platforms that rely on I/O ports for device access, core’s
IO_PORT service enables the fine-grained assignment of port ranges to individual com-
ponents. Each IO_PORT session (interface specification in Section 8.5.6) corresponds
to the exclusive access right to a port
range specified as session arguments. Core creates the new IO_PORT session only if
the specified port range does not overlap with an already existing session. This en-
sures that each I/O port is driven by only one IO_PORT client at a time. The IO_PORT
session interface resembles the physical I/O port access instructions. Reading from an
I/O port can be performed via an 8-bit, 16-bit, or 32-bit access. Vice versa, there ex-
ist operations for writing to an I/O port via an 8-bit, 16-bit, or 32-bit access. The read
and write operations take absolute port addresses as arguments. Core performs the
I/O-port operation only if the specified port address lies within the port range of the
session.
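
Driving a device via an IO_PORT session may be sketched as follows, with the port
range of the PC’s PIT timer (0x40 to 0x43) serving as an illustrative example.

    #include <io_port_session/connection.h>

    void example(Genode::Env &env)
    {
        /* request exclusive access to ports 0x40..0x43 */
        Genode::Io_port_connection ports(env, 0x40, 4);

        ports.outb(0x43, 0x34);                  /* 8-bit write */
        unsigned char const v = ports.inb(0x40); /* 8-bit read  */
        (void)v;
    }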

Reception of device interrupts (IRQ) Core’s IRQ service enables device-driver com-
ponents to respond to device interrupts. Each IRQ session (interface specification in
Section 8.5.7) corresponds to an interrupt.
The physical interrupt number is specified as session argument. Each physical inter-
rupt number can be specified by only one session. The IRQ session interface provides
an operation to wait for the next interrupt. Only while the IRQ client is waiting for
an interrupt, core unmasks the interrupt at the interrupt controller. Once the interrupt
occurs, core wakes up the IRQ client and masks the interrupt at the interrupt controller
until the driver has acknowledged the completion of the IRQ handling and waits for
the next interrupt.
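
In the C++ API, the waiting is typically expressed by the combination of a signal
handler with an explicit acknowledgement, as hinted by the following sketch; the
interrupt number is an illustrative assumption.

    #include <base/component.h>
    #include <irq_session/connection.h>

    struct Driver
    {
        Genode::Env &_env;

        Genode::Irq_connection _irq { _env, 11 };

        Genode::Signal_handler<Driver> _irq_handler {
            _env.ep(), *this, &Driver::_handle_irq };

        void _handle_irq()
        {
            /* ... interact with the device ... */

            _irq.ack_irq();   /* unmask and await the next interrupt */
        }

        Driver(Genode::Env &env) : _env(env)
        {
            _irq.sigh(_irq_handler);
            _irq.ack_irq();
        }
    };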

3.4.8. Logging (LOG)

The LOG service is used by the lowest-level system components such as the init com-
ponent for printing diagnostic output. Each LOG session (interface specification in
Section 8.5.8) takes a label as session ar-
gument, which is used to prefix the output of this session. This enables developers to
distinguish the output of different components with each component having a unique
label. The LOG client transfers the to-be-printed characters as payload of plain RPC
messages, which represents the simplest possible communication mechanism between
the LOG client and core’s LOG service.
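
Components usually do not interact with LOG sessions directly but use the
framework’s log API, which is backed by a LOG session under the hood.

    #include <base/log.h>

    void example()
    {
        Genode::log("hello from a component");
        Genode::warning("a curious condition occurred");
        Genode::error("something went wrong");
    }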

3.4.9. Event tracing (TRACE)

The TRACE service provides a light-weight event-tracing facility. It is not fundamental
to the architecture. However, as the service allows for the inspection and manipula-
tion of arbitrary threads of a Genode system, TRACE sessions must not be granted to
untrusted components.



3.5. Component creation

Each Genode component is made out of three basic ingredients:


PD session representing the component’s protection domain

ROM session with the executable binary

CPU session for creating the initial thread of the component


It is the responsibility of the new component’s parent to obtain those sessions. The
initial situation of the parent is depicted in Figure 16. The parent’s memory budget
is represented by the parent’s PD (Section 3.4.4) session. The parent’s virtual address
space is represented by the region map contained in the parent’s PD session. The par-
ent’s PD session was originally created at the parent’s construction time. Along with
the parent’s CPU session, it forms the parent’s so-called environment. The address space
is populated with the parent’s code (shown as red), the so-called stack area that hosts
the stacks (shown as blue), and presumably several RAM dataspaces for the heap, the
DATA segment, and the BSS segment. Those are shown as yellow.

Figure 16: Starting point for creating a new component

3.5.1. Obtaining the child’s ROM and PD sessions

The first step for creating a child component is obtaining the component’s executable
binary, e. g., by creating a session to a ROM service such as the one provided by core
(Section 3.4.3). With the ROM session created, the parent can make the dataspace with
the executable binary (i. e., an ELF binary) visible within its virtual address space by
attaching the dataspace to its PD’s region map. After this step, the parent is able to
inspect the ELF header to determine the memory requirements for the binary’s DATA
and BSS segments.
The next step is the creation of the child’s designated PD session, which holds the
memory and capability budgets the child will have at its disposal. The freshly created
PD session has no budget though. In order to make the PD session usable, the parent
has to transfer a portion of its own RAM quota to the child’s PD session. As explained
in Section 6.2.2, the parent registers its own PD session as the reference account for the
child’s PD session in order to become able to transfer quota back and forth between
both PD sessions. Figure 17 shows the situation.

Figure 17: The parent creates the PD session of the new child and obtains the child’s executable

3.5.2. Constructing the child’s address space

With the child’s PD session equipped with memory, the parent can construct the ad-
dress space for the new child and populate it with memory allocated from the child’s
budget (Figure 18). The address-space layout is represented as a region map that is part
of each PD session (Section 3.4.4). The first page of the address space is excluded such
that any attempt by the child to de-reference a null pointer will cause a fault instead of
silently corrupting memory. After its creation time, the child’s region map is empty. It
is up to the parent to populate the virtual address space with meaningful information
by attaching dataspaces to the region map. The parent performs this procedure based
on the information found in the ELF executable’s header:

Figure 18: The parent creates and populates the virtual address space of the child using a new
PD session (the parent’s PD session is not depicted for brevity)

Read-only segments For each read-only segment of the ELF binary, the parent at-
taches the corresponding portion of the ELF dataspace to the child’s address space
by invoking the attach operation on the child’s region-map capability. By attach-
ing a portion of the existing ELF dataspace to the new child’s region map, no
memory needs to be copied. If multiple instances of the same executable are created,
the read-only segments of all instances refer to the same physical memory pages.
If the segment contains the TEXT segment (the program code), the parent speci-
fies a so-called executable flag to the attach operation. Core passes this flag to the
respective kernel such that the corresponding page-table entries for the new com-
ponent will be configured accordingly (by setting or clearing the non-executable
bit in the page-table entries). Note that the propagation of this information (or
the lack thereof) depends on the kernel used. Also note that not all hardware
platforms distinguish executable from non-executable memory mappings.

Read-writeable segments In contrast to read-only segments, read-writeable seg-
ments cannot be shared between components. Hence, each read-writeable seg-
ment must be backed with a distinct copy of the segment data. The parent
allocates the backing store for the copy from the child’s PD session and thereby
accounts the memory consumption on behalf of the child to the child’s budget.
For each segment, the parent performs the following steps:
1. Allocation of a RAM dataspace from the child’s PD session. The size of the
dataspace corresponds to the segment’s memory size. The memory size may
be higher than the size of the segment in the ELF binary (called the file size). In
particular, if the segment contains a DATA section followed by a BSS section,
the file size corresponds to the size of the DATA section whereby the memory
size corresponds to the sum of both sections. Core’s PD service ensures that
each freshly allocated RAM dataspace is guaranteed to contain zeros. Core’s
PD service returns a RAM dataspace capability as the result of the allocation
operation.


2. Attachment of the RAM dataspace to the parent’s virtual address space by
invoking the attach operation on the parent’s region map with the RAM
dataspace capability as argument.
3. Copying of the segment content from the ELF binary’s dataspace to the
freshly allocated RAM dataspace. If the memory size of the segment is
larger than the file size, no special precautions are needed as the remainder
of the RAM dataspace is known to be initialized with zeros.
4. After filling the content of the segment dataspace, the parent no longer needs
to access it. It can remove it from its virtual address space by invoking the
detach operation on its own region map.
5. Based on the virtual segment address as found in the ELF header, the parent
attaches the RAM dataspace to the child’s virtual address space by invoking
the attach operation on the child PD’s region map with the RAM dataspace
as argument.

This procedure is repeated for each segment. Note that although the above descrip-
tion refers to ELF executables, the underlying mechanisms used to load the executable
binary are file-format agnostic.
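
For illustration, the five steps for one read-writeable segment may be condensed into
the following fragment. The names child_pd, child_rm, and seg are assumptions, and
the exact allocation and attach signatures vary between framework versions.

    /* 1. allocate backing store from the child's budget */
    Genode::Ram_dataspace_capability ds = child_pd.alloc(seg.mem_size);

    /* 2. + 3. attach the dataspace locally and copy the segment content */
    {
        Genode::Attached_dataspace local(env.rm(), ds);
        Genode::memcpy(local.local_addr<char>(), seg.content, seg.file_size);
    } /* 4. leaving the scope detaches the dataspace from the parent */

    /* 5. attach the dataspace at the segment's virtual address within
          the child's address space */
    child_rm.attach_at(ds, seg.vaddr);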

3.5.3. Creating the initial thread

With the virtual address space of the child configured, it is time to create the compo-
nent’s initial thread. Analogously to the child’s PD session, the parent creates a CPU
session (Section 3.4.6) for the child. The parent may use session arguments to constrain
the scheduling parameters (i. e., the priority) and the CPU affinity of the new child.
Whichever session arguments are specified, the child’s abilities will never exceed the
parent’s abilities. I.e., the child’s priority is subjected to the parent’s priority constraints.
Once constructed, the CPU session can be used to create new threads by invoking the
session’s create-thread operation with the thread’s designated PD as argument. Based
on this association of the thread with its PD, core is able to respond to page faults trig-
gered by the thread. The invocation of this operation results in a thread capability,
which can be used to control the execution of the thread. Immediately after its creation,
the thread remains inactive. In order to be executable, it first needs to be configured.
As described in Section 3.2.1, each PD has initially a single capability installed, which
allows the child to communicate with its parent. Right after the creation of the PD for
a new child, the parent can register a capability to a locally implemented RPC object
as parent capability for the PD session. Now that the child’s PD is equipped with an
initial thread and a communication channel to its parent, it is the right time to kick off
the execution of the thread by invoking the start operation on its thread capability. The
start operation takes the initial program counter as argument, which corresponds to
the program’s entry-point address as found in the ELF header of the child’s executable
binary. Figure 19 illustrates the relationship between the PD session, the CPU session,
and the parent capability. Note that neither the ROM dataspace containing the ELF
binary nor the RAM dataspaces allocated during the ELF loading are visible in the
parent’s virtual address space any longer. After the initial loading of the ELF binary,
the parent has detached those dataspaces from its own region map.

Figure 19: Creation of the child’s initial thread
The child starts its execution at the virtual address defined by the ELF entrypoint. It
points to a short assembly routine that sets up the initial stack and calls the low-level
C++ startup code. This code, in turn, initializes the C++ runtime (such as the exception
handling) along with the component’s local Genode environment. The environment is
constructed by successively requesting the component’s CPU and PD sessions from its
parent. With the Genode environment in place, the startup code initializes the stack
area, sets up the real stack for the initial thread within the stack area, and returns to
the assembly startup code. The assembly code, in turn, switches the stack from the
initial stack to the real stack and calls the program-specific C++ startup code. This code
initializes the component’s initial entrypoint and executes all global constructors before
calling the component’s construct function. Section 7.1 describes the component-local
startup procedure in detail.


3.6. Inter-component communication

Genode provides three principal mechanisms for inter-component communication,
namely synchronous remote procedure calls (RPC), asynchronous notifications, and
shared memory. Section 3.6.1 describes synchronous RPC as the most prominent one.
In addition to transferring information across component boundaries, the RPC mecha-
nism provides the means for delegating capabilities and thereby authority throughout
the system.
The RPC mechanism closely resembles the semantics of a function call where the con-
trol is transferred from the caller to the callee until the function returns. As discussed
in Section 3.2.4, there are situations where the provider of information does not wish to
depend on the recipient to return control. Such situations are addressed by the means
of an asynchronous notification mechanism explained in Section 3.6.2.
Neither synchronous RPC nor asynchronous notifications are suitable for transfer-
ring large bulks of information between components. RPC messages are strictly bound
to a small size and asynchronous notifications do not carry any payload at all. This
is where shared memory comes into play. By sharing memory between components,
large bulks of information can be propagated without the active participation of the
kernel. Section 3.6.3 explains the procedure of establishing shared memory between
components.
Each of the three basic mechanisms is rarely found in isolation. Most inter-component
interactions are a combination of these mechanisms. Section 3.6.4 introduces a pattern
for propagating state information by combining asynchronous notifications with RPC.
Section 3.6.5 shows how synchronous RPC can be combined with shared memory
to transfer large bulks of information in a synchronous way. Section 3.6.6 combines
asynchronous notifications with shared memory to largely decouple producers and
consumers of high-throughput data streams.


Figure 20: Layered architecture of the RPC mechanism

3.6.1. Synchronous remote procedure calls (RPC)

Section 3.1.3 introduced remote procedure calls (RPC) as Genode’s fundamental mech-
anism to delegate authority between components. It introduced the terminology for
RPC objects, capabilities, object identities, and entrypoints. It also outlined the flow of
control between a client, the kernel, and a server during an RPC call. This section com-
plements Section 3.1.3 with the information of how the mechanism presents itself at the
C++ language level. It first introduces the layered structure of the RPC mechanism and
the notion of typed capabilities. After presenting the class structure of an RPC server, it
shows how those classes interact when RPC objects are created and called.

Typed capabilities Figure 20 depicts the software layers of the RPC mechanism.

Kernel inter-process-communication (IPC) mechanism At the lowest level, the
kernel’s IPC mechanism is used to transfer messages back and forth between
client and server. The actual mechanism largely differs between the various ker-
nels supported by Genode. Chapter 7 gives insights into the functioning of the
IPC mechanism as used on specific kernels. Genode’s capability-based security
model is based on the presumption that the kernel protects object identities as
kernel objects, allows user-level components to refer to kernel objects via capabil-
ities, and supports the delegation of capabilities between components using the
kernel’s IPC mechanism. At the kernel-interface level, the kernel is not aware of
language semantics like the C++ type system. From the kernel’s point of view, an
object identity merely exists and can be referred to, but has no type.

IPC library The IPC library introduces a kernel-independent programming interface
that is needed to implement the principal semantics of clients and servers. For
each kernel supported by Genode, there exists a distinct IPC library that uses the
respective kernel mechanism. The IPC library introduces the notions of untyped
capabilities, message buffers, IPC clients, and IPC servers.


An untyped capability is the representation of a Genode capability at the C++ lan-
guage level. It consists of the local name of the referred-to object identity as well
as a means to manage the lifetime of the capability, i. e., a reference counter. The
exact representation of an untyped capability depends on the kernel used.
A message buffer is a statically sized buffer that carries the payload of an IPC mes-
sage. It distinguishes two types of payload, namely raw data and capabilities.
Payloads of both kinds can be simultaneously present. A message buffer can
carry up to 1 KiB of raw data and up to four capabilities. Prior to issuing the
kernel IPC operation, the IPC library translates the message-buffer content to the
format understood by the kernel’s IPC operation.
The client side of the communication channel executes an IPC call operation with
a destination capability, a send buffer, and a receive buffer as arguments. The
send buffer contains the RPC function arguments, which can comprise plain data
as well as capabilities. The IPC library transfers these arguments to the server
via a platform-specific kernel operation and waits for the server’s response. The
response is returned to the caller as new content of the receive buffer.
At the server side of the communication channel, an entrypoint thread executes
the IPC reply and IPC reply-and-wait operations to interact with potentially many
clients. Analogously to the client, it uses two message buffers, a receive buffer for
incoming requests and a send buffer for delivering the reply of the last request.
For each entrypoint, there exists an associated untyped capability that is created
with the entrypoint. This capability can be combined with an IPC client ob-
ject to perform calls to the server. The IPC reply-and-wait operation delivers the
content of the reply buffer to the last caller and then waits for a new request using
a platform-specific kernel operation. Once unblocked by the kernel, it returns the
arguments for the new request in the request buffer. The server does not obtain
any form of client identification along with an incoming message that could be
used to implement server-side access-control policies. Instead of performing ac-
cess control based on a client identification in the server, access control is solely
performed by the kernel on the invocation of capabilities. If a request was deliv-
ered to the server, the client has – by definition – a capability for communicating
with the server and thereby the authority to perform the request.

RPC stub code The RPC stub code complements the IPC library with the semantics
of RPC interfaces and RPC functions. An RPC interface is an abstract C++ class
with the declarations of the functions callable by RPC clients. Thereby each RPC
interface is represented as a C++ type. The declarations are accompanied with
annotations that allow the C++ compiler to generate the so-called RPC stub code
on both the client side and server side. Genode uses C++ templates to generate
the stub code, which avoids the crossing of a language barrier when designing
RPC interfaces and alleviates the need for code-generating tools in addition to
the compiler.
The client-side stub code translates C++ method calls to IPC-library operations.
Each RPC function of an RPC interface has an associated opcode (according to
the order of RPC functions). This opcode along with the method arguments are
inserted into the IPC client’s send buffer. Vice versa, the stub code translates the
content of the IPC client’s receive buffer to return values of the method invocation.
The server-side stub code implements the so-called dispatch function, which
takes the IPC server’s receive buffer, translates the message into a proper C++
method call, calls the corresponding server-side function of the RPC interface,
and translates the function results into the IPC server’s send buffer.

RPC object and client object Thanks to the RPC stub code, the server-side imple-
mentation of an RPC object comes down to the implementation of the abstract
interface of the corresponding RPC interface. When an RPC object is associated
with an entrypoint, the entrypoint creates a unique capability for the given RPC
object. RPC objects are typed with their corresponding RPC interface. This C++
type information is propagated to the corresponding capabilities. For example,
when associating an RPC object that implements the LOG-session interface with
an entrypoint, the resulting capability is a LOG-session capability.
This capability represents the authority to invoke the functions of the RPC object.
On the client side, the client object plays the role of a proxy of the RPC object
within the client’s component. Thereby, the client becomes able to interact with
the RPC object in a natural manner.

Sessions and connections Section 3.2.3 introduced sessions between client and
server components as the basic building blocks of system composition. At the
server side each session is represented by an RPC object that implements the
session interface. At the client side, an open session is represented by a connec-
tion object. The connection object encapsulates the session arguments and also
represents a client object to interact with the session.

As depicted in Figure 20, capabilities are associated with types on all levels above the
IPC library. Because the IPC library is solely used by the RPC stub code but not at the
framework’s API level, capabilities appear as being C++ type safe, even across com-
ponent boundaries. Each RPC interface implicitly defines a corresponding capability
type. Figure 21 shows the inheritance graph of Genode’s most fundamental capability
types.

Figure 21: Fundamental capability types
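
The declaration of an RPC interface may be sketched as follows, closely following the
framework’s hello-world example; the Hello names are illustrative.

    #include <session/session.h>
    #include <base/rpc.h>

    namespace Hello { struct Session; }

    struct Hello::Session : Genode::Session
    {
        static const char *service_name() { return "Hello"; }

        virtual void say_hello() = 0;
        virtual int  add(int a, int b) = 0;

        /* annotations evaluated by the RPC stub-code templates */
        GENODE_RPC(Rpc_say_hello, void, say_hello);
        GENODE_RPC(Rpc_add, int, add, int, int);
        GENODE_RPC_INTERFACE(Rpc_say_hello, Rpc_add);
    };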

Figure 22: Server-side structure of the RPC mechanism

Server-side class structure Figure 22 gives an overview of the C++ classes that are
involved at the server side of the RPC mechanism. As described in Section 3.1.3, each
entrypoint maintains a so-called object pool. The object pool contains references to
RPC objects associated with the entrypoint. When receiving an RPC request along with
the local name of the invoked object identity, the entrypoint uses the object pool to
look up the corresponding RPC object. As seen in the figure, the RPC object is a class
template parametrized with its RPC interface. When instantiated, the dispatch function
is generated by the C++ compiler according to the RPC interface.


Figure 23: Creation of a new RPC object

RPC-object creation Figure 23 shows the procedure of creating a new RPC object.
The server component has already created an entrypoint, which, in turn, created its
corresponding object pool.

1. The server component creates an instance of an RPC object. “RPC object” denotes
an object that inherits the RPC object class template typed with the RPC interface
and that implements the virtual functions of this interface. By inheriting the RPC
object class template, it gets equipped with a dispatch function for the given RPC
interface.
Note that a single entrypoint can be used to manage any number of RPC objects
of arbitrary types.

2. The server component associates the RPC object with the entrypoint by calling
the entrypoint’s manage function with the RPC object as argument. The entry-
point responds to this call by allocating a new object identity using a session to
core’s PD service (Section 3.4.4). For allocating the new object identity, the entry-
point specifies the untyped capability of its IPC server as argument. Core’s PD
service responds to the request by instructing the kernel to create a new object
identity associated with the untyped capability. Thereby, the kernel creates a new
capability that is derived from the untyped capability. When invoked, the derived
capability refers to the same IPC server as the original untyped capability. But it

represents a distinct object identity. The IPC server retrieves the local name of this
object identity when called via the derived capability. The entrypoint stores the
association of the derived capability with the RPC object in the object pool.

3. The entrypoint hands out the derived capability as return value of the manage
function. At this step, the derived capability is converted into a typed capability
with its type corresponding to the type of the RPC object that was specified as
argument. This way, the link between the types of the RPC object and the corre-
sponding capability is preserved at the C++ language level.

4. The server delegates the capability to another component, e. g., as payload of a
remote procedure call. At this point, the client receives the authority to call the
RPC object.
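
To make these steps concrete, the following minimal sketch shows how a server might define and register an RPC object, assuming the LOG-session interface as example. The namespace Demo and the class Log_component are illustrative names, not part of the framework, and header paths follow Genode's usual conventions but may differ between versions.

  #include <base/component.h>
  #include <base/rpc_server.h>
  #include <log_session/log_session.h>

  namespace Demo { struct Log_component; }

  /* RPC object: inherits the RPC-object class template typed with the
     RPC interface and implements the interface's virtual functions */
  struct Demo::Log_component : Genode::Rpc_object<Genode::Log_session>
  {
      void write(String const &msg) override
      {
          /* ...output 'msg' somewhere... */
      }
  };

  void Component::construct(Genode::Env &env)
  {
      static Demo::Log_component log_object { };

      /* steps 2 and 3: associate the RPC object with the component's
         entrypoint; 'manage' returns a typed LOG-session capability */
      Genode::Capability<Genode::Log_session> cap = env.ep().manage(log_object);

      /* step 4: 'cap' may now be delegated to a client */
      (void)cap;
  }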

RPC-object invocation Figure 24 shows the flow of execution when a client calls an
RPC object by invoking a capability.

[Figure 24: Invocation of an RPC object]

1. The client invokes the given capability using an instance of an RPC client object,
which uses the IPC library to invoke the kernel’s IPC mechanism. The kernel
delivers the request to the IPC server that belongs to the invoked capability and

wakes up the corresponding entrypoint. On reception of the request, the entrypoint
obtains the local name of the invoked object identity.

2. The entrypoint uses the local name of the invoked object identity as a key into its
object pool to look up the matching RPC object. If the lookup fails, the entrypoint
replies with an error.

3. If the matching RPC object was found, the entrypoint calls the RPC object’s dis-
patch method. This method is implemented by the server-side stub code. It con-
verts the content of the receive buffer of the IPC server to a method call. I.e., it
obtains the opcode of the RPC function from the receive buffer to decide which
method to call, and supplies the arguments according to the definition in the RPC
interface.

4. On the return of the RPC function, the RPC stub code populates the send buffer
of the IPC server with the function results and invokes the kernel’s reply opera-
tion via the IPC library. Thereby, the entrypoint becomes ready to serve the next
request.

5. When delivering the reply to the client, the kernel resumes the execution of the
client, which can pick up the results of the RPC call.
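
The client side is correspondingly simple. The sketch below, again assuming a LOG session for illustration, uses the connection object as the RPC client object; the capability invocation and the IPC library remain completely hidden behind an ordinary method call.

  #include <base/component.h>
  #include <log_session/connection.h>

  void Component::construct(Genode::Env &env)
  {
      /* open the session; the connection holds the session capability
         and acts as the client-side proxy of the server's RPC object */
      Genode::Log_connection log { env };

      /* looks like a plain method call but is, in fact, an RPC: the
         stub code marshals the argument, invokes the capability, and
         the server's entrypoint dispatches the request */
      log.write("hello via RPC");
  }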


3.6.2. Asynchronous notifications

The synchronous RPC mechanism described in the previous section is not sufficient to
cover all forms of inter-component interactions. It shows its limitations in the following
situations.

Waiting for multiple conditions In principle, the RPC mechanism can be used by an RPC client to block for a con-
dition at a server. For example, a timer server could provide a blocking sleep
function that, when called by a client, blocks the client for a certain amount of
time. However, if the client wanted to respond to multiple conditions such as a
timeout, incoming user input, and network activity, it would need to spawn one
thread for each condition where each thread would block for a different condi-
tion. If one condition triggers, the respective thread would resume its execution
and respond to the condition. However, because all threads could potentially be
woken up independently from each other – as their execution depends only on
their respective condition – they need to synchronize access to shared state. Con-
sequently, components that need to respond to multiple conditions would not
only waste threads but also suffer from synchronization overhead.
At the server side, the approach of blocking RPC calls is equally bad in the pres-
ence of multiple clients. For example, a timer service with the above outlined
blocking interface would need to spawn one thread per client.

Signaling events to untrusted parties With merely synchronous RPC, a server cannot deliver sporadic events to its
clients. If the server wanted to inform one of its clients about such an event, it
would need to act as a client itself by performing an RPC call to its own client.
However, by performing an RPC call, the caller passes the control of execution
to the callee. In the case of a server that serves multiple clients, it would put the
availability of the server at the discretion of all its clients, which is unacceptable.
A similar situation is the interplay between a parent and a child where the parent
does not trust its child but still wishes to propagate sporadic events to the child.

The solution to those problems is the use of asynchronous notifications, also named
signals. Figure 25 shows the interplay between two components. The component la-
beled as signal handler responds to potentially many external conditions propagated
as signals. The component labeled as signal producer triggers a condition. Note that
both can be arbitrary components.

[Figure 25: Interplay between signal producer and signal handler]

Signal-context creation and delegation The upper part of Figure 25 depicts the
steps needed by a signal handler to become able to receive asynchronous notifications.

1. Each Genode component is equipped with at least one initial entrypoint that re-
sponds to incoming RPC requests as well as asynchronous notifications. Similar
to how it can handle requests for an arbitrary number of RPC objects, it can re-
ceive signals from many different sources. Within the signal-handler component,
each source is represented as a so-called signal context. A component that needs
to respond to multiple conditions creates one signal context for each condition. In
the figure, a signal context “c” is created.

2. The signal-handler component associates the signal context with its entrypoint
via the manage method. Analogous to the way how RPC objects are associated
with entrypoints, the manage method returns a capability for the signal context.
Under the hood, the entrypoint uses core’s PD service to create this kind of capa-
bility.

3. As for regular capabilities, a signal-context capability can be delegated to other
components. Thereby, the authority to trigger signals for the associated context is
delegated.

Triggering signals The lower part of Figure 25 illustrates the use of a signal-context
capability by the signal producer.

1. Now in possession of the signal-context capability, the signal producer creates a
so-called signal transmitter for the capability. The signal transmitter can be used to
trigger a signal by calling the submit method. This method returns immediately.
In contrast to a remote procedure call, the submission of a signal is a fire-and-
forget operation.

2. At the time when the signal producer submitted the first signal, the signal handler
is not yet ready to handle it. It is still busy with other things. Once the signal
handler becomes ready to receive a new signal, the pending signal is delivered,
which triggers the execution of the corresponding signal-handler method. Note
that signals are not buffered. If signals are triggered at a high rate, multiple signals
may result in only a single execution of the signal handler. For this reason, the
handler cannot infer the number of events from the number of signal-handler
invocations. In situations where such information is needed, the signal handler
must retrieve it via another mechanism such as an RPC call to query the most
current status of the server that produced the signals.

3. After handling the first batch of signals, the signal handler component blocks and
becomes ready for another signal or RPC request. This time, no signals are im-
mediately pending. After a while, however, the signal producer submits another
signal, which eventually triggers another execution of the signal handler.
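
The following sketch condenses these steps into code. It shows the handler side using Genode's Signal_handler utility, whose constructor performs the manage step under the hood; the producer side, given a delegated signal-context capability, reduces to a single fire-and-forget call. The names follow the base API, but the example is a sketch rather than an authoritative reference.

  #include <base/component.h>
  #include <base/signal.h>
  #include <base/log.h>

  struct Main
  {
      Genode::Env &_env;

      /* signal context plus its registration at the entrypoint */
      Genode::Signal_handler<Main> _handler {
          _env.ep(), *this, &Main::_handle };

      void _handle()
      {
          /* called when one or more signals were submitted; the number
             of submissions is deliberately not preserved */
          Genode::log("condition triggered");
      }

      Main(Genode::Env &env) : _env(env) { }
  };

  void Component::construct(Genode::Env &env) { static Main main(env); }

  /* producer side, given a delegated signal-context capability 'cap':

       Genode::Signal_transmitter(cap).submit();
  */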

In contrast to remote procedure calls, signals carry no payload. If signals carried any
payload, this payload would need to be buffered somewhere. Regardless of where this
information is buffered, the buffer could overrun if signals are submitted at a higher
rate than handled. There are two conceivable approaches to deal with this situation. The
first option would be to drop the payload once the buffer overruns, making the
mechanism indeterministic, which is hardly desirable. The second option would be to
sacrifice the fire-and-forget semantics at the producer side, blocking the producer when
the buffer is full. However, this approach would put the liveliness of the producer at
the whim of the signal handler. Consequently, signals are void of any payload.

[Figure 26: Establishing shared memory between client and server. The server interacts with core's PD service. Both client and server interact with the region maps of their respective PD sessions at core.]

3.6.3. Shared memory

By sharing memory between components, large amounts of information can be propagated
across protection-domain boundaries without the active involvement of the kernel.
Sharing memory between components raises a number of questions. First, Section 3.3
explained that physical memory resources must be explicitly assigned to components
either by their respective parents or by the means of resource trading. This raises the
question of which component is bound to pay for the memory shared between multiple
components. Second, unlike traditional operating systems where different programs
can refer to globally visible files and thereby establish shared memory by mapping a
prior-agreed file into their respective virtual memory spaces, Genode does not have a
global name space. How do components refer to the to-be-shared piece of memory?
Figure 26 answers these questions by showing the sequence of shared-memory establish-
ment between a server and its client. The diagram depicts a client, core, and a server.
The notion of a client-server relationship is intrinsic for the shared-memory mecha-
nism. When establishing shared memory between components, the component’s roles
as client and server must be clearly defined.


1. The server interacts with core’s PD service to allocate a new RAM dataspace. Be-
cause the server uses its own PD session for that allocation, the dataspace is paid
for by the server. At first glance, this seems contradictory to the principle that
clients should have to pay for using services as discussed in Section 3.3.2. How-
ever, this is not the case. By establishing the client-server relationship, the client
has transferred a budget of RAM to the server via the session-quota mechanism.
So the client already paid for the memory. Still, it is the server’s responsibility to
limit the size of the allocation to the client’s session quota.
Because the server allocates the dataspace, it is the owner of the dataspace. Hence,
the lifetime of the dataspace is controlled by the server.
Core’s PD service returns a dataspace capability as the result of the allocation.

2. The server makes the content of the dataspace visible in its virtual address space
by attaching the dataspace within the region map of its PD session. The server
refers to the dataspace via the dataspace capability as returned from the prior
allocation. When attaching the dataspace to the server’s region map, core’s PD
service maps the dataspace content at a suitable virtual-address range that is not
occupied with existing mappings and returns the base address of the occupied
range to the server. Using this base address and the known dataspace size, the
server can safely access the dataspace content by reading from or writing to its
virtual memory.

3. The server delegates the authority to use the dataspace to the client. This dele-
gation can happen in different ways, e. g., the client could request the dataspace
capability via an RPC function at the server. But the delegation could also involve
further components that transitively delegate the dataspace capability. Therefore,
the delegation operation is depicted as a dashed line.

4. Once the client has obtained the dataspace capability, it can use the region map
of its own PD session to make the dataspace content visible in its address space.
Note that even though both client and server use core’s PD service, each compo-
nent uses a different session. Analogous to the server, the client receives a client-
local address within its virtual address space as the result of the attach operation.

5. After the client has attached the dataspace within its region map, both client and
server can access the shared memory using their respective virtual addresses.

In contrast to the server, the client is not in control over the lifetime of the dataspace.
In principle, the server, as the owner of the dataspace, could free the dataspace at its
PD session at any time and thereby revoke the corresponding memory mappings in
all components that attached the dataspace. The client has to trust the server with
respect to its liveliness, which is consistent with the discussion in Section 3.2.4. A well-
behaving server should tie the lifetime of a shared-memory dataspace to the lifetime of

the client session. When the server frees the dataspace at its PD session, core implic-
itly detaches the dataspace from all region maps. Thereby the dataspace will become
inaccessible to the client.
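
A minimal sketch of the server-side part of this procedure is shown below, assuming the classic region-map API where the attach operation returns a local address; exact signatures vary between framework versions.

  #include <base/component.h>

  void Component::construct(Genode::Env &env)
  {
      /* step 1: allocate a RAM dataspace from the server's own quota,
         which the client replenished via the session quota beforehand */
      Genode::Ram_dataspace_capability ds = env.ram().alloc(4096);

      /* step 2: attach the dataspace to the region map of the server's
         PD session to make its content locally visible */
      char *buf = env.rm().attach(ds);

      buf[0] = '!';     /* step 5: access the shared memory */

      /* step 3 would delegate 'ds' to the client, e. g., as the return
         value of an RPC function; the client then attaches it to the
         region map of its own PD session (step 4) */

      env.rm().detach(buf);
      env.ram().free(ds);   /* revokes the mappings in all components */
  }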

3.6.4. Asynchronous state propagation

In many cases, the mere information that a signal occurred is insufficient to handle
the signal in a meaningful manner. For example, a component that registers a timeout
handler at a timer server will eventually receive a timeout. But in order to handle the
timeout properly, it needs to know the actual time. The time could not be delivered
along with the timeout because signals cannot carry any payload. But the timeout
handler may issue a subsequent RPC call to the timer server for requesting the time.
Another example of this combination of asynchronous notifications and remote pro-
cedure calls is the resource-balancing protocol described in Section 3.3.4.

3.6.5. Synchronous bulk transfer

The synchronous RPC mechanism described in Section 3.6.1 enables components to ex-
change information via a kernel operation. In contrast to shared memory, the kernel
plays an active role by copying information (and delegating capabilities) between the
communication partners. Most kernels impose a restriction onto the maximum mes-
sage size. To comply with all kernels supported by Genode, RPC messages must not
exceed a size of 1 KiB. In principle, larger payloads could be transferred as a sequence
of RPCs. But since each RPC implies the costs of two context switches, this approach
is not suitable for transferring large amounts of bulk data. By combining synchronous
RPC with shared memory, however, these costs can be mitigated.
Figure 27 shows the procedure of transferring large bulk data using shared memory
as a communication buffer while using synchronous RPCs for arbitrating the use of
the buffer. The upper half of the figure depicts the setup phase that needs to be per-
formed only once. The lower half exemplifies an operation where the client transfers a
large amount of data to the server, which processes the data before transferring a large
amount of data back to the client.

1. At session-creation time, the server allocates the dataspace, which represents the
designated communication buffer. The steps resemble those described in Section
3.6.3. The server uses session quota provided by the client for the allocation. This
way, the client is able to aid the dimensioning of the dataspace by supplying an
appropriate amount of session quota to the server. Since the server performed the
allocation, the server is in control of the lifetime of the dataspace.

2. After the client established a session to the server, it initially queries the dataspace
capability from the server using a synchronous RPC and attaches the dataspace

to its own address space. After this step, both client and server can read and write
the shared communication buffer.

[Figure 27: Transferring bulk data by combining synchronous RPC with shared memory]

3. Initially the client plays the role of the user of the dataspace. The client writes the
bulk data into the dataspace. Naturally, the maximum amount of data is limited
by the dataspace size.

4. The client performs an RPC call to the server. Thereby, it hands over the role of the
dataspace user to the server. Note that this handover is not enforced. The client’s
PD retains the right to access the dataspace, i. e., by another thread running in the
same PD.

5. On reception of the RPC, the server becomes active. It reads and processes the
bulk data, and writes its results to the dataspace. The server must not assume
to be the exclusive user of the dataspace. A misbehaving client may change the
buffer content at any time. Therefore, the server must take appropriate precau-
tions. In particular, if the data must be validated at the server side, the server must
copy the data from the shared dataspace to a private buffer before validating and
using it.

6. Once the server has finished processing the data and written the results to the
dataspace, it replies to the RPC. Thereby, it hands back the role of the user of the
dataspace to the client.

7. The client resumes its execution with the return of the RPC call, and can read the
result of the server-side operation from the dataspace.

[Figure 28: Life cycle of a data packet transmitted over the packet-stream interface]

The RPC call may be used for carrying control information. For example, the client may
provide the amount of data to process, or the server may provide the amount of data
produced.
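
A session interface following this pattern could be declared as sketched below, using Genode's RPC-declaration macros. The name Bulk_session and the semantics of the hypothetical process function are made up for illustration.

  #include <session/session.h>
  #include <dataspace/capability.h>
  #include <base/rpc.h>

  struct Bulk_session : Genode::Session
  {
      static const char *service_name() { return "Bulk"; }

      /* queried once by the client at setup time (step 2) */
      virtual Genode::Dataspace_capability dataspace() = 0;

      /* RPC that hands the role of the buffer user over to the server;
         control information travels as plain RPC arguments and results */
      virtual Genode::size_t process(Genode::size_t num_bytes) = 0;

      GENODE_RPC(Rpc_dataspace, Genode::Dataspace_capability, dataspace);
      GENODE_RPC(Rpc_process, Genode::size_t, process, Genode::size_t);
      GENODE_RPC_INTERFACE(Rpc_dataspace, Rpc_process);
  };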

3.6.6. Asynchronous bulk transfer - packet streams

The packet-stream interface complements the facilities for the synchronous data trans-
fer described in Sections 3.6.1 and 3.6.5 with a mechanism that carries payload over a
shared memory block and employs an asynchronous data-flow protocol. It is designed
for large bulk payloads such as network traffic, block-device data, video frames, and
USB URB payloads.
As illustrated in Figure 28, the communication buffer consists of three parts: a submit
queue, an acknowledgement queue, and a bulk buffer. The submit queue contains
packets generated by the source to be processed by the sink. The acknowledgement
queue contains packets that are processed and acknowledged by the sink. The bulk
buffer contains the actual payload. The assignment of packets to bulk-buffer regions is
performed by the source.
A packet is represented by a packet descriptor that refers to a portion of the bulk
buffer and contains additional control information. Such control information may in-
clude an opcode and further arguments interpreted at the sink to perform an operation
on the supplied packet data. Either the source or the sink is in charge of handling a
given packet at a given time. At the points 1, 2, and 5, the packet is owned by the
source. At the points 3 and 4, the packet is owned by the sink. Putting a packet descrip-
tor in the submit or acknowledgement queue represents a handover of responsibility.
The life cycle of a single packet looks as follows:

1. The source allocates a region of the bulk buffer for storing the packet payload
(packet alloc). It then requests the local pointer to the payload (packet content) and
fills the packet with data.

2. The source submits the packet to the submit queue (submit packet).

3. The sink requests a packet from the submit queue (get packet), determines the local
pointer to the payload (packet content), and processes the contained data.

4. After having finished the processing of the packet, the sink acknowledges the
packet (acknowledge packet), placing the packet into the acknowledgement queue.

5. The source reads the packet from the acknowledgement queue and releases the
packet (release packet). Thereby, the region of the bulk buffer that was used by the
packet becomes marked as free.
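
Expressed in code, the source side of this life cycle may look like the sketch below. The packet-stream classes are templates parameterized with a policy, so the SOURCE type is kept generic here; blocking on saturated queues and the signals discussed next are omitted.

  #include <base/stdint.h>
  #include <util/string.h>  /* Genode::memcpy */

  template <typename SOURCE>
  void produce(SOURCE &source, char const *data, Genode::size_t len)
  {
      typedef typename SOURCE::Packet_descriptor Packet;

      /* 1. allocate a bulk-buffer region and fill it with payload */
      Packet packet = source.alloc_packet(len);
      Genode::memcpy(source.packet_content(packet), data, len);

      /* 2. hand the packet over to the sink */
      source.submit_packet(packet);

      /* ...the sink performs steps 3 and 4... */

      /* 5. pick up the acknowledged packet and free its region */
      Packet acked = source.get_acked_packet();
      source.release_packet(acked);
  }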

This protocol has four corner cases that are handled by signals:

Saturated submit queue Under this condition, the source is not able to submit an-
other packet and may decide to block. Once the sink observes such a condition
being cleared - that is when removing a packet from a formerly saturated submit
queue - it delivers a ready-to-submit signal to wake up the source.

Submit queue is empty Whenever the source places a packet into an empty submit
queue, it assumes that the sink may have blocked for the arrival of new packets
and delivers a packet-avail signal to wake up the sink.

Saturated acknowledgement queue Unless the acknowledgement queue has enough
capacity for another acknowledgement, the sink is unable to make progress and
may therefore block. Once the source consumes an acknowledgement from a
formerly saturated acknowledgement queue, it notifies the sink about the cleared
condition by delivering a ready-to-ack signal.

Acknowledgement queue is empty In this case, the source may block until the sink
places another acknowledged packet into the formerly empty acknowledgement
queue and delivers an ack-avail signal.

If bidirectional data exchange between a client and a server is desired, there are two
approaches:


One stream of operations If data transfers in either direction are triggered by the
client only, a single packet stream where the client acts as the source and the server
represents the sink can accommodate transfers in both directions. For example,
the block session interface (Section 4.5.9) represents read and write requests as
packet descriptors. The allocation of the operation’s read or write buffer within
the bulk buffer is performed by the client, being the source of the stream of oper-
ations. For write operations, the client populates the write buffer with the to-be-
written information before submitting the packet. When the server processes the
incoming packets, it distinguishes the read and write operations using the control
information given in the packet descriptor. For a write operation, it processes the
information contained in the packet. For a read operation, it populates the packet
with new information before acknowledging the packet.

Two streams of data If data transfers in both directions can be triggered indepen-
dently from client and server, two packet streams can be used. For example, the
NIC session interface (Section 4.5.11) uses one packet stream for incoming and one
packet stream for outgoing network traffic. For outgoing traffic, the client plays
the role of the source. For incoming traffic, the server (such as a NIC driver) is the
source.

4. Components

The architecture introduced in Chapter 3 clears the way to compose sophisticated sys-
tems out of many building blocks. Each building block is represented by an individual
component that resides in a dedicated protection domain and interacts with other com-
ponents in a well-defined manner. Those components do not merely represent applica-
tions but all typical operating-system functionalities.
Components can come in a large variety of shapes and forms. Compared to a mono-
lithic operating-system kernel, a component-based operating system challenges the sys-
tem designer by enlarging the design space with the decision of the functional scope of
each component and thereby the granularity of componentization. This decision de-
pends on several factors:

Security The smaller a component, the lower the risk for bugs and vulnerabilities.
The more rigid a component’s interfaces, the smaller its attack surface becomes.
Hence, the security of a complex system function can potentially be vastly im-
proved by splitting it into a low-complexity component that encapsulates the
security-critical part and a high-complexity component that is uncritical for se-
curity.

Performance The split of functionality into multiple components introduces inter-
component communication and thereby context-switch overhead. If a functional-
ity is known to be performance critical, such a split should clearly be motivated
by a benefit for security.

Reusability Componentization can be pursued to improve reusability while some-
times disregarding performance considerations at the same time. However,
reusability can also be achieved by moving functionality into libraries that can
easily be reused by linking them directly against library-using components. By
using a dynamic linker, linking can even happen at run time, which yields the
same flexibility as the use of multiple distinct components. Therefore, the split of
functionality into multiple components for the sole sake of modularization has to
be questioned.

Sections 4.1, 4.2, 4.3, and 4.4 aid the navigation within the componentization design
space by discussing the different roles a component can play within a Genode system.
This can be the role of a device driver, protocol stack, resource multiplexer, runtime
environment, and that of an application. By distinguishing those roles, it becomes pos-
sible to assess the possible security implications of each individual component.
The versatility of a component-based system does not come from the existence of
many components alone. Even more important is the composability of components.
Components can be combined only if their interfaces match. To maximize composabil-
ity, the number of interfaces throughout the system should be as low as possible, and

all interfaces should be largely orthogonal to each other. Section 4.5 reviews Genode’s
common session interfaces.
Components can be used in different ways depending on their configuration and
their position within the component tree. Section 4.6 explains how a component obtains
and processes its configuration. Section 4.7 discusses the most prominent options of
composing components.

[Figure 29: A block-device driver provides a block service to a single client and uses core's IO-MEM and IRQ services to interact with the physical block-device controller.]

4.1. Device drivers

A device driver translates a device interface to a Genode session interface. Figure 29
illustrates the typical role of a device driver.
The device interface is defined by the device vendor and typically comprises the
driving of state machines of the device, the notification of device-related events via
interrupts, and the means to transfer data from and to the device. In principle, a
device-driver component may access device hardware via sessions to the low-level
core services IO_MEM, IO_PORT, and IRQ as described in Section 3.4.7. However,
most practical scenarios benefit from the sandboxed operation of device drivers and
the fine-grained segregation of device hardware between drivers, which is enabled by
the platform driver covered in Section 4.1.1.
In general, a physical device cannot safely be driven by multiple users at the same
time. If multiple users accessed one device concurrently, the device state would eventu-
ally become inconsistent. A device driver should not attempt to multiplex a hardware
device. Instead, to keep its complexity low, it usually acts as a server that serves only
a single client per physical device. Whereas a device driver for a simple device accepts
only one client, a device driver for a complex device with multiple sub devices (such
as a USB driver) may hand out each sub device to a different client. Whenever reason-
ably possible, a driver should best be implemented as a mere client, not as a server.
For example, by asserting the role of a capture client at a GUI server, a display driver
can be regarded as a disposable component that can be restarted or swapped out with-
out affecting the GUI stack. The driver depends on the GUI server, but the GUI server
does not depend on the driver, which helps to make the system resilient against driver
failures.


A device driver should be largely void of built-in policy. If it merely translates the
interface of a single device to a session interface, there is not much room for policy
anyway. If, however, a device driver hands out multiple sub devices to different clients,
the assignment of sub devices to clients must be subjected to a policy. In this case, the
device driver should obtain policy information from its configuration as provided by
the driver’s parent.

4.1.1. Platform driver

There are three problems that are fundamentally important for running an operating
system on modern hardware but that lie outside the scope of an ordinary device driver
because they affect the platform as a whole rather than a single device. Those problems
are the enumeration of devices, the discovery of interrupt routing, and the initial setup
of the platform.

Problem 1: Device enumeration Modern hardware platforms are rather complex
and vary a lot. For example, the devices attached to the PCI bus of a PC are usually
not known at the build time of the system but need to be discovered at run time. Tech-
nically, each individual device driver could probe its respective device at the PCI bus.
But in the presence of multiple drivers, this approach would hardly work. First, the
configuration interface of the PCI bus is a device itself. The concurrent access to the
PCI configuration interface by multiple drivers would ultimately yield undefined be-
haviour. Second, for being able to interact directly with the PCI configuration interface,
each driver would need to carry with it the functionality to interact with PCI.

Problem 2: Interrupt routing On PC platforms with multiple processors, the use of
legacy interrupts as provided by the Intel 8259 programmable interrupt controller (PIC)
is not suitable because there is no way to express the assignment of interrupts to CPUs.
To overcome the limitations of the PIC, Intel introduced the Advanced Programmable
Interrupt Controller (APIC). The APIC, however, comes with a different name space
for interrupt numbers, which creates an inconsistency between the numbers provided
by the PCI configuration (interrupt lines) and interrupt numbers as understood by the
APIC. The assignment of legacy interrupts to APIC interrupts is provided by the Ad-
vanced Configuration and Power Interface (ACPI) tables. Consequently, in order to
support multi-processor PC platforms, the operating system needs to interpret those
tables. Within a component-based system, we need to answer the question of which
component is responsible to interpret the ACPI tables and how this information is ap-
plied to individual device drivers.

Problem 3: Initial hardware setup In embedded systems, the interaction of the SoC
(system on chip) with its surrounding peripheral hardware is often not fixed in hard-
ware but rather a configuration issue. For example, the power supply and clocks of

certain peripherals may be enabled by speaking an I2C protocol with a separate power-
management chip. Also, the direction and polarity of the general-purpose I/O pins
depends largely on the way how the SoC is used. Naturally, such hardware setup steps
could be performed by the kernel. But this would require the kernel to become aware
of potentially complex platform intrinsics.

Central platform driver The natural solution to these problems is the introduction
of a so-called platform driver, which encapsulates the peculiarities outlined above. On
PC platforms, the role of the platform driver is executed by the ACPI driver. The ACPI
driver provides an interface to the PCI bus in the form of a PCI service. Device drivers
obtain the information about PCI devices by creating a PCI session at the ACPI driver.
Furthermore, the ACPI driver provides an IRQ service that transparently applies the
interrupt routing based on the information provided by the ACPI tables. In addition,
the ACPI driver provides the means to allocate DMA buffers, which is further explained
in Section 4.1.3.
On ARM platforms, the corresponding component is named platform driver and
provides a so-called platform service. Because of the large variety of ARM-based SoCs,
the session interface for this service differs from platform to platform.

4.1.2. Interrupt handling

Most device drivers need to respond to sporadic events produced by the device and
propagated to the CPU as interrupts. In Genode, a device-driver component obtains
device interrupts via core’s IRQ service introduced in Section 3.4.7. On PC platforms,
device drivers usually do not use core’s IRQ service directly but rather use the IRQ
service provided by the platform driver (Section 4.1.1).

4.1.3. Direct memory access (DMA) transactions

Devices that need to transfer large amounts of data usually support a means to issue
data transfers from and to the system’s physical memory without the active participa-
tion of the CPU. Such transfers are called direct memory access (DMA) transactions. DMA
transactions relieve the CPU from actively copying data between device registers and
memory, optimize the throughput of the system bus by the effective use of burst trans-
fers, and may even be used to establish direct data paths between devices. However,
the benefits of DMA come at the risk of corrupting the physical memory by misguided
DMA transactions. Because those DMA-capable devices can issue bus requests that
target the physical memory directly while not involving the CPU altogether, such re-
quests are naturally not subjected to the virtual-memory mechanism implemented in
the CPU in the form of a memory-management unit (MMU). Figure 30 illustrates the
problem. From the device’s point of view, there is just physical memory. Hence, if a
driver sets up a DMA transaction, e. g., if a disk driver wants to read a block from the

disk, it programs the memory-mapped registers of the device with the address and size
of a physical-memory buffer where it expects to receive the data. If the driver lives
in a user-level component, as is the case for a Genode-based system, it still needs to
know the physical address of the DMA buffer to program the device correctly. Unfor-
tunately, there is nothing to prevent the driver from specifying any physical address
to the device. A malicious driver could misuse the device to read and manipulate all
parts of the physical memory, including the kernel. Consequently, device drivers and
devices should ideally be trustworthy. However, there are several scenarios where this
is ultimately not the case.

[Figure 30: The MMU restricts the access of physical memory pages by different components according to their virtual address spaces. However, direct memory accesses issued by the disk controller are not subjected to the MMU. The disk controller can access the entirety of the physical memory present in the system.]

Scenario 1: Direct device assignment to virtual machines When hosting virtual
machines as Genode components, the direct assignment of a physical device such as
a USB controller, a GPU, or a dedicated network card to the guest OS running in the
virtual machine can be useful in two ways. First, if the guest OS is the sole user of the
device, direct assignment of the device maximizes the I/O performance of the guest
OS using the device. Second, the guest OS may be equipped with a proprietary device
driver that is not present as a Genode component otherwise. In this case, the guest OS
may be used as a runtime that executes the device driver, and thus, provides a driver
interface to the Genode world. In both cases the guest OS should not be considered
as trustworthy. On the contrary, it bears the risk of subverting the isolation between
components. A misbehaving guest OS could issue DMA requests referring to the phys-
ical memory used by other components or even the kernel, and thereby break out of its
virtual machine.


Scenario 2: Firmware-driven attacks Modern peripherals such as wireless LAN
adaptors, network cards, or GPUs employ firmware executed on the peripheral device.
This firmware is executed on a microcontroller on the device, and is thereby not sub-
jected to the policy of the normal operating system. Such firmware may either be built-
in by the device vendor, or is loaded by the device driver at initialization time of the
device. In both cases, the firmware tends to be a black box that remains opaque to
everyone but the device vendor. Hidden functionality or vulnerabilities might be
present in it. By the means of DMA transactions, such firmware has unlimited access to
the system. For example, a back door implemented in the firmware of a network adap-
tor could look for special network packets to activate and control arbitrary spyware.
Because malware embedded in the firmware of the device can neither be detected nor
controlled by the operating system, both monolithic and microkernel-based operating
systems are powerless against such attacks.

Scenario 3: Bus-level attacks The previous examples misuse a DMA-capable de-
vice as a proxy to drive an attack. However, the system bus can be attacked directly
with no hardware tinkering at all. There are ready-to-exploit interfaces that are featured
on most PC systems. For example, most laptops come with PCMCIA / Express-Card
slots, which allow expansion cards to access the system bus. Furthermore, serial bus
interfaces, e. g., IEEE 1394 (Firewire), enable connected devices to indirectly access the
system bus via the peripheral bus controller. If the bus controller allows the device to
issue direct system bus requests by default, a connected device becomes able to gain
control over the whole system.

DMA transactions in component-based systems Direct memory access (DMA) of
devices looks like the Achilles heel of component-based operating systems. The most
compelling argument in favor of componentization is that by encapsulating each sys-
tem component within a dedicated user-level address space, the system as a whole
becomes more robust and secure compared to a monolithic operating-system kernel.
In the event that one component fails due to a bug or an attack, other components re-
main unaffected. The prime example for such buggy components are, however, device
drivers. By empirical evidence, those remain the most prominent trouble makers in to-
day’s operating systems, which suggests that the DMA loophole renders the approach
of component-based systems largely ineffective. However, there are three counter ar-
guments to this observation.
First, by encapsulating each driver in a dedicated address space, classes of bugs
that are unrelated to DMA remain confined in the driver component. In practice
most driver-related problems stem from issues like memory leaks, synchronization
problems, deadlocks, flawed driver logic, wrong state machines, or incorrect device-
initialization sequences. For those classes of problems, the benefits of isolating the
driver in a dedicated component still apply.

[Figure 31: An IOMMU arbitrates and virtualizes DMA accesses issued by a device to the RAM. Only if a valid IOMMU mapping exists for a given DMA access, the memory access is performed.]

Second, executing a driver largely isolated from other operating-system code min-
imizes the attack surface onto the driver. If the driver interface is rigidly small and
well-defined, it is hard to compromise the driver by exploiting its interface.
Third, modern PC hardware has closed the DMA loophole by incorporating so-called
IOMMUs into the system. As depicted in Figure 31, the IOMMU sits between the physi-
cal memory and the system bus where the devices are attached to. So each DMA request
has to go through the IOMMU, which is not only able to arbitrate the access of DMA
requests to the RAM but is also able to virtualize the address space per device. Similar
to how an MMU confines each process running on the CPU within a distinct virtual
address space, the IOMMU is able to confine each device within a dedicated virtual
address space. To tell the different devices apart, the IOMMU uses the PCI device’s
bus-device-function triplet as unique identification.
With an IOMMU in place, the operating system can effectively limit the scope of
actions the given device can execute on the system. I.e., by restricting all accesses orig-
inating from a particular PCI device to the DMA buffers used for the communication,
the operating system becomes able to detect and prevent any unintended bus accesses
initiated by the device.
When executed on the NOVA kernel, Genode subjects all DMA transactions to the
IOMMU, if present. Section 7.8.7 discusses the use of IOMMUs in more depth.

[Figure 32: Example of a protocol stack. The terminal provides the translation between the terminal-session interface (on the right) and the GUI-session interface (on the left).]

4.2. Protocol stacks

A protocol stack translates one session interface to another (or the same) session inter-
face. For example, a terminal component may provide a command-line application
with a service for obtaining textual user input and for printing text. To implement this
service, the terminal uses a GUI session. Figure 32 depicts the relationship between the
GUI server, the terminal, and its client application. For realizing the output of a stream
of characters on screen, it implements a parser for escape sequences, maintains a state
machine for the virtual terminal, and renders the pixel representation of characters onto
the virtual framebuffer of the GUI session. For the textual user input, it responds to key
presses reported by the GUI session’s input stream by converting input events into a
character stream as understood by terminal applications. When viewed from the out-
side of the component, the terminal translates a terminal session to a GUI session.
Similar to a device driver, a protocol stack typically serves a single client. In contrast
to device drivers, however, protocol stacks are not bound to physical devices. There-
fore, a protocol stack can be instantiated any number of times. For example, if multiple
terminals are needed, one terminal component could be instantiated per terminal. Be-
cause each terminal uses an independent instance of the protocol stack, a bug in the
protocol stack of one terminal does not affect any other terminal. However complex
the implementation of the protocol stack may be, it is not prone to leaking information
to another terminal because it is connected to a single client only. The leakage of in-
formation is constrained to interfaces used by the individual instance. Hence, in cases
like this, the protocol-stack component is suitable for hosting highly complex untrusted
code if such code cannot be avoided.
Note that the example above cannot be generalized for all protocol stacks. There are
protocol stacks that are critical for the confidentiality of information. For example, an
in-band encryption component may translate plain-text network traffic to encrypted
network traffic designated to be transported over a public network. Even though the
component is a protocol stack, it may still be prone to leaking unencrypted information
to the public network.
Whereas protocol stacks are not necessarily critical for integrity and confidentiality,
they are almost universally critical for availability.

[Figure 33: A GUI server multiplexes the physical framebuffer and input devices among multiple applications.]

4.3. Resource multiplexers

A resource multiplexer transforms one resource into a number of virtual resources. A
resource is typically a physical device. For example, a NIC-router component may
use a NIC driver as uplink and, in turn, provides a NIC service where each session
represents a virtual NIC. Another example is a GUI server as depicted in Figure 33,
which enables multiple applications to share the same physical framebuffer and input
devices by presenting each client in a window or a virtual console.
In contrast to a typical device driver or protocol stack that serves only a single client, a
resource multiplexer is shared by potentially many clients. In the presence of untrusted
clients besides security-critical clients, a resource multiplexer ultimately becomes a so-
called multi-level component. This term denotes that the component is cross-cutting the
security levels of all its clients. This has the following ramifications.

Covert channels Because the component is a shared resource that is accessed by
clients of different security levels, it must maintain the strict isolation between
its clients unless explicitly configured otherwise. Hence, the component’s client
interface as well as the internal structure must be designed to prevent the leakage
of information across clients. I.e., two clients must never share the same names-
pace of server-side objects if such a namespace can be modified by the clients.
For example, a window server that hands out global window IDs to its clients
is prone to unintended information leakage because one client could observe the
allocation of window IDs by another client. The ID allocation could be misused
as a covert channel that circumvents security policies. In the same line, a resource
multiplexer is prone to timing channels if the operations provided via its client

interface depend on the behavior of other clients. For this reason, blocking RPC
calls should be avoided because the duration of a blocking operation may reveal
information about the internal state such as the presence of other clients of the
resource multiplexer.

Complexity is dangerous As a resource multiplexer is shared by clients of different
security levels, the same considerations apply as for the OS kernel: high com-
plexity poses a major risk for bugs. Such bugs may, in turn, result in the unin-
tended flow of information between clients or degrade the quality of service for
all clients. Hence, in terms of complexity, resource multiplexers must be as simple
as possible.

Denial of service The exposure of a resource multiplexer to untrusted and even mali-
cious clients makes it a potential target for denial-of-service attacks. Some opera-
tions provided by the resource multiplexer may require the allocation of memory.
For example, a GUI server may need memory for the book keeping of each win-
dow created by its clients. If the resource multiplexer performed such allocations
from its own memory budget, a malicious client could trigger the exhaustion of
server-side memory by creating new windows in an infinite loop. To mitigate
this category of problems, a resource multiplexer should perform memory alloca-
tions exclusively from client-provided resources, i. e., using the session quota as
provided by each client at session-creation time. Section 3.3 describes Genode’s
resource-trading mechanism in detail. In particular, resource multiplexers should
employ heap partitioning as explained in Section 3.3.3.

Avoiding built-in policies A resource multiplexer can be understood as a microker-
nel for a higher-level resource. Whereas a microkernel multiplexes or arbitrates
the CPU and memory between multiple components, a resource multiplexer does
the same for sessions. Hence, the principles for constructing microkernels equally
apply for resource multiplexers. In the line of those principles, a resource multi-
plexer should ideally implement sole mechanisms but should be void of built-in
policy.

Enforcement of policy Instead of providing a built-in policy, a resource multiplexer
obtains policy information from its configuration as supplied by its parent. The
resource multiplexer must enforce the given policy. Otherwise, the security policy
expressed in the configuration remains ineffective.

[Figure 34: A runtime environment manages multiple child components.]

4.4. Runtime environments and applications

The component types discussed in the previous sections have in common that they de-
liberately lack built-in policy but act according to a policy supplied by their respective
parents by the means of configuration. This raises the question where those policies
should come from. The answer comes in the form of runtime environments and appli-
cations.
A runtime environment as depicted in Figure 34 is a component that hosts child com-
ponents. As explained in the Sections 3.2 and 3.3, it is thereby able to exercise control
over its children but is also responsible to manage the children’s resources. A runtime
environment controls its children in three ways:
Session routing It is up to the runtime environment to decide how to route session
requests originating from a child. The routing of sessions is discussed in Section
3.2.3.
Configuration Each child obtains its configuration from its parent in the form of a
ROM session as described in Section 4.6. Using this mechanism, the runtime en-
vironment is able to feed policy information to its children. Of course, in order
to make the policy effective, the respective child has to interpret and enforce the
configuration accordingly.
Lifetime The lifetime of a child ultimately depends on its parent. Hence, a runtime
environment can destroy and possibly restart child components at any time.
With regard to the management of child resources, a runtime environment can employ
a large variety of policies using two principal approaches:
Quota management Using the resource-trading mechanisms introduced in Section
3.3, the runtime environment can assign resources to each child individually.
Moreover, if a child supports the dynamic rebalancing protocol described in Sec-
tion 3.3.4, the runtime environment may even change those assignments over the
lifetime of its children.


Interposing services Because the runtime environment controls the session routing
of each child, it is principally able to interpose the child’s use of any service in-
cluding those normally provided by core such as PD (Section 3.4.4), and CPU
(Section 3.4.6). The runtime environment may provide a locally implemented
version of those session interfaces instead of routing session requests directly to-
wards the core component. Internally, each session of such a local service may
create a session to the real core service, thereby effectively wrapping core’s ses-
sions. This way, the runtime environment can not only observe the interaction
of its child with core services but also implement custom resource-management
strategies, for example, sharing one single budget among multiple children.

Canonical examples of runtime environments are the init component that applies a
policy according to its configuration, a debugger that interposes all core services for the
debugging target, or a virtual machine monitor.
A typical application is a leaf node in the component tree that merely uses services. In
practice, however, the boundary between applications and runtime environments can
be blurry. As illustrated in Section 4.7, Genode fosters the internal split of applications
into several components, thereby forming multi-component applications. From the out-
side, such a multi-component application appears as a leaf node of the component tree
but internally, it employs an additional level of componentization by executing portions
of its functionality in separate child components. The primary incentive behind this ap-
proach is the sandboxing of untrusted application functionality. For example, a video
player may execute the video codec within a separate child component so that a bug in
the complex video codec will not compromise the entire video-player application.


4.5. Common session interfaces

The core services described in Section 3.4 principally enable the creation of a recur-
sively structured system. However, their scope is limited to the few low-level resources
provided by core, namely processing time, memory, and low-level device resources.
Device drivers (Section 4.1) and protocol stacks (Section 4.2) transform those low-level
resources into higher-level resources. Analogously to how core’s low-level resources
are represented by the session interfaces of core’s services, higher-level resources are
represented by the session interfaces provided by device drivers and protocol stacks.
In principle, each device driver could introduce a custom session interface representing
the particular device. But as discussed in the introduction of Chapter 4, a low number
of orthogonal session interfaces is desirable to maximize the composability of compo-
nents. This section introduces the common session interfaces that are used throughout
Genode.

4.5.1. Read-only memory (ROM)

The ROM session interface makes a piece of data in the form of a dataspace available
to the client.

Session creation At session-creation time, the client specifies the name of a ROM
module as session argument. One server may hand out different ROM modules de-
pending on the name specified. Once a ROM session has been created, the client can
request the capability of the dataspace that contains the ROM module. Using this ca-
pability and the region map of the client’s PD session, the client can attach the ROM
module to its local address space and thereby access the information. The client is ex-
pected to merely read the data, hence the name of the interface.

ROM module updates In contrast to the intuitive assumption that read-only data is
immutable, ROM modules may mutate during the lifetime of the session. The server
may update the content of the ROM module with new versions. However, the server
does not do so without the consent of the client. The protocol between client and server
consists of the following steps.

1. The client registers a signal handler at the server to indicate that it is interested in
receiving updates of the ROM module.

2. If the server has a new version of the ROM module, it does not immediately
change the dataspace shared with the client. Instead, it maintains the new ver-
sion separately and informs the client by submitting a signal to the client’s signal
handler.


3. The client continues working with the original version of the dataspace. Once
it receives the signal from the server, it may decide to update the dataspace by
calling the update function at the server.

4. The server responds to the update request. If the new version fits into the exist-
ing dataspace, the server copies the content of the new version into the existing
dataspace and returns this condition with the reply of the update call. Thereby,
the ROM session interface employs synchronous bulk transfers as described in
Section 3.6.5.

5. The client evaluates the result of the update call. If the new version did fit into
the existing dataspace, the update is complete at this point. However, if the new
version is larger than the existing dataspace, the client requests a new dataspace
from the server.

6. Upon reception of the dataspace request, the server destroys the original datas-
pace (thereby making it invisible to the client), and returns the new version of the
ROM module as a freshly allocated dataspace.

7. The client attaches the new dataspace capability to its local address space to access
the new version.

The protocol is designed in such a way that neither the client nor the server needs to
support updates. A server with no support for updating ROM modules, such as core’s
ROM service, simply ignores the registration of a signal handler by a client. A client
that is unable to cope with ROM-module updates never requests the dataspace twice.
However, if both client and server support the update protocol, the ROM session
interface provides a means to propagate large state changes from the server to the client
in a transactional way. In the common case where the new version of a ROM module
fits into the same dataspace as the old version, the update does not require any memory
mappings to be changed.
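
At the API level, the client side of this protocol is encapsulated by the
Attached_rom_dataspace utility: its sigh method performs the handler registration of
step 1 whereas its update method condenses steps 3 to 7, re-requesting and re-attaching
the dataspace only if the new version has outgrown the existing dataspace. The
following sketch illustrates the pattern, with "data_model" as a made-up module name.

  #include <base/component.h>
  #include <base/attached_rom_dataspace.h>
  #include <base/log.h>

  struct Main
  {
      Genode::Env &env;

      Genode::Attached_rom_dataspace rom { env, "data_model" };

      /* invoked whenever the server signals a new version (step 2) */
      void handle_update()
      {
          /* steps 3 to 7, condensed into a single operation */
          rom.update();

          Genode::log("obtained ROM version of ", rom.size(), " bytes");
      }

      Genode::Signal_handler<Main> update_handler {
          env.ep(), *this, &Main::handle_update };

      Main(Genode::Env &env) : env(env)
      {
          /* step 1: register interest in ROM-module updates */
          rom.sigh(update_handler);
      }
  };

  void Component::construct(Genode::Env &env) { static Main main(env); }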

Use cases The ROM session interface is used wherever data shall be accessed in a
memory-mapped fashion.

• Boot time data comes in the form of the ROM sessions provided by core’s ROM
service. On some kernels, core exports kernel-specific information such as the
kernel version in the form of special ROM modules.

• If an executable binary is provided as a ROM module, the binary’s text segment
can be attached directly to the address space of a new process (Section 3.5). So
multiple instances of the same component effectively share the same text segment.
The same holds true for shared libraries. For this reason, executable binaries and
shared libraries are requested in the form of ROM sessions.


• Components obtain their configuration by requesting a ROM session for the “con-
fig” ROM module at their respective parent (Section 4.6). This way, configuration
information can be propagated using a simple interface with no need for a file
system. Furthermore, the update mechanism allows the parent to dynamically
change the configuration of a component during its lifetime.

• As described in Section 4.7.5, multi-component applications may obtain data
models in the form of ROM sessions. In such scenarios, the ROM session’s update
mechanism is used to propagate model updates in a transactional way.

4.5.2. Report

The report session interface allows a client to report its internal state to the outside
using synchronous bulk transfers (Section 3.6.5).

Session creation At session-creation time, the client specifies a label and a buffer
size. The label aids the routing of the session request but may also be used to select a
policy at the report server. The buffer size determines the size of the dataspace shared
between the report server and its client.

Use cases

• Components may use report sessions to export their internal state for monitoring
purposes or for propagating exceptional events.

• Device drivers may report information about detected devices or other resources.
For example, a bus driver may report a list of devices attached on the bus, or a
wireless driver may report the list of available networks.

• In multi-component applications, components that provide data models to other
components may use the report-session interface to propagate model updates.
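
One way to submit such a report is the Expanding_reporter utility (os/reporter.h),
which wraps the creation of the report session and the generation of the XML content.
The report type and values below are made-up examples.

  #include <base/component.h>
  #include <os/reporter.h>

  void Component::construct(Genode::Env &env)
  {
      /* report session for XML reports of the type <networks> */
      static Genode::Expanding_reporter reporter { env, "networks", "networks" };

      /* fire-and-forget submission of the current state */
      reporter.generate([&] (Genode::Xml_generator &xml) {
          xml.node("network", [&] {
              xml.attribute("ssid", "example");
              xml.attribute("quality", 90); }); });
  }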

4.5.3. Terminal and UART

The terminal session interface provides a bi-directional communication channel
between client and server using synchronous bulk transfers (Section 3.6.5). It is
primarily meant to be used for textual interfaces but may also be used to transfer
other serial streams of data.
The interface uses the two RPC functions read and write to arbitrate the access to a
shared-memory communication buffer between client and server as described in Sec-
tion 3.6.5. The read function never blocks. When called, it copies new input into the
communication buffer and returns the number of new characters. If there is no new in-
put, it returns 0. To avoid the need to poll for new input at the client side, the client can
register a signal handler that gets notified upon the arrival of new input. The write func-
tion takes the number of to-be-written characters as argument. The server responds to
this function by processing the specified amount of characters from the communication
buffer.
Besides the actual read and write operations, the terminal supports the querying of
the amount of newly available input (without reading it) and the terminal size in
rows and columns.
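
As an illustration, a sketch of a terminal client that merely echoes all received input
back to the server may look as follows.

  #include <base/component.h>
  #include <terminal_session/connection.h>

  struct Main
  {
      Genode::Env &env;

      Terminal::Connection terminal { env };

      char buf[1024];

      /* called whenever the server signals the availability of new input */
      void handle_read_avail()
      {
          Genode::size_t const n = terminal.read(buf, sizeof(buf));

          /* echo the received characters back to the terminal */
          terminal.write(buf, n);
      }

      Genode::Signal_handler<Main> read_avail_handler {
          env.ep(), *this, &Main::handle_read_avail };

      Main(Genode::Env &env) : env(env)
      {
          terminal.read_avail_sigh(read_avail_handler);
      }
  };

  void Component::construct(Genode::Env &env) { static Main main(env); }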

Session creation At session-creation time, the terminal session may not be ready to
use. For example, a TCP terminal session needs an established TCP connection first. In
such a situation, the use of the terminal session by a particular client must be deferred
until the session becomes ready. Delaying the session creation at the server side is not
an option because this would render the server’s entry point unavailable for all other
clients until the TCP connection is ready. Instead, the client blocks until the server
delivers a connected signal. This signal is emitted when the session becomes ready to
use. The client waits for this signal right after creating the session.

Use cases

• Device drivers that provide streams of characters in either direction.

• A graphical terminal.

• Transfer of streams of data over TCP (using the TCP terminal).

• Writing streams of data to a file (using a file terminal).

• User input and output of traditional command-line based software.

• Multiplexing of multiple textual user interfaces (using the terminal-mux component).

• Headless operation and management of subsystems (using the CLI monitor).

UART The UART session interface complements the terminal session interface with
additional control functions, e. g., for setting the baud rate. Because UART sessions
are compatible with terminal sessions, a UART device driver can be used as both UART
server and terminal server.

4.5.4. Event

The event session interface is used to communicate low-level user-input events from
the client to the server using synchronous bulk transfers (Section 3.6.5). Such an event
can be of one of the following types:


press or release of a button or key. Each physical button (such as a mouse button) or
key (such as a key on a keyboard) is represented by a unique value. At the event-
session level, key events are reported as raw hardware events. They are reported
without a keyboard layout applied and without any interpretation of meta keys
(like shift, alt, and control). This gives the consumer of events the flexibility to
handle arbitrary combinations of keys.
A press event may be annotated with an optional character representation of the
pressed key in the form of a Unicode codepoint. Such events are not generated by
low-level device drivers but by a higher-level service, such as the event-filter
component, that applies keyboard-layout rules to sequences of low-level events. Such
annotated press events can be readily consumed by components that operate on
textual input rather than low-level hardware events.

relative motion of pointer devices such as a mouse. Such events are generated by
device drivers.

absolute motion of pointer devices such as a touch screen or graphics tablet.
Furthermore, absolute motion events are generated for virtual input devices such as
a system-global pointer position maintained by the GUI server and reported to
hovered GUI applications.

wheel motion of scroll wheels in vertical and horizontal directions.

focus of the session. Focus events are artificially generated by GUI servers to indicate a
gained or lost keyboard focus of a GUI application. The application may respond
to such an event by changing its graphical representation accordingly.

leave of the pointer position. Similar to focus events, leave events are artificially gen-
erated by GUI servers to indicate a lost pointer focus.

Use cases

• A GUI server provides an event service as an interface for supplying user input
to the GUI.

• Drivers for user-input devices play the roles of event clients.

• Merging multiple streams of user input into one stream, using an event filter com-
ponent.

• Virtual input devices can be realized as event clients that inject artificial input
events to a GUI server or an event filter.


Figure 35: A GUI session aggregates a virtual framebuffer, an input stream, and a session-local
view stack.

4.5.5. Capture

The capture session interface enables a client to obtain pixel data from a server. For
example, a framebuffer driver plays the role of a capture client that obtains the pixel
data to be displayed on screen from a GUI server.
The pixel data is communicated via a dataspace shared between server and client.
The client (the driver) requests information about so-called dirty areas via periodic
RPC calls from the server. The period of those calls is controlled by the driver and
may ideally correspond to the physical screen refresh (vblank) rate, e. g., 60 times per
second. Based on the returned dirty-area information, the client flushes the pixels from
the shared buffer to the output device.

Use cases

• A framebuffer driver captures the pixels from a GUI server.

• A remote-desktop server application captures the screen of a GUI server.

4.5.6. GUI

The GUI session interface combines a virtual framebuffer and an input stream into a
session. Technically, both the framebuffer and the input stream are aggregated as two
distinct RPC interfaces as depicted in Figure 35. The input interface allows the client
to obtain user-input events whereas the framebuffer interface is used for pixel output.
Furthermore, the GUI session supplements the framebuffer with the notion of views,
which allows for the creation of flexible multi-window user interfaces.


Framebuffer interface The GUI client obtains access to the framebuffer as a datas-
pace, which is shared between client and server. The client may update the pixels
within the dataspace at any time. Once a part of the framebuffer has been updated,
the client informs the server by calling a refresh RPC function. Thereby, the framebuffer
session interface employs a synchronous bulk transfer mechanism (Section 3.6.5). To
enable GUI clients to synchronize their operations with the refresh rate of the display,
a client can register a handler for receiving display-synchronization events as asyn-
chronous notifications (Section 3.6.2).

View stack A view is a rectangular area on screen that displays a portion of the
client’s virtual framebuffer. The position, size, and viewport of each view are defined
by the client. Views can overlap, thereby creating a view stack. The stacking order of
the views of one client can be freely defined by the client.
The size of the virtual framebuffer can be freely defined by the client but the required
backing store must be provided in the form of session quota. Clients may request the
screen mode of the physical framebuffer and are able to register a signal handler for
mode changes of the physical framebuffer. This way, GUI clients are able to adapt
themselves to changing screen resolutions.

Use cases

• The nitpicker GUI server allows multiple GUI applications to share the same
framebuffer and input devices in a secure way.

• A window manager implementing the GUI session interface represents each view
as a window with window decorations and a placement policy. The resizing of a
window by the user is reflected to the client as a screen-mode change.

4.5.7. Platform

The platform session interface provides the client with access to the devices present on
the hardware platform and assigned to the client. One platform session may comprise
multiple devices. See Section 4.1.1 for more information about the role of the platform
driver.

4.5.8. Pin state and control

The pin-state and pin-control session interfaces are designated for interacting with
general-purpose I/O (GPIO) pins. Each session corresponds to an individual pin. A
pin-state client is able to monitor the state of an input pin whereas a pin-control client
can define the digital signal level of an output pin. Even though a client of a pin session
is able to interact with a pin, it has no authority over system-critical pin configurations
nor does it need to refer to any physical properties of the pin like the GPIO bank or pin
number.
The assignment of sessions to physical pins is in the hands of the pin driver and de-
fined by the pin-driver’s configuration. As each pin corresponds to a separate session,
per-pin access control is naturally attained by Genode’s regular session-routing and
server-side policy-selection paradigms. The pin-driver’s policy maps the session labels
of its clients to physical pins and guards the configuration of the physical pins regarding
the I/O direction, pull up/down, or special-function selection.
A pin driver usually provides an IRQ service in addition to the pin-state and pin-
control services. This IRQ service allows a client to receive notifications of a change of
the signal level of a GPIO pin.

4.5.9. Block

The block session interface allows a client to access a storage server at the block level.
The interface is based on a packet stream (Section 3.6.6). Each packet represents a block-
access command, which can be either read or write. Thanks to the use of the packet-
stream mechanism, the client can issue multiple commands at once and thereby hide
access latencies by submitting batches of block requests. The server acknowledges each
packet after completing the corresponding block-command operation.
The packet-stream interface for submitting commands is complemented by the info
RPC function for querying the properties of the block device, i. e., the supported oper-
ations, the block size, and the block count. Furthermore, a client can call the sync RPC
function to flush caches at the block server.

Session creation At session-creation time, the client can dimension the size of the
communication buffer as session argument. The server allocates the shared communi-
cation buffer from the session quota.

Use cases

• Block-device drivers implement the block-session interface.

• The part-block component requests a single block session, parses a partition table,
and hands out each partition as a separate block session to its clients. There can
be one client for each partition.

• File-system servers use block sessions as their back end.

4.5.10. Timer

The timer session interface provides a client with a session-local time source. A client
can use it to schedule timeouts that are delivered as signals to a previously registered
signal handler. Furthermore, the client can request the elapsed number of milliseconds
since the creation of the timer session.
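
The following sketch schedules a periodic timeout signal of one second using the
session API directly.

  #include <base/component.h>
  #include <base/log.h>
  #include <timer_session/connection.h>

  struct Main
  {
      Genode::Env &env;

      Timer::Connection timer { env };

      void handle_timeout()
      {
          Genode::log("elapsed: ", timer.elapsed_ms(), " ms");
      }

      Genode::Signal_handler<Main> timeout_handler {
          env.ep(), *this, &Main::handle_timeout };

      Main(Genode::Env &env) : env(env)
      {
          timer.sigh(timeout_handler);

          /* request periodic timeout signals, the argument is in microseconds */
          timer.trigger_periodic(1000*1000);
      }
  };

  void Component::construct(Genode::Env &env) { static Main main(env); }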

4.5.11. NIC

A NIC session represents a network interface that operates at network-packet level.
Each session employs two independent packet streams (Section 3.6.6), one for receiving
network packets and one for transmitting network packets. Furthermore, the client can
query the MAC address of the network interface.

Session creation At session-creation time, the communication buffers of both packet
streams are dimensioned via session arguments. The communication buffers are allo-
cated by the server using the session quota provided by the client.

Use cases

• A NIC router provides multiple virtual NIC interfaces to its clients by managing
a custom name space of virtual MAC addresses.

• A TCP/IP stack uses a NIC session as back end.

4.5.12. Uplink

An uplink session is similar to a NIC session with the difference that the roles of the end
points are swapped. An uplink client is the one that provides a network interface (for
instance, a NIC driver) whereas an uplink server is the one that uses that network
interface.
In contrast to the NIC session, the MAC address and link state are defined by the
client. The link state is reflected through the lifetime of an uplink session: The client
requests the session only when the link state is up and closes it whenever the link state
becomes down again. The MAC address is transmitted from the client to the server as
a session-construction argument.

Use cases

• A network driver connects to a NIC router to appear as uplink.

4.5.13. Audio recording and playing

The record and play session interfaces enable audio-processing components to stream
audio data across component boundaries. A record session is used to obtain audio
whereas a play session is used to submit generated audio data. Both session interfaces
use shared memory for the transfer of audio data. The services are typically provided
by a mixer component. The mixer routes and mixes audio signals produced by play
clients to record clients according to its configuration. Typical play clients are an audio
player or a microphone driver whereas typical record clients are an audio recorder or
an audio-output driver. Note that audio drivers as well as audio applications are mere
clients of the mixer. This architecture allows for the dynamic starting, removal, and
restarting of a driver, or even of multiple drivers.
Both play and record clients are expected to operate periodically. The number of
samples produced per period is up to each client and does not need to be constant over
time. The mixer infers the used sample rates and periods by observing the behavior of
the clients. Sample rates between play and record clients are converted automatically.
Multi-channel playing and recording are realized by one session per channel: one
channel is used to drive the time allocation whereas all further channels merely
enqueue data into (or obtain data from) their respective sessions without any
synchronous interplay with the mixer.

Session construction At session-construction time, the client specifies the type of
channel (e. g., “left”) as session argument.

Use cases

• The record and play session interfaces are provided by an audio mixer.

• An audio-output driver obtains sample data from the mixer using two record
sessions for the left and right channels respectively, and converts the sample data
into the format expected by the audio device.

• An audio-input driver submits data sampled from the microphone to the mixer
via a play session.

• A multi-channel audio player uses one play session for each channel.

• A graphical oscilloscope uses a record session to get hold of the data to display.

4.5.14. File system

The file-system session interface provides the client with a storage facility at the file
and directory level. Compared to the block session interface (Section 4.5.9), it operates
on a higher abstraction level that is suited for multiplexing the storage device among
multiple clients. Similar to the block session, the file-system session employs a single
packet stream interface (Section 3.6.6) for issuing read and write operations. This way,
read and write requests can be processed in batches and even out of order.


In contrast to read and write operations that carry potentially large amounts of pay-
load, the directory functions provided by the file-system session interface are syn-
chronous RPC functions. Those functions are used for opening, creating, renaming,
moving, deleting, and querying files, directories and symbolic links.
The directory functions are complemented with an interface for receiving notifica-
tions upon file or directory changes using asynchronous notifications.

Use cases

• A file-system server operates on a block session to provide file-system sessions to
its clients.

• A RAM file system keeps the directory structure and files in memory and pro-
vides file-system sessions to multiple clients. Each session may be restricted in
different ways (such as the root directory as visible by the respective client, or
the permission to write). Thereby the clients can communicate using the RAM
file system as a shared storage facility but are subjected to an information-flow
policy.

• A file-system component may play the role of a filter that transparently encrypts
the content of the files of its client and stores the encrypted files at another file-
system server.

• A pseudo file system may use the file-system interface as a hierarchic control
interface. For example, a trace file system provides a pseudo file system as a front
end to interact with core’s TRACE service.


4.6. Component configuration

By convention, each component obtains its configuration in the form of a ROM module
named “config”. The ROM session for this ROM module is provided by the parent of
the component. For example, for the init component, which is the immediate child of
core, its “config” ROM module is provided by core’s ROM service. Init, in turn, pro-
vides a different config ROM module to each of its children by a locally implemented
ROM service per child.

4.6.1. Configuration format

In principle, being a mere ROM module, a component configuration can come in an
arbitrary format. However, throughout Genode, there exists the convention to use XML
as syntax and wrap the configuration within a <config> node. The definition of sub
nodes of the configuration depends on the respective component.

4.6.2. Server-side policy selection

Servers that serve multiple clients may apply a different policy to each client. In gen-
eral, the policy may be defined by the session arguments aggregated on the route of
the session request as explained in Section 3.2.3. However, in the usual case, the pol-
icy is dictated by the common parent of client and server. In this case, the parent may
propagate its policy as the server’s configuration and deliver a textual label as session
argument for each session requested at the server. The configuration contains a list of
policies, and the session label is used as a key to select the policy from the list. For
example, the following snippet configures a RAM file system with different policies.

<config>
  <!-- constrain sessions according to their labels -->
  <policy label="shell -> root" root="/" />
  <policy label="shell -> home" root="/home/user" />
  <policy label="shell -> tmp"  root="/tmp" writeable="yes" />
</config>

Each time a session is created, the server matches the supplied session label against
the configured policies. Only if a policy matches, the parameters of the matching pol-
icy come into effect. The way the session label is matched against the policies
depends on the implementation of the server. However, by convention, servers usually
select the policy depending on the attributes label, label_prefix, and label_suffix.
If present, the label attribute must perfectly match the session label whereby the suf-
fix and prefix counterparts allow for partially matching the session label. If multiple
<policy> nodes match at the server side, the most specific policy is selected. Exact
matches are considered as most specific, prefixes as less specific, and suffixes as least
specific. If multiple prefixes or suffixes match, the longest is considered as the most
specific. If multiple policies have the same label, the selection is undefined. This is a
configuration error.
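
On the server side, this matching procedure is available in the form of the
Session_policy utility (os/session_policy.h). The following sketch hints at how a
server may select and evaluate the policy for a given session label, using the
attribute names of the example above.

  #include <base/component.h>
  #include <base/attached_rom_dataspace.h>
  #include <base/session_label.h>
  #include <base/log.h>
  #include <os/session_policy.h>

  static void apply_policy(Genode::Session_label const &label,
                           Genode::Xml_node      const &config)
  {
      using namespace Genode;

      try {
          /* select the most specific <policy> node matching the label */
          Session_policy const policy(label, config);

          /* read a policy parameter, falling back to a default value */
          bool const writeable = policy.attribute_value("writeable", false);

          log("label '", label, "' writeable=", writeable);
      }
      catch (Session_policy::No_policy_defined) {
          warning("no policy for label '", label, "', denying session"); }
  }

  void Component::construct(Genode::Env &env)
  {
      static Genode::Attached_rom_dataspace config { env, "config" };

      apply_policy(Genode::Session_label("shell -> tmp"), config.xml());
  }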

4.6.3. Dynamic component reconfiguration at runtime

As described in Section 4.5.1, a ROM module can be updated during the lifetime of
the ROM session. This principally enables a parent to dynamically reconfigure a child
component without the need to restart it. If a component supports its dynamic re-
configuration, it installs a signal handler at its “config” ROM session. Each time the
configuration changes, the component receives a signal. It responds to such a signal
by obtaining the new version of the ROM module using the steps described in Section
4.5.1 and applying the new configuration.


4.7. Component composition

Genode provides a playground for combining components in many different ways. The
best composition of components often depends on the goal of the system integrator.
Among possible goals are the ease of use for the end user, the cost-efficient reuse of
existing software, and good application performance. However, the most prominent
goal is the mitigation of security risks. This section presents composition techniques
that leverage Genode’s architecture to dramatically reduce the trusted computing base
of applications and to solve rather complicated problems in surprisingly easy ways.
The figures presented throughout this section use a simpler nomenclature than the
previous sections. A component is depicted as a box. Parent-child relationships are rep-
resented as light-gray arrows. A session between a client and a server is illustrated by
a dashed arrow pointing to the server.


4.7.1. Sandboxing

The functionality of existing applications and libraries is often worth reusing or eco-
nomically downright infeasible to reimplement. Examples are PDF rendering engines,
libraries that support commonly used video and audio codecs, or libraries that decode
hundreds of image formats.
However, code of such rich functionality is inherently complex and must be assumed
to contain security flaws. This is empirically evidenced by the never ending stream of
security exploits targeting the decoders of data formats. But even in the absence of
bugs, the processing of data by third-party libraries may have unintended side effects.
For example, a PDF file may contain code that accesses the file system, which the user of
a PDF reader may not expect. By linking such a third-party library to a security-critical
application, the application’s security is seemingly traded against the functional value
that the library offers.
Fortunately, Genode’s architecture principally allows every component to encapsu-
late untrusted functionality in child components. So instead of directly linking a third-
party library to an application, the application executes the library code in a dedicated
sub component. By imposing a strict session-routing policy onto the component, the
untrusted code is restricted to its sandbox. Figure 36 shows a video player as a practical
example of this approach.
The video player uses the nitpicker GUI server to present a user interface with the
graphical controls of the player. Furthermore, it has access to a media file containing


Figure 36: A video player executes the video and audio codecs inside a dedicated sandbox.

video and audio data. Instead of linking the media-codec library (libav) directly to
the video-player application, it executes the codec as a child component. Thereby the
application effectively restricts the execution environment of the codec to only those
resources that are needed by the codec. Those resources are the media file that is handed
out to the codec as a ROM module, a facility to output video frames in the form of a GUI
session, and a facility to output an audio stream in the form of an audio session.
In order to reuse as much code as possible, the video player executes an existing
example application called avplay that comes with the codec library as child compo-
nent. The avplay example uses libSDL as back end for video and audio output and
responds to a few keyboard shortcuts for controlling the video playback such as paus-
ing the video. Because there exists a Genode version of libSDL, avplay can be executed
as a Genode component with no modifications. This version of libSDL requests a GUI
session (Section 4.5.6) and an audio session (Section 4.5.13) to perform the video and
audio output and respond to user input. Furthermore, it opens a ROM session for ob-
taining a configuration. This configuration parametrizes the audio back end of libSDL.
Because avplay is a child of the video-player application, all those session requests are
directed to the application. It is entirely up to the application how to respond to those
requests. For accommodating the request for a GUI session, the application creates a
second GUI session, configures a virtual framebuffer, and embeds this virtual frame-
buffer into its GUI. It keeps the GUI session capability for itself and merely hands out
the virtual GUI’s session capability to avplay. For accommodating the request for the
input stream, it hands out a capability to a locally-implemented input stream. Using
this input stream, it becomes able to supply artificial input events to avplay. For exam-
ple, when the user clicks on the play button of the application’s GUI, the application
would submit a sequence of press and release events to the input stream, which appear
to avplay as the keyboard shortcut for starting the playback. To let the user adjust the
audio parameters of libSDL during playback, the video-player application dynamically
changes the avplay configuration using the mechanism described in Section 4.6.3. As
a response to a configuration update, libSDL’s audio back end picks up the changed
configuration parameters and adjusts the audio playback accordingly.
By sandboxing avplay as a child component of the video player, a bug in the video or
audio codecs can no longer compromise the application. The execution environment of
avplay is tailored to the needs of the codec. In particular, it does not allow the codec to
access any files or the network. In the worst case, if avplay becomes corrupted, the pos-
sible damage is restricted to producing wrong video or audio frames but a corrupted
codec can neither access any of the user’s data nor can it communicate to the outside
world.

4.7.2. Component-level and OS-level virtualization

The sandboxing technique presented in the previous section tailors the execution en-
vironment of untrusted third-party code by applying an application-specific policy to
all session requests originating from the untrusted code. However, the tailoring of the
execution environment by the parent can even go a step further by providing the all-
encompassing virtualization of all services used by the child, including core’s services
such as PD, CPU, and LOG. This way, the parent can not just tailor the execution envi-
ronment of a child but completely define all aspects of the child’s execution. This clears
the way for introducing custom operating-system interfaces at any position within the
component tree, or for monitoring the behavior of subsystems.

Introducing a custom OS interface By intercepting all session interfaces normally
provided by core, a runtime environment becomes able to handle all low-level inter-
actions of the child with core. This includes the allocation of memory using the PD
service, the spawning and controlling of threads using the CPU service, and the man-
agement of the child’s address space using the PD service.
This flexibility paves the ground for hosting traditional operating-system interfaces
such as Unix as a mere user-level construct within a Genode system. Normally, several
aspects of Unix would contradict Genode’s architecture:

• The Unix system-call interface supports files and sockets as first-level citizens.

• There is no global virtual file system in Genode.


Figure 37: Runtime environment that provides a Unix-like interface

• Any Unix process can allocate memory as needed. There is no necessity for ex-
plicit assignment of memory resources to Unix processes. In contrast, Genode
employs the rigid accounting of physical resources as explained in Section 3.3.

• Processes are created by forking existing processes. The new process inherits the
roles (in the form of open file descriptors) of the forking process.

Figure 37 illustrates a custom Unix runtime environment that bridges these gaps by
using Genode’s building blocks.

1. The VFS server is able to mount TAR archives locally as a virtual file system and
offers the content as a file-system service. Furthermore, the VFS server exposes a
terminal session as a pseudo file. In the depicted scenario, the terminal session
request is routed to the parent of init.

2. The fs_rom component provides a ROM service by fetching the content of ROM
modules from a file system. By connecting the fs_rom with the VFS component,
the files of the bash.tar and vim.tar archives become available as ROM modules.
With the bash executable binary accessible as ROM module, it can be executed as
a Genode component.

3. The init component allows one to stick components together and let the result
appear to the surrounding system as a single component. It is used to host the
composition of the VFS, fs_rom, and bash.

4. The bash shell can spawn child processes such as Vim by relying on traditional
Unix interfaces, namely fork and execve. In contrast to regular Unix systems,
however, the underlying mechanisms are implemented as part of the C runtime
with no kernel support or special privileges needed. In the depicted scenario,


Figure 38: Each Genode component is created out of basic resources provided by core.

bash plays the role of a runtime environment for Vim. Since bash is a parent of
Vim, it is able to respond to Vim’s resource demands by paying out of its own
pocket, thereby softening Genode’s rigid resource accounting to accommodate
Vim’s expectations.

Monitoring the behavior of subsystems Besides hosting arbitrary OS personalities
as a subsystem, the interception of core’s services allows for the all-encompassing mon-
itoring of subsystems without the need for special support in the kernel. This is useful
for failsafe monitoring or for user-level debugging.
As described in Section 3.5, any Genode component is created out of low-level re-
sources in the form of sessions provided by core. Those sessions include at least a PD
session, a CPU session, and a ROM session with the executable binary as depicted in
Figure 38. In addition to those low-level sessions, the component may interact with
sessions provided by other components.
For debugging a component, a debugger would need a way to inspect the internal
state of the component. As the complete internal state is usually known by the OS
kernel only, the traditional approach to user-level debugging is the introduction of a
debugging interface into the kernel. For example, Linux has the ptrace mechanism
and several microkernels of the L4 family come with built-in kernel debuggers. Such
a debugging interface, however, introduces security risks. Besides increasing the com-
plexity of the kernel, access to the kernel’s debugging mechanisms needs to be strictly
subjected to a security policy. Otherwise any program could use those mechanisms
to inspect or manipulate other programs. Most L4 kernels usually exclude debugging
features in production builds altogether.
In a Genode system, the component’s internal state is represented in the form of core
sessions. Hence, by intercepting those sessions of a child, a parent can monitor all in-
teractions of the child with core and thereby record the child’s internal state. Figure 39
shows a scenario where a debug monitor executes a component (debugging target) as a
child while intercepting all sessions to core’s services. The interception is performed by
providing custom implementations of core’s session interfaces as locally implemented


Figure 39: By intercepting all sessions to core’s services, a debug monitor obtains insights into
the internal state of its child component. The debug monitor, in turn, is controlled
from a remote debugger.

services. Under the hood, the local services realize their functionality using actual core
sessions. But by sitting in the middle between the debugging target and core, the de-
bug monitor can observe the target’s internal state including the memory content, the
virtual address-space layout, and the state of all threads running inside the component.
Furthermore, since the debug monitor is in possession of all the session capabilities of
the debugging target, it can manipulate it in arbitrary ways. For example, it can change
thread states (e. g., pausing the execution or enabling single-stepping) and modify the
memory content (e. g., inserting breakpoint instructions). The figure shows that those
debugging features can be remotely controlled over a terminal connection.
Using this form of component-level virtualization, a problem that used to require
special kernel additions in traditional operating systems can be solved via Genode’s
regular interfaces.

4.7.3. Interposing individual services

The design of Genode’s fundamental services, in particular resource multiplexers, is
guided by the principle of minimalism. Because such components are security criti-
cal, complexity must be avoided. Functionality is added to such components only if it
cannot be provided outside the component.
However, components like the nitpicker GUI server are often confronted with feature
requests. For example, users may want to move a window on screen by dragging the
window’s title bar. Because nitpicker has no notion of windows or title bars, such func-
tionality is not supported. Instead, nitpicker moves the burden to implement window
decorations to its clients. However, this approach sacrifices functionality that is taken
for granted on modern graphical user interfaces. For example, the user may want to
switch the application focus using a keyboard shortcut or perform window operations


Figure 40: The nitpicker GUI accompanied with a window manager that interposes the nit-
picker session interface for the applications on the right. The applications on the
left are still able to use nitpicker directly and thereby avoid the complexity added by
the window manager.

and the interactions with virtual desktops in a consistent way. If each application im-
plemented the functionality of virtual desktops individually, the result would hardly
be usable. For this reason, it is tempting to move window-management functionality
into the GUI server and to accept the violation of the minimalism principle.
The nitpicker GUI server is not the only service challenged by feature requests. The
problem is present even at the lowest-level services provided by core. Core’s region-
map mechanism is used to manage the virtual address spaces of components via their
respective PD sessions. When a dataspace is attached to a region map, the region map
picks a suitable virtual address range where the dataspace will be made visible in the
virtual address space. The allocation strategy depends on several factors such as align-
ment constraints and the address range that fits best. But eventually, it is deterministic.
This contradicts the common wisdom that address spaces shall be randomized. Hence
core’s PD service is challenged with the request for adding address-space randomiza-
tion as a feature. Unfortunately, the addition of such a feature into core raises two
issues. First, core would need to have a source of good random numbers. But core does
not contain any device drivers to draw entropy from. With weak entropy, the
randomization might not be random enough. In this case, the pretense of a security
mechanism that is actually ineffective may be worse than not having it in the first place.
Second, the feature would certainly increase the complexity of core. This is acceptable
for components that potentially benefit from the added feature, such as outward-facing
network applications. But the complexity eventually becomes part of the TCB of all
components including those that do not benefit from the feature.
The solution to those kinds of problems is the enrichment of existing servers by inter-
posing their sessions. Figure 40 shows a window manager implemented as a separate
component outside of nitpicker. Both the nitpicker GUI server and the window man-
ager provide the nitpicker session interface. But the window manager enriches the
semantics of the interface by adding window decorations and a window-layout policy.


Under the hood, the window manager uses the real nitpicker GUI server to implement
its service. From the application’s point of view, the use of either service is transparent.
Security-critical applications can still be routed directly to the nitpicker GUI server. So
the complexity of the window manager comes into effect only for those applications
that use it.
The same approach can be applied to the address-space randomization problem. A
component with access to good random numbers may provide a randomized version
of core’s PD service. Outward-facing components can benefit from this security feature
by having their PD session requests routed to this component instead of core.

4.7.4. Ceding the parenthood

When using a shell to manage subsystems, the complexity of the shell naturally be-
comes a security risk. A shell can be a text-command interpreter, a graphical desktop
shell, a web browser that launches subsystems as plugins, or a web server that pro-
vides a remote administration interface. What all those kinds of shells have in common
is that they contain an enormous amount of complexity that can be attributed to conve-
nience. For example, a textual shell usually depends on libreadline, ncurses, or similar
libraries to provide a command history and to deal with the peculiarities of virtual text
terminals. A graphical desktop shell is even worse because it usually depends on a
highly complex widget toolkit, not to mention using a web browser as a shell. Un-
fortunately, the functionality provided by these programs cannot be dismissed as it is
expected by the user. But the high complexity of the convenience functions funda-
mentally contradicts the security-critical role of the shell as the common parent of all
spawned subsystems. If the shell gets compromised, all the spawned subsystems will
suffer.
The risk of such convoluted shells can be mitigated by moving the parent role for
the started subsystems to another component, namely a loader service. In contrast to
the shell, which should be regarded as untrusted due to its complexity, the loader is
a small component that is orders of magnitude less complex. Figure 41 shows a sce-
nario where a web browser is used as a shell to spawn a Genode subsystem. Instead
of spawning the subsystem as the child of the browser, the browser creates a session
to a trusted low-complexity loader service. The loader allows its client to import the
to-be-executed subsystem into the loader session and kick off the execution of the sub-
system. However, once the subsystem is running, the browser can no longer interfere
with the subsystem’s operation. So security-sensitive information processed within the
loaded subsystem is no longer exposed to the browser. Still, the lifetime of the loaded
subsystem depends on the browser. If it decides to close the loader session, the loader
will destroy the corresponding subsystem.
By ceding the parenthood to a trusted component, the risks stemming from the com-
plexity of various kinds of shells can be mitigated.


Figure 41: A web browser spawns a plugin by ceding the parenthood of the plugin to the trusted
loader service.

4.7.5. Publishing and subscribing

All the mechanisms for transferring data between components presented in Section
3.6 have in common that data is transferred in a peer-to-peer fashion. A client trans-
fers data to a server or vice versa. However, there are situations where such a close
coupling of both ends of communication is not desired. In multicast scenarios, the
producer of information desires to propagate information without the need to interact
(or even depend on a handshake) with each individual recipient. Specifically, a compo-
nent might want to publish status information about itself that might be useful for other
components. For example, a wireless-networking driver may report the list of detected
wireless networks along with their respective SSIDs and reception qualities such that
a GUI component can pick up the information and present it to the user. Each time
the driver detects a change in the ether, it wants to publish an updated version of the
list. Such a scenario could principally be addressed by introducing a use-case-specific
session interface, i. e., a “wlan-list” session. But this approach has two disadvantages.

1. It forces the wireless driver to play an additional server role. Instead of pushing
information anytime at the discretion of the driver, the driver has to actively sup-
port the pulling of information by the wlan-list client. This is arguably more
complex.

2. The wlan-list session interface ultimately depends on the capabilities of the driver
implementation. If an alternative wireless driver is able to supplement the list

with further details, the wlan-list session interface of the alternative driver might
look different. As a consequence, the approach is likely to introduce many special-
purpose session interfaces. This contradicts the goal of promoting the compos-
ability of components as stated at the beginning of Section 4.5.

As an alternative to introducing special-purpose session interfaces for addressing the
scenarios outlined above, two existing session interfaces can be combined, namely
ROM and report.

Report-ROM server The report-rom server is both a ROM service and a report service.
It acts as an information broker between information providers (clients of the report
service) and information consumers (clients of the ROM service).
To propagate its internal state to the outside, a component creates a report session.
From the client’s perspective, the posting of information via the report session’s submit
function is a fire-and-forget operation, similar to the submission of a signal. But in
contrast to a signal, which cannot carry any payload, a report is accompanied with
arbitrary data. For the example above, the wireless driver would create a report session.
Each time the list of networks changes, it would submit an updated list as a report to
the report-ROM server.
The report-ROM server stores incoming reports in a database using the client’s ses-
sion label as key. Therefore, the wireless driver’s report will end up in the database un-
der the name of the driver component. If one component wishes to post reports of dif-
ferent kinds, it can do so by extending the session label by a component-provided label
suffix supplied as session-construction argument (Section 4.5.2). The memory needed
as the backing store for the report at the report-ROM server is accounted to the report
client via the session-quota mechanism described in Section 3.3.2.
In its role of a ROM service, the report-ROM server hands out the reports stored in its
database as ROM modules. The association of reports with ROM sessions is based on
the session label of the ROM client. The configuration of the report-ROM server con-
tains a list of policies as introduced in Section 4.6.2. Each policy entry is accompanied
with a corresponding key into the report database.
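
For example, the following report-ROM configuration would make the report posted
by a wireless driver available to a ROM session of a GUI component. The labels are
made-up examples.

  <config>
    <policy label="gui -> networks" report="wifi -> networks"/>
  </config>
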
When a new report comes in, all ROM clients that are associated with the report are
informed via a ROM-update signal (Section 4.5.1). Each client can individually respond
to the signal by following the ROM-module update procedure and thereby obtain the
new version of the report. From the client’s perspective, the origin of the information
is opaque. It cannot decide whether the ROM module is provided by the report-ROM
server or an arbitrary other ROM service.
Coming back to the wireless-driver example, the use of the report-ROM server effec-
tively decouples the GUI application from the wireless driver. This has the following
benefits:
• The application can be developed and tested with an arbitrary ROM server sup-
plying an artificially created list of networks.


Figure 42: The combination of the dynamic re-configuration with the state reporting of an init
instance forms a feedback-control system.

• There is no need for the introduction of a special-purpose session interface be-
tween both components.

• The wireless driver can post state updates in an intuitive fire-and-forget way
without playing an additional server role.

• The wireless driver can be restarted without affecting the application.

Poly-instantiation of the report-ROM mechanism The report-ROM server is a
canonical example of a protocol stack (Section 4.2). It performs a translation between
the report-session interface and the ROM-session interface. Being a protocol stack, it
can be instantiated any number of times. It is up to the system integrator whether
to use one instance for gathering the reports of many report clients, or to instantiate
multiple report-ROM servers. Taken to the extreme, one report-ROM server could be
instantiated per report client. The routing of ROM-session requests restricts the access
of the ROM clients to the different instances. Even in the event that the report-ROM
server is compromised, the policy for the information flows between the producers and
consumers of information stays in effect.

4.7.6. Feedback control system

By combining the techniques presented in Sections 4.7.4 and 4.7.5, a general pattern of
a feedback-control system emerges (Figure 42).
This pattern achieves a strict separation of policy from functionality by employing a
dynamically configured init component (dynamic init) in tandem with a management
component (manager). The manager (1) monitors the state of the dynamic init and its
children, and (2) feeds the dynamic init with configurations. Both the dynamic init and
the manager are siblings within another (e. g., the static initial) init instance. The state
report and the config ROM are propagated via the report-ROM component as presented
in Section 4.7.5. Hence, the manager and the dynamic init are loosely coupled. There is
no client-server dependency in either direction.
The init component supports the reporting of its current state including the state of
all children. Refer to Section 6.2.10 for an overview of the reporting options. The report
captures, among other things, the resource consumption of each child. For example,
should a child overstep its resource boundaries, the report’s respective <child> node
turns into this:

<state>
  ...
  <child name="system-shell" ...>
    <ram ... requested="2M"/>
    ...
  </child>
</state>

If the requested attribute is present, the child got stuck in a resource request. In
the example above, the “system-shell” asks for 2M of additional memory. To resolve
this situation, the manager can generate a new configuration for the dynamic init. In
particular, it can

• Adjust the resource quota of the resource-starved child. When the dynamic init
observes such a configuration change, it answers the resource request and thereby
prompts the child to continue its execution.

• Restart the child by incrementing a version attribute of the child node. Once the
dynamic init observes a change of this attribute, the child is killed and restarted.
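
For example, to grant the additional memory requested in the report above and to
prepare for a potential restart, the manager may generate a start node like the
following as part of the dynamic init’s configuration (the concrete values are made up):

  <start name="system-shell" version="1" caps="300">
    <resource name="RAM" quantum="8M"/>
    ...
  </start>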

In addition to responding to resource requests, the manager component can also eval-
uate other parts of the report. The two most interesting bits of information are the exit
state of each child (featuring the exit code) and the health. The health is the child’s
ability to respond to external events. It is described in more detail at Section 6.2.13. It
effectively allows the manager to implement watchdog functionality.
In contrast to the potentially highly complex child hosted within the dynamic init,
the manager component is much less prone to bugs. All it does is consuming reports
(parsing some XML) and generating configurations (generating XML). It does not
need any C runtime, file system, or I/O drivers. It can be implemented without any
dynamic memory allocations. In contrast to the complex child, which can just be expected
to be flaky, the manager and the dynamic init are supposed to be correct and trustwor-
thy. Since the functionality of the dynamic init is present in the regular init component,
which is part of any Genode system’s trusted computing base anyway, the feedback-
control pattern does not add new critical code complexity.


Examples

• The sequential execution of multiple automated tests integrated in a single system
scenario requires the monitoring and parsing of test results and the orchestration
of the test sequence. The depot-autopilot scenario at gems/run/depot_autopilot.run
solves these problems by applying the described pattern.

• The combination of a dynamic init with a manager can be found in advanced test
scenarios such as the init test (os/recipes/pkg/test-init).

• To enable one system image to run on a variety of hardware configurations, the
dynamic probing of devices and starting of appropriate device drivers is needed.
The device-driver subsystem of Sculpt OS solves this problem with the manager
component gems/src/app/driver_manager/. For example, the driver manager eval-
uates the information reported by the PCI-bus driver to conditionally start the
most suitable graphics driver.

• For the on-target installation of packages, the so-called depot-download subsystem
uses a dynamic init that successively downloads, verifies, extracts, and parses
archives. The manager of this subsystem is located at
gems/src/app/depot_download_manager/.

• The arguably most sophisticated example is the sculpt manager that controls the
runtime environment of the Sculpt operating system. It is located at
gems/src/app/sculpt_manager/.

5. Development

The Genode OS framework is accompanied by a scalable build system and tooling in-
frastructure that is designed for the creation of highly modular and portable systems
software. Understanding the underlying concepts is important for leveraging the full
potential of the framework. This chapter complements Chapter 2 with the explanation
of the coarse-grained source-tree structure (Section 5.1), the integration of 3rd-party
software (Section 5.2), the build system (Section 5.3), and system-integration tools (Sec-
tion 5.4). Furthermore, it describes the project’s development process in Section 5.7.


5.1. Source-code repositories

As briefly introduced in Section 2.2, Genode’s source tree is organized in the form of
several source-code repositories. This coarse-grained modularization of the source code
has the following benefits:

• Source codes of different concerns remain well separated. For example, the
platform-specific code for each base platform is located in a dedicated base-<platform>
repository.

• Different abstraction levels and features of the system can be maintained in differ-
ent source-code repositories. Whereas the source code contained in the os reposi-
tory is free from any dependency on 3rd-party software, the components hosted
in the libports repository are free to use foreign code.

• Custom developments and experimental features can be hosted in dedicated
source-code repositories, which do not interfere with Genode’s source tree. Such
a custom repository can be managed independently from Genode using arbitrary
revision-control systems.

The build-directory configuration defines the set of repositories to incorporate into the
build process. At build time, the build system overlays the directory structures of all
selected repositories to form a single logical source tree. The selection of source-code
repositories ultimately defines the view of the build system on the source tree.
Note that the order of the repositories as configured in the build configuration (in
etc/build.conf ) is important. Front-most repositories shadow subsequent repositories.
This makes the repository mechanism a powerful tool for tweaking existing reposito-
ries: By adding a custom repository in front of another one, customized versions of
single files (e. g., header files or target description files) can be supplied to the build
system without changing the original repository.
Each source-code repository has the principal structure shown in Table 1.

Directory Description
doc/ Documentation, specific for the repository
etc/ Default configuration for the build system
mk/ Build-system supplements
include/ Globally visible header files
src/ Source codes and target build descriptions
lib/mk/ Library build descriptions
lib/import/ Library import descriptions
lib/symbols/ Symbol lists provided by shared libraries
ports/ Port descriptions of 3rd-party software
recipes/ Package descriptions for depot content
run/ System scenarios in the form of run scripts

Table 1: Structure of a source-code repository. Depending on the repository, only a subset of
those directories may be present.

5.2. Integration of 3rd-party software

Downloaded 3rd-party source code resides outside of the actual repository at the cen-
tral <genode-dir>/contrib/ directory. This structure has the following benefits over host-
ing 3rd-party source code along with Genode’s genuine source code:

• Working with grep within the repositories is very efficient because downloaded
and extracted 3rd-party code is not in the way. Such code resides next to
the repositories.

• Storing all build directories and downloaded 3rd-party source code somewhere
outside the Genode source tree, e. g., on different disk partitions, can be easily
accomplished by creating symbolic links for the build/ and contrib/ directories.

The contrib/ directory is managed using the tools at <genode-dir>/tool/ports/.

Obtain a list of available ports

tool/ports/list

Download and install a port

tool/ports/prepare_port <port-name>

The prepare_port tool scans all repositories under repos/ for the specified port and installs
the port into contrib/. Each version of an installed port resides in a dedicated subdirec-
tory within the contrib/ directory. The port-specific directory is called port directory. It
is named <port-name>-<fingerprint>. The <fingerprint> uniquely identifies the version
of the port (it is a SHA256 hash of the ingredients of the port). If two versions of the
same port are installed, each of them will have a different fingerprint. So they end up
in different directories.
Within a source-code repository, a port is represented by two files, a <port-name>.port
and a <port-name>.hash file. Both files reside at the ports/ subdirectory of the corre-
sponding repository. The <port-name>.port file is the port description, which declares
the ingredients of the port, e. g., the archives to download and the patches to apply.
The <port-name>.hash file contains the fingerprint of the corresponding port descrip-
tion, thereby uniquely identifying a version of the port as expected by the checked-out
Genode version.
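
To give a first impression, a port description is a declarative makefile snippet. The
following minimal sketch uses hypothetical values; the porting guide referenced below
is the authoritative reference for the format:

LICENSE   := zlib
VERSION   := 1.2.13
DOWNLOADS := zlib.archive

URL(zlib) := https://zlib.net/zlib-$(VERSION).tar.gz
SHA(zlib) := <sha256-of-the-downloaded-archive>
DIR(zlib) := src/lib/zlib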
For step-by-step instructions on how to add a port using the mechanism, please refer
to the porting guide:

Genode Porting Guide


https://genode.org/documentation/developer-resources/porting

5.3. Build system

5.3.1. Build directories

The build system is supposed to never touch the source tree. The procedure of building
components and integrating them into system scenarios is performed within a distinct
build directory. One build directory targets a specific kernel and hardware platform.
Because the source tree is decoupled from the build directory, one source tree can have
many different build directories associated, each targeted at a different platform.
The recommended way for creating a build directory is the use of the create_builddir
tool located at <genode-dir>/tool/. The tool prints usage information along with a list of
supported base platforms when started without arguments. For creating a new build
directory, one of the listed target platforms must be specified. By default, the new build
directory is created at <genode-dir>/build/<platform>/ where <platform> corresponds to
the specified argument. Alternatively, the default location can be overridden via the
optional BUILD_DIR= argument. For example:

cd <genode-dir>
./tool/create_builddir x86_64 BUILD_DIR=/tmp/build.x86_64

This command creates a new build directory for the 64-bit x86 platform at /tm-
p/build.x86_64/. For the basic operations available from within the build directory,
please refer to Section 2.3.

Configuration Each build directory contains a Makefile, which is a symbolic link to
tool/builddir/build.mk. The makefile is the front end of the build system and not supposed
to be edited. Besides the makefile, there is an etc/ subdirectory that contains the
build-directory configuration. For most platforms, there exists merely a single build.conf
file, which defines the source-code repositories to be incorporated into the build process
along with the parameters for the run tool explained in Section 5.4.1.
The selection of source-code repositories is defined by the REPOSITORIES dec-
laration, which contains a list of directories. The etc/build.conf file as found in a
freshly created build directory is preconfigured to select the source-code reposito-
ries base-<platform>, base, os, and demo. There are a number of commented-out lines
that can be uncommented for enabling additional repositories.
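
For illustration, the corresponding part of an etc/build.conf may look like the following
sketch, assuming GENODE_DIR points to the root of the source tree and the additional
libports repository is enabled:

REPOSITORIES  = $(GENODE_DIR)/repos/base-nova
REPOSITORIES += $(GENODE_DIR)/repos/base
REPOSITORIES += $(GENODE_DIR)/repos/os
REPOSITORIES += $(GENODE_DIR)/repos/demo
REPOSITORIES += $(GENODE_DIR)/repos/libports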

Cleaning To remove all but kernel-related generated files, use

make clean

To remove all generated files, use

make cleanall

Both clean and cleanall won’t remove any files from the bin/ subdirectory. This
makes bin/ a safe place for files that are unrelated to the build process, yet are re-
quired for the integration stage, e. g., binary data.

Controlling the verbosity To understand the inner workings of the build process in
more detail, you can tell the build system to display each directory change by specifying

make VERBOSE_DIR=

If you are interested in the arguments that are passed to each invocation of make, you
can make them visible via

make VERBOSE_MK=

Furthermore, you can observe each single shell-command invocation by specifying

make VERBOSE=

Of course, you can combine these verboseness toggles for maximizing the noise.

5.3.2. Target descriptions

Each build target is represented by a corresponding target.mk file within the src/ subdi-
rectory of a source-code repository. This file declares the name of the target, the source
codes to be incorporated into the target, and the libraries the target depends on. The
build system evaluates target descriptions using make. Hence, the syntax corresponds
to the syntax of makefiles and the principal functionality of make is available for tar-
get.mk files. For example, it is possible to define custom rules as done in Section 5.3.5.

Target declarations

TARGET is the name of the binary to be created. This is the only mandatory variable to
be defined in each target.mk file.

LIBS is the list of libraries that are used by the target.

SRC_CC contains the list of .cc source files. The default search location for source codes
is the directory where the target.mk file resides.

SRC_C contains the list of .c source files.

SRC_S contains the list of assembly .s source files.

SRC_BIN contains binary data files to be linked to the target.

INC_DIR is the list of include search locations. Directories should always be appended
by using +=.

REQUIRES expresses the requirements that must be satisfied in order to build the target.
More details about the underlying mechanism are provided by Section 5.3.4.

CC_OPT contains additional compiler options to be used for .c as well as for .cc files.

CC_CXX_OPT contains additional compiler options to be used for the C++ compiler only.

CC_C_OPT contains additional compiler options to be used for the C compiler only.

EXT_OBJECTS is a list of external objects or libraries. This declaration is merely used
for interfacing Genode with legacy software components.
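
Putting these declarations together, a minimal target.mk for a hypothetical component
named hello could look as follows (a sketch; the base library provides the framework’s
fundamental APIs and is used by virtually all components):

TARGET  = hello
SRC_CC  = main.cc
LIBS   += base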

Specifying search locations When specifying search locations for header files via
the INC_DIR variable or for source files via vpath, the use of relative pathnames is
illegal. Instead, the following variables can be used to reference locations within the
source-code repository where the target resides:

REP_DIR is the base directory of the target’s source-code repository. Specifying
locations relative to the base of the repository is rarely used by target.mk files
but typically needed by library descriptions.

PRG_DIR is the directory where the target.mk file resides. This variable is always to be
used when specifying a relative path.

$(call select_from_repositories,path/relative/to/repo) This function returns
the absolute path for the given repository-relative path by looking at all
source-code repositories in their configured order. Hereby, it is possible to access
files or directories that are outside the target’s source-code repository.

$(call select_from_ports,<port-name>) This function returns the absolute path
for the contrib directory of the specified <port-name>. The contrib directory is
located at <genode-dir>/contrib/<port-name>-<fingerprint> whereby <fingerprint>
uniquely identifies the version of the port as expected by the current state of the
Genode source tree.

Custom targets accompanying a library or program There are cases that call for
building custom targets in addition to a regular library or program. For example,
the executable binary of an application may be accompanied by generated data files.
The creation of such build artifacts can be expressed by custom make rules. However,
a rule is triggered only if it is a dependency of the build target. This can be achieved by
adding the rule to the CUSTOM_TARGET_DEPS variable. For example,

CUSTOM_TARGET_DEPS += menu_view_styles.tar

menu_view_styles.tar:
	$(VERBOSE)cd $(PRG_DIR); tar cf $(PWD)/bin/$@ styles

5.3.3. Library descriptions

In contrast to target descriptions that are scattered across the whole source tree, library
descriptions are located at the central place lib/mk. Each library corresponds to a <lib-
name>.mk file. The base name of the description file is the name of the library. Therefore,
no TARGET variable needs to be defined. The location of source-code files is usually de-
fined relative to $(REP_DIR). Library-description files support the following additional
declaration:

SHARED_LIB = yes declares that the library should be built as a shared object rather
than a static library. The resulting object will be called <libname>.lib.so.
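
For illustration, a hypothetical lib/mk/greet.mk file describing a shared library could
look like this sketch (the library name and source locations are examples):

SHARED_LIB = yes
SRC_CC     = greet.cc
LIBS      += base

vpath greet.cc $(REP_DIR)/src/lib/greet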

5.3.4. Platform specifications

Building components for different platforms likely implies that portions of code are
tied to certain aspects of the target platform. For example, target platforms may differ
in the following respects:

• The API of the used kernel,

• The hardware architecture such as x86, ARMv7,

• Certain hardware facilities such as a custom device, or

• Other considerations such as software license requirements.

Each of those aspects may influence the build process in different ways. The build sys-
tem provides a generic mechanism to steer the build process according to such aspects.
Each aspect is represented by a tag called spec value. Any platform targeted by Genode
can be characterized by a set of such spec values.
The developer of a software component knows the constraints of his software and
thus specifies these requirements in the build-description file of the component. The
system integrator defines the platform the software will be built for by specifying the
targeted platform in the SPECS declaration in the build directory’s etc/specs.conf file. In
addition to the (optional) etc/specs.conf file within the build directory, the build system
incorporates all etc/specs.conf files found in the enabled repositories. For example, when
using the Linux kernel as a platform, the base-linux/etc/specs.conf file is picked up auto-
matically. The build directory’s specs.conf file can still be used to extend the SPECS
declarations, for example to enable special features.
Each <spec> in the SPECS variable instructs the build system to

• Include the make-rules of a corresponding base/mk/spec/<specname>.mk file. This
enables the customization of the build process for each platform.

• Search for <libname>.mk files in the lib/mk/spec/<specname>/ subdirectory. This
way, alternative implementations of one and the same library interface can be
selected depending on the platform specification.

Before a target or library gets built, the build system checks if the REQUIRES entries of
the build description file are satisfied by entries of the SPECS variable. The compilation
is executed only if each entry in the REQUIRES variable is present in the SPECS variable
as supplied by the build directory configuration.
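
For example, a hypothetical target that relies on kernel-specific interfaces of NOVA
could state

REQUIRES = nova

in its build-description file. It would then be built only in build directories whose
SPECS contain nova.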

5.3.5. Building tools to be executed on the host platform

Sometimes, software requires custom tools that are used to generate source code or
other ingredients for the build process, for example IDL compilers. Such tools won’t be
executed on top of Genode but on the host platform during the build process. Hence,
they must be compiled with the tool chain installed on the host, not the Genode tool
chain.
The build system accommodates the building of such host tools as a side effect of
building a library or a target. Even though it is possible to add the tool-compilation step
to a regular build description file, it is recommended to introduce a dedicated pseudo
library for building such tools. This way, the rules for building host tools are kept sepa-
rate from rules that refer to regular targets. By convention, the pseudo library should be
named <package>_host_tools and the host tools should be built at <build-dir>/tool/<pack-
age>/ where <package> refers to the name of the software package the tool belongs to,
e. g., qt5 or mupdf. To build a tool named <tool>, the pseudo library contains a custom
make rule like the following:

$(BUILD_BASE_DIR)/tool/<package>/<tool>:
	$(MSG_BUILD)$(notdir $@)
	$(VERBOSE)mkdir -p $(dir $@)
	$(VERBOSE)...build commands...

To let the build system trigger the rule, add the custom target to the HOST_TOOLS
variable:

HOST_TOOLS += $(BUILD_BASE_DIR)/tool/<package>/<tool>

Once the pseudo library for building the host tools is in place, it can be referenced
by each target or library that relies on the respective tools via the LIBS declaration. The
tool can be invoked by referring to $(BUILD_BASE_DIR)/tool/<package>/<tool>.
For an example of using custom host tools, please refer to the mupdf package
found within the libports repository. During the build of the mupdf library, two
custom tools fontdump and cmapdump are invoked. The tools are built via the
lib/mk/mupdf_host_tools.mk library description file. The actual mupdf library (lib/mk/mupdf.mk)
has the pseudo library mupdf_host_tools listed in its LIBS declaration and refers to
the tools relative to $(BUILD_BASE_DIR).

5.3.6. Building 3rd-party software

The source code of 3rd-party software is managed by the mechanism presented in
Section 5.2. Once prepared, such source code resides in a subdirectory of <genode-
dir>/contrib/.
If the build system encounters a target that incorporates ported source code (that is,
a build-description file that calls the select_from_ports function), it looks up the re-
spective <port-name>.hash file in the repositories as specified in the build configuration.
The fingerprint found in the hash file is used to construct the path to the port direc-
tory under contrib/. If that lookup fails, a meaningful error is printed. Any number of
versions of the same port can be installed at the same time. I.e., when switching Git
branches that use different versions of the same port, the build system automatically
finds the right port version as expected by the currently active branch.

5.4. System integration and automated testing

Genode’s portability across kernels and hardware platforms is one of the prime fea-
tures of the framework. However, each kernel or hardware platform requires different
considerations when it comes to system configuration, integration, and booting. When
using a particular kernel, profound knowledge about the boot concept and the kernel-
specific tools is required. To streamline the testing of system scenarios across the many
different supported kernels and hardware platforms, the framework is equipped with
tools that relieve the system integrator from these peculiarities.

5.4.1. Run tool

The centerpiece of the system-integration infrastructure is the so-called run tool. Di-
rected by a script (run script), it performs all the steps necessary to test a system sce-
nario. Those steps are:

1. Building the components of a scenario

2. Configuration of the init component

3. Assembly of the boot directory

4. Creation of the boot image

5. Powering-on the test machine

6. Loading of the boot image

7. Capturing the log output

8. Validation of the scenario’s behavior

9. Powering-off the test machine

Each of those steps depends on various parameters such as the used kernel, the hard-
ware platform used to execute the scenario, the way the test hardware is connected to
the test infrastructure (e. g., UART, AMT, JTAG, network), the way the test hardware
is powered or reset, or the way of how the scenario is loaded into the test hardware.
To accommodate the variety of combinations of these parameters, the run tool consists
of an extensible library of modules. The selection and configuration of the modules is
expressed in the run-tool configuration. The following types of modules exist:

boot-dir modules These modules contain the functionality to populate the boot direc-
tory and are specific to each kernel. It is mandatory to always include the module
corresponding to the used kernel.
(the available modules are: linux, hw, okl4, fiasco, pistachio, nova, sel4, foc)

image modules These modules are used to wrap up all components used by the run
script in a specific format and thereby prepare them for execution. Depending on
the used kernel, different formats can be used. With these modules, the creation
of ISO and disk images is also handled.
(the available modules are: uboot, disk, iso)

load modules These modules handle the way the components are transferred to the
target system. Depending on the used kernel there are various options to pass on
the components. For example, loading from TFTP or via JTAG is handled by the
modules of this category.
(the available modules are: tftp, jtag, fastboot, ipxe)

log modules These modules handle how the output of a currently executed run script
is captured.
(the available modules are: qemu, linux, serial, amt)

power_on modules These modules are used for bringing the target system into a
defined state, e. g., by starting or rebooting the system.
(the available modules are: qemu, linux, softreset, amt, netio)

power_off modules These modules are used for turning the target system off after
the execution of a run script.

Each module has the form of a script snippet located under the tool/run/<step>/ direc-
tory where <step> is a subdirectory named after the module type. Further instructions
about the use of each module (e. g., additional configuration arguments) can be found
in the form of comments inside the respective script snippets. Thanks to this modular
structure, an extension of the tool kit comes down to adding a file at the corresponding
module-type subdirectory. This way, custom work flows (such as tunneling JTAG over
SSH) can be accommodated fairly easily.

5.4.2. Run-tool configuration examples

To execute a run script, a combination of modules may be used. The combination is
controlled via the RUN_OPT declaration contained in the build directory’s etc/build.conf
file. The following examples illustrate the selection and configuration of different run
modules:

Executing NOVA in Qemu

RUN_OPT = --include boot_dir/nova \
          --include power_on/qemu --include log/qemu --include image/iso

By including boot_dir/nova, the run tool assembles a boot directory equipped with
a boot loader and a boot-loader configuration that is able to bootstrap the NOVA kernel.
The combination of the modules power_on/qemu and log/qemu prompts the run tool
to spawn the Qemu emulator with the generated boot image and fetch the log output
of the emulated machine from its virtual comport. The specification of image/iso tells
the run tool to use a bootable ISO image as a boot medium as opposed to a disk image.

Executing NOVA on a real x86 machine using AMT The following example uses
Intel’s advanced management technology (AMT) to remotely reset a physical target
machine (power_on/amt) and capture the serial output over network (log/amt). In
contrast to the example above, the system scenario is supplied via TFTP (load/tftp).
Note that the example requires a working network-boot setup including a TFTP server,
a DHCP server, and a PXE boot loader.

RUN_OPT = --include boot_dir/nova \
          --include power_on/amt \
          --power-on-amt-host 10.23.42.13 \
          --power-on-amt-password 'foo!' \
          --include load/tftp \
          --load-tftp-base-dir /var/lib/tftpboot \
          --load-tftp-offset-dir /x86 \
          --include log/amt \
          --log-amt-host 10.23.42.13 \
          --log-amt-password 'foo!'

If the test machine has a comport connection to the machine where the run tool is
executed, the log/serial module may be used instead of log/amt:

--include log/serial --log-serial-cmd 'picocom -b 115200 /dev/ttyUSB0'

Executing base-hw on a Raspberry Pi The following example boots a system scenario
based on the base-hw kernel on a Raspberry Pi that is powered via a network-
controllable power plug (netio). The Raspberry Pi is connected to a JTAG debugger,
which is used to load the system image onto the device.

RUN_OPT = --include boot_dir/hw \
          --include power_on/netio \
          --power-on-netio-ip 10.23.42.5 \
          --power-on-netio-user admin \
          --power-on-netio-password secret \
          --power-on-netio-port 1 \
          --include power_off/netio \
          --power-off-netio-ip 10.23.42.5 \
          --power-off-netio-user admin \
          --power-off-netio-password secret \
          --power-off-netio-port 1 \
          --include load/jtag \
          --load-jtag-debugger \
              /usr/share/openocd/scripts/interface/flyswatter2.cfg \
          --load-jtag-board \
              /usr/share/openocd/scripts/interface/raspberrypi.cfg \
          --include log/serial \
          --log-serial-cmd 'picocom -b 115200 /dev/ttyUSB0'

5.4.3. Meaningful default behaviour

The create_builddir tool introduced in Section 2.3 equips a freshly created build di-
rectory with a meaningful default configuration that depends on the selected platform
and the used kernel. For example, when creating a build directory for the x86_64 base
platform and building a scenario with KERNEL=linux, RUN_OPT is automatically defined
as

RUN_OPT = --include boot_dir/linux \
          --include power_on/linux --include log/linux

5.4.4. Run scripts

Using run scripts, complete system scenarios can be described in a concise and kernel-
independent way. As described in Section 2.4, a run script can be used to integrate and
test-drive the scenario directly from the build directory. The best way to get acquainted
with the concept is by reviewing the run script for the hello-world example presented
in Section 2.5.4. It performs the following steps:

1. Building the components needed for the system using the build command. This
command instructs the build system to compile the targets listed in the brace
block. It has the same effect as manually invoking make with the specified argu-
ment from within the build directory.

2. Creating a new boot directory using the create_boot_directory command.
The integration of the scenario is performed in a dedicated directory at <build-
dir>/var/run/<run-script-name>/. When the run script is finished, this boot direc-
tory will contain all components of the final system.

3. Installing the configuration for the init component into the boot directory using
the install_config command. The argument to this command will be written
to a file called config within the boot directory. It will eventually be loaded as
boot module and made available by core’s ROM service to the init component.
The configuration of init is explained in Chapter 6.

4. Creating a bootable system image using the build_boot_image command. This
command copies the specified list of files from the <build-dir>/bin/ directory to
the boot directory and executes the steps needed to transform the content of the
boot directory into a bootable form. In the most common case, the arguments of
build_boot_image correspond to the results of a prior build step. To avoid the
need to manually maintain the consistency between the arguments of both steps,
the build_artifacts function provides a handy way to express the common
case.

build_boot_image [build_artifacts]

Under the hood, the run tool invokes the run-module types boot_dir and boot_im-
age. Depending on the run-tool configuration, the resulting boot image may have
the form of an ISO image, a disk image, or a bootable ELF image.

5. Executing the system image using the run_genode_until command. Depending
on the run-tool configuration, the system image is executed using an emulator
or a physical machine. Under the hood, this step invokes the run modules of
the types power_on, load, log, and power_off. For most platforms, Qemu is used
by default. On Linux, the scenario is executed by starting core directly from the
boot directory. The run_genode_until command takes a regular expression as
argument. If the log output of the scenario matches the specified pattern, the
run_genode_until command returns. If specifying forever as argument, this
command will never return. If a regular expression is specified, an additional
argument determines a timeout in seconds. If the regular expression does not
match until the timeout is reached, the run script will abort.

After the successful completion of a run script, the run tool prints the message “Run
script execution successful.”.
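
For reference, the following sketch shows how these steps combine into a complete
run script. It is modeled after the hello-world example; the component name, the
config content, and the log pattern are placeholders, and the configuration syntax is
explained in Chapter 6:

build { core lib/ld init hello }

create_boot_directory

install_config {
<config>
	<parent-provides>
		<service name="LOG"/>
		<service name="PD"/>
		<service name="CPU"/>
		<service name="ROM"/>
	</parent-provides>
	<default-route>
		<any-service> <parent/> </any-service>
	</default-route>
	<default caps="100"/>
	<start name="hello">
		<resource name="RAM" quantum="10M"/>
	</start>
</config>
}

build_boot_image [build_artifacts]

run_genode_until {Hello world} 10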
Note that the hello.run script does not contain kernel-specific information. Therefore
it can be executed from the build directory of any base platform via the command make
run/hello KERNEL=<kernel> BOARD=<board>. When invoking make with an argu-
ment of the form run/<run-script>, the build system searches all repositories for a
run script with the specified name. The run script must be located in one of the reposi-
tories’ run/ subdirectories and have the file extension .run.

5.4.5. The run mechanism explained

The run tool is based on expect, which is an extension of the Tcl scripting language
that allows for the scripting of interactive command-line-based programs. When the
user invokes a run script via make run/<run-script>, the build system invokes the run
tool at <genode-dir>/tool/run/run with the run script and the content of the RUN_OPT
definition as arguments. The run tool is an expect script that has no other purpose
than defining several commands used by run scripts and including the run modules
as specified by the run-tool configuration. Whereas tool/run/run provides the generic
commands, the run modules under tool/run/<module>/ contain all the peculiarities of
the various kernels and boot strategies. The run modules thereby document precisely
how the integration and boot concept works for each kernel platform.

Run modules Each module consists of an expect source file located in one of the exist-
ing directories of a category. It is named implicitly by its location and the name of the
source file, e. g. image/iso is the name of the image module that creates an ISO image.
The source file contains one mandatory function:

run_<module> { <module-args> }

The function is called if the step is executed by the run tool. If its execution was
successful, it returns true and otherwise false. Certain modules may also call exit on
failure.
A module may have arguments, which are - by convention - prefixed with the name
of the module, e. g., power_on/amt has an argument called -power-on-amt-host. By
convention, the modules contain accessor functions for argument values. For example,
the function power_on_amt_host in the run module power_on/amt returns the value
supplied to the argument -power-on-amt-host. Thereby, a run script can access the
value of such arguments in a defined way by calling power_on_amt_host. Also, ar-
guments without a value are treated similarly. For example, for querying the presence
of the argument -image-uboot-no-gzip, the run module run/image/uboot provides the
corresponding function image_uboot_use_no_gzip. In addition to these functions, a
module may have additional public functions. Those functions may be used by run
scripts or other modules. To enable a run script or module to query the presence of
another module, the run tool provides the function have_include. For example, the
presence of the load/tftp module can be checked by calling have_include with the ar-
gument “load/tftp”.
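
To illustrate this structure, the following sketch outlines a hypothetical power_on
module that could reside at tool/run/power_on/wol. The module name, its argument,
and the use of the wakeonlan command are assumptions made for the sake of the
example; get_cmd_arg is a helper provided by tool/run/run:

# accessor function for the module argument -power-on-wol-mac
proc power_on_wol_mac { } { return [get_cmd_arg --power-on-wol-mac ""] }

# mandatory entry function, invoked when the run tool executes this step
proc run_power_on { } {
	set mac [power_on_wol_mac]
	if {$mac == ""} { return false }

	# send a wake-on-LAN packet to power on the target machine
	exec wakeonlan $mac
	return true
}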

5.4.6. Using run scripts to implement integration tests

Because run scripts are actually expect scripts, the whole arsenal of language features of
the Tcl scripting language is available to them. This turns run scripts into powerful tools
for the automated execution of test cases. A good example is the run script at repos/lib-
ports/run/lwip.run, which tests the lwIP stack by running a simple Genode-based HTTP
server on the test machine. It fetches and validates an HTML page from this server. The
run script makes use of a regular expression as argument to the run_genode_until
command to detect the state when the web server becomes ready, subsequently ex-
ecutes the lynx shell command to fetch the web site, and employs Tcl’s support for
regular expressions to validate the result. The run script works across all platforms that
have network support. To accommodate a high diversity of platforms, parts of the run
script depend on the spec values as defined for the build directory. The spec values
are probed via the have_spec function. Depending on the probed spec values, the run
script uses the append_if and lappend_if commands to conditionally assemble the
init configuration and the list of boot modules.
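
The following fragment sketches this pattern; the log pattern, IP address, and expected
page content are hypothetical:

# wait until the HTTP server reports readiness in the log
run_genode_until {.*lwIP ready.*\n} 30

# fetch the page with lynx and validate it using a regular expression
set html [exec lynx -dump http://10.0.2.55/]
if {![regexp {Hello} $html]} {
	puts "page content does not match"
	exit 1
}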
To use the run mechanism efficiently, a basic understanding of the Tcl scripting lan-
guage is required. Furthermore the functions provided by tool/run/run and the run
modules at tool/run/ should be studied.

5.4.7. Automated testing across base platforms

To execute one or multiple test cases on more than one base platform, there exists a
dedicated tool at tool/autopilot. Its primary purpose is the nightly execution of test cases.
The tool takes a list of platforms and of run scripts as arguments and executes each run
script on each platform. A platform is a triplet of CPU architecture, board, and kernel.
For example, the following command instructs autopilot to generate a build directory
for the x86_64 architecture and to execute the log.run script for the kernels board-kernel
combinations NOVA on a PC and seL4 on a PC.

autopilot -t x86_64-pc-sel4 -t x86_64-pc-nova -r log

The build directory for each architecture is created at /tmp/autopilot.<username>/<architecture>
and the output of each run script is written to a file called <architec-
ture>.<board>.<kernel>.<run-script>.log. On stderr, autopilot prints the statistics about
whether or not each run script executed successfully on each platform. If at least one
run script failed, autopilot returns a non-zero exit code, which makes it straightforward
to include autopilot into an automated build-and-test environment.

5.5. Package management

The established system-integration work flow with Genode is based on the run tool as
explained in the previous section. It automates the building, configuration, integration,
and testing of Genode-based systems. Whereas the run tool succeeds in overcoming
the challenges that come with Genode’s diversity of kernels and supported hardware
platforms, its scalability is somewhat limited to appliance-like system scenarios: The
result of the integration process is a system image with a certain feature set. Whenever
requirements change, the system image is replaced with a freshly created image that
takes those requirements into account. In practice, there are two limitations of this
system-integration approach:
First, since the run tool implicitly builds all components required for a system sce-
nario, the system integrator has to compile all components from source. For example, if
a system includes a component based on Qt5, one needs to compile the entire Qt5 appli-
cation framework, which induces significant overhead to the actual system-integration
tasks of composing and configuring components.
Second, general-purpose systems tend to become too complex and diverse to be
treated as system images. When looking at commodity OSes, each installation differs
with respect to the installed set of applications, user preferences, used device drivers
and system preferences. A system based on the run tool’s work flow would require the
user to customize the run script of the system for each tweak. To stay up to date, the
user would need to re-create the system image from time to time while manually main-
taining any customizations. In practice this is a burden very few end users are willing
to endure.
The primary goal of Genode’s package management is to overcome these scalability
limitations, in particular:
• Alleviating the need to build everything that goes into system scenarios from
scratch,
• Facilitating modular system compositions while abstracting from technical de-
tails,
• On-target system update and system development,
• Assuring the user that system updates are safe to apply by providing the ability
to easily roll back the system or parts thereof to previous versions,
• Securing the integrity of the deployed software,
• Low friction for existing developers.
The design of Genode’s package-management concept is largely influenced by Git as
well as the Nix package manager (https://nixos.org/nix/). In particular the latter
opened our eyes to discover
the potential that lies beyond the package management employed in state-of-the-art
commodity systems. Even though we considered adapting Nix for Genode and actu-
ally conducted intensive experiments in this direction, we settled on a custom solution
that leverages Genode’s holistic view on all levels of the operating system including
the build system and tooling, source structure, ABI design, framework API, system
configuration, inter-component interaction, and the components themselves. Whereas Nix
is designed for being used on top of Linux, Genode’s whole-system view led us to
simplifications that eliminated the need for Nix’s powerful features like its custom de-
scription language.

5.5.1. Nomenclature

When speaking about “package management”, one has to clarify what a “package” in
the context of an operating system represents. Traditionally, a package is the unit of
delivery of a bunch of “dumb” files, usually wrapped up in a compressed archive. A
package may depend on the presence of other packages. Thereby, a dependency graph
is formed. To express how packages fit with each other, a package is usually accom-
panied with meta data (description). Depending on the package manager, package de-
scriptions follow certain formalisms (e. g., package-description language) and express
more-or-less complex concepts such as versioning schemes or the distinction between
hard and soft dependencies.
Genode’s package management does not follow this notion of a “package”. Instead
of subsuming all deliverable content under one term, we distinguish different kinds of
content, each in a tailored and simple form. To avoid a clash with the
common meaning of a “package”, we speak of “archives” as the basic unit of delivery.
The following subsections introduce the different categories. Archives are named with
their version as suffix, appended via a slash. The suffix is maintained by the author of
the archive. The recommended naming scheme is the use of the release date as version
suffix, e. g., report_rom/2017-05-14.

Raw-data archive A raw-data archive contains arbitrary data that is - in contrast to
executable binaries - independent from the processor architecture. Examples are con-
figuration data, game assets, images, or fonts. The content of raw-data archives is ex-
pected to be consumed by components at runtime. It is not relevant for the build pro-
cess of executable binaries. Each raw-data archive contains merely a collection of data
files. There is no meta data.

API archive An API archive has the structure of a Genode source-code repository. It
may contain all the typical content of such a source-code repository such as header files
(in the include/ subdirectory), source codes (in the src/ subdirectory), library-description
files (in the lib/mk/ subdirectory), or ABI symbols (lib/symbols/ subdirectory). At the top
level, a LICENSE file is expected that clarifies the license of the contained source code.
There is no meta data contained in an API archive.
An API archive is meant to provide ingredients for building components. The canon-
ical example is the public programming interface of a library (header files) and the
library’s binary interface in the form of an ABI-symbols file. One API archive may con-
tain the interfaces of multiple libraries. For example, the interfaces of libc and libm
may be contained in a single “libc” API archive because they are closely related to each
other. Conversely, an API archive may contain a single header file only. The granularity
of those archives may vary. But they have in common that they are used at build time
only, not at runtime.

Source archive Like an API archive, a source archive has the structure of a Genode
source-tree repository and is expected to contain all the typical content of such a source
repository along with a LICENSE file. But unlike an API archive, it contains descrip-
tions of actual build targets in the form of Genode’s usual target.mk files.
In addition to the source code, a source archive contains a file called used_apis,
which contains a list of API-archive names with each name on a separate line. For
example, the used_apis file of the report_rom source archive looks as follows:

base/2017-05-14
os/2017-05-13
report_session/2017-05-13

The used_apis file declares the APIs needed to incorporate into the build process
when building the source archive. Hence, they represent build-time dependencies on the
specific API versions.
A source archive may be equipped with a top-level file called api containing the
name of exactly one API archive. If present, it declares that the source archive imple-
ments the specified API. For example, the libc/2017-05-14 source archive contains
the actual source code of the libc and libm as well as an api file with the content
libc/2017-04-13. The latter refers to the API implemented by this version of the libc
source package (note the differing versions of the API and source archives).

Binary archive A binary archive contains the build result of the equally-named
source archive when built for a particular architecture. That is, all files that would ap-
pear in the <build-dir>/bin/ subdirectory when building all targets present in the source
archive. There is no meta data present in a binary archive.
A binary archive is created out of the content of its corresponding source archive and
all API archives listed in the source archive’s used_apis file. Note that since a binary
archive depends on only one source archive, which has no further dependencies, all
binary archives can be built independently from each other. For example, a libc-using
application needs the source code of the application as well as the libc’s API archive
(the libc’s header file and ABI) but it does not need the actual libc library to be present.

Package archive A package archive contains an archives file with a list of archive
names that belong together at runtime. Each listed archive appears on a separate
line. For example, the archives file of the package archive for the window manager
wm/2018-02-26 looks as follows:

genodelabs/raw/wm/2018-02-14
genodelabs/src/wm/2018-02-26
genodelabs/src/report_rom/2018-02-26
genodelabs/src/decorator/2018-02-26
genodelabs/src/floating_window_layouter/2018-02-26

In contrast to the list of used_apis of a source archive, the content of the archives
file denotes the origin of the respective archives (“genodelabs”), the archive type, fol-
lowed by the versioned name of the archive.
An archives file may specify raw archives, source archives, or package archives (as
type pkg). It thereby allows the expression of runtime dependencies. If a package
archive lists another package archive, it inherits the content of the listed archive. This
way, a new package archive may easily customize an existing package archive.
A package archive does not specify binary archives directly as they differ between
architectures and are already referenced by the source archives.
In addition to an archives file, a package archive is expected to contain a README file
explaining the purpose of the collection.

5.5.2. Depot structure

Archives are stored within a directory tree called depot/. The depot is structured as
follows:

<user>/src/<name>/<version>/
<user>/api/<name>/<version>/
<user>/raw/<name>/<version>/
<user>/pkg/<name>/<version>/
<user>/bin/<arch>/<src-name>/<src-version>/

The <user> stands for the origin of the contained archives. For example, the offi-
cial archives provided by Genode Labs reside in a genodelabs/ subdirectory. Subsuming
archives in a subdirectory that corresponds to their origin (user) serves two pur-
poses. First, it provides a user-local name space for versioning archives. E.g., there
might be two versions of a nitpicker/2017-04-15 source archive, one by “genodelabs”
and one by “nfeske”. However, since each version resides in its origin’s sub-
directory, version-naming conflicts between different origins cannot happen. Second,
by allowing multiple archive origins in the depot side-by-side, package archives may
incorporate archives of different origins, which fosters the goal of a federalistic devel-
opment, where contributions of different origins can be easily combined.
The actual archives are stored in the subdirectories named after the archive types
(raw, api, src, bin, pkg). Archives contained in the bin/ subdirectories are further
subdivided into the various architectures (like x86_64 or arm_v7a).

5.5.3. Depot management

The tools for managing the depot content reside under the tool/depot/ directory. When
invoked without arguments, each tool prints a brief description of the tool and its ar-
guments.
Unless stated otherwise, the tools are able to consume any number of archives as
arguments. By default, they perform their work sequentially. This can be changed
by the -j<N> argument, where <N> denotes the desired level of parallelization. For
example, by specifying -j4 to the tool/depot/build tool, four concurrent jobs are executed
during the creation of binary archives.

Downloading archives The depot can be populated with archives in two ways, either
by creating the content from locally available source codes as explained by Section 5.5.4,
or by downloading ready-to-use archives from a web server.
In order to download archives originating from a specific user, the download tool
expects user-specific information to be defined at <repo>/sculpt/depot/<user>/ where
<repo> can be any subdirectory under <genode-dir>/repos/. For reference, the informa-
tion for the official “genodelabs” depot user is located at gems/sculpt/depot/genodelabs/.

<user>/pubkey
<user>/download

pubkey contains the public key of the GPG key pair used by the creator (aka “user”)
of the to-be-downloaded archives for signing the archives. The file contains the
ASCII-armored version of the public key.

download contains the base URL of the web server from which to fetch archives. The
web server is expected to mirror the structure of the depot. That is, the base
URL is followed by a subdirectory for the user, which contains the archive-type-
specific subdirectories.

If both the public key and the download locations are defined, the download tool can
be used as follows:

./tool/depot/download genodelabs/src/zlib/2022-02-27

The tool automatically downloads the specified archives and their dependencies. For
example, as the zlib depends on the libc API, the libc API archive is downloaded as well.
All archive types are accepted as arguments including binary and package archives.
Furthermore, it is possible to download all binary archives referenced by a package
archive. For example, the following command downloads the window-manager (wm)
package archive, including all binary archives, for the 64-bit x86 architecture. Down-
loaded binary archives are always accompanied with their corresponding source and
used API archives.

./tool/depot/download genodelabs/pkg/x86_64/wm/2022-04-12

Archive content is not downloaded directly to the depot. Instead, the individual
archives and signature files are downloaded to a quarantine area in the form of a public/
directory located in the root of Genode’s source tree. As its name suggests, the pub-
lic/ directory contains data that is imported from or to-be exported to the public. The
download tool populates it with the downloaded archives in their compressed form
accompanied with their signatures.
The compressed archives are not extracted before their signature is checked against
the public key defined at depot/<user>/pubkey. If however the signature is valid, the
archive content is imported to the target destination within the depot. This procedure
ensures that depot content - whenever downloaded - is blessed by the cryptographic
signature of its creator.

Building binary archives from source archives With the depot populated with
source and API archives, one can use the tool/depot/build tool to produce binary archives.
The arguments have the form <user>/bin/<arch>/<src-name> where <arch>
stands for the targeted CPU architecture. For example, the following command builds
the zlib library for the 64-bit x86 architecture. It executes four concurrent jobs during
the build process.

./tool/depot/build genodelabs/bin/x86_64/zlib/2022-02-27 -j4

Note that the command expects a specific version of the source archive as argument.
The depot may contain several versions. So the user has to decide which one to build.
After the tool is finished, the freshly built binary archive can be found in the depot
within the genodelabs/bin/<arch>/<src>/<version>/ subdirectory. Only the final result of
the build process is preserved. In the example above, that would be the zlib.lib.so library.

For debugging purposes, it might be interesting to inspect the intermediate state of
the build. This is possible by adding KEEP_BUILD_DIR=1 as argument to the build
command. The binary’s intermediate build directory can be found beside the binary
archive’s location, named with a .build suffix.
By default, the build tool won’t attempt to rebuild a binary archive that is already
present in the depot. However, it is possible to force a rebuild via the REBUILD=1 argu-
ment.

Publishing archives Archives located in the depot can be conveniently made avail-
able to the public using the tool/depot/publish tool. Given an archive path, the tool takes
care of determining all archives that are implicitly needed by the specified one, wrap-
ping the archive’s content into compressed tar archives, and signing those.
As a precondition, the tool requires you to possess the private key that matches the
sculpt/depot/<you>/pubkey file as found within one of the available repositories. The key
pair should be present in the key ring of your GNU privacy guard.
To publish archives, one needs to provide the specific version to publish. For exam-
ple:

./tool/depot/publish <you>/pkg/x86_64/wm/2022-04-12

To accommodate the common case of publishing the current version of the source
tree, there exists the following shortcut:

./tool/depot/publish_current <you>/pkg/x86_64/wm

The command checks that the specified archive and all dependencies are present in
the depot. It then proceeds with the archiving and signing operations. For the latter,
the pass phrase for your private key will be requested. The publish tool outputs the
information about the processed archives, e. g.:

publish /.../public/<you>/api/framebuffer_session/2020-06-28.tar.xz
publish /.../public/<you>/api/gems/2022-02-14.tar.xz
publish /.../public/<you>/api/gui_session/2020-06-28.tar.xz
publish /.../public/<you>/api/input_session/2022-02-14.tar.xz
publish /.../public/<you>/api/os/2022-04-12.tar.xz
publish /.../public/<you>/api/report_session/2020-03-25.tar.xz
publish /.../public/<you>/api/sandbox/2021-06-24.tar.xz
publish /.../public/<you>/api/timer_session/2021-04-19.tar.xz
publish /.../public/<you>/bin/x86_64/init/2022-04-12.tar.xz
publish /.../public/<you>/bin/x86_64/report_rom/2022-04-12.tar.xz
publish /.../public/<you>/bin/x86_64/wm/2022-04-12.tar.xz
publish /.../public/<you>/pkg/wm/2022-04-12.tar.xz
publish /.../public/<you>/raw/wm/2020-06-21.tar.xz
publish /.../public/<you>/src/init/2022-04-12.tar.xz
publish /.../public/<you>/src/report_rom/2022-04-12.tar.xz
publish /.../public/<you>/src/wm/2022-04-12.tar.xz

According to the output, the tool populates a directory called public/ at the root of
the Genode source tree with the to-be-published archives. The content of the public/
directory is now ready to be copied to a web server, e. g., by using rsync.

5.5.4. Automated extraction of archives from the source tree

Genode users are expected to populate their local depot with content obtained via
the tool/depot/download tool. However, Genode developers need a way to create depot
archives locally in order to make them available to users. Thanks to the tool/depot/extract
tool, the assembly of archives does not need to be a manual process. Instead, archives
can be conveniently generated out of the source codes present in the Genode source
tree and the contrib/ directory.
However, the granularity of splitting source code into archives, the definition of what
a particular API entails, and the relationship between archives must be augmented by
the archive creator as this kind of information is not present in the source tree as is.
This is where so-called “archive recipes” enter the picture. An archive recipe defines
the content of an archive. Such recipes can be located at a recipes/ subdirectory of any
source-code repository, similar to how port descriptions and run scripts are organized.
Each recipes/ directory contains subdirectories for the archive types, which, in turn, con-
tain a directory for each archive. The latter is called a recipe directory.

Recipe directory The recipe directory is named after the archive omitting the archive
version and contains at least one file named hash. This file defines the version of the
archive along with a hash value of the archive’s content separated by a space character.
By tying the version name to a particular hash value, the extract tool is able to detect the
appropriate points in time whenever the version should be increased due to a change
of the archive’s content.

API, source, and raw-data archive recipes Recipe directories for API, source, or
raw-data archives contain a content.mk file that defines the archive’s content in the form
of make rules. The content.mk file is executed from the archive’s location within the
depot. Hence, the contained rules can refer to archive-relative files as targets. The first
(default) rule of the content.mk file is executed with a customized make environment
(a sketch follows the list below):

GENODE_DIR A variable that holds the path to the root of the Genode source tree,

REP_DIR A variable with the path to the source code repository where the recipe is
located

port_dir A make function that returns the directory of a port within the contrib/ di-
rectory. The function expects the location of the corresponding port file as argu-
ment, for example, the zlib recipe residing in the libports/ repository may specify
$(REP_DIR)/ports/zlib to access the 3rd-party zlib source code.
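
As an illustration, a content.mk of a source or raw-data archive that mirrors files from
a port may look like the following sketch; the zlib port, the mirrored paths, and the
LICENSE source file are examples:

content: include LICENSE

PORT_DIR := $(call port_dir,$(REP_DIR)/ports/zlib)

include:
	cp -r $(PORT_DIR)/include $@

LICENSE:
	cp $(PORT_DIR)/src/lib/zlib/README $@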

Source archive recipes contain simplified versions of the used_apis and (for libraries)
api files as found in the archives. In contrast to the depot’s counterparts of these files,
which contain version-suffixed names, the files contained in recipe directories omit the
version suffix. This is possible because the extract tool always extracts the current ver-
sion of a given archive from the source tree. This current version is already defined in
the corresponding recipe directory.

Package-archive recipes The recipe directory for a package archive contains the ver-
batim content of the to-be-created package archive except for the archives file. All other
files are copied verbatim to the archive. The content of the recipe’s archives file may
omit the version information from the listed ingredients. Furthermore, the user part of
each entry can be left blank by using _ as a wildcard. When generating the package
archive from the recipe, the extract tool will replace this wildcard with the user that
creates the archive.

5.5.5. Convenience front-end to the extract, build tools

For developers, the work flow of interacting with the depot is most often the combi-
nation of the extract and build tools, whereby the latter expects concrete version names
as arguments. The create tool accelerates this common usage pattern by allowing the
user to omit the version names. Operations implicitly refer to the current version of the
archives as defined in the recipes.
Furthermore, the create tool is able to manage version updates for the developer. If
invoked with the argument UPDATE_VERSIONS=1, it automatically updates hash files
of the involved recipes by taking the current date as version name. This is a valu-
able assistance in situations where a commonly used API changes. In this case, the
versions of the API and all dependent archives must be increased, which would be a
labour-intensive task otherwise. If the depot already contains an archive of the current
version, the create tools won’t re-create the depot archive by default. Local modifica-
tions of the source code in the repository do not automatically result in a new archive.
To ensure that the depot archive is current, one can specify FORCE=1 when executing
the create tool. With this argument, existing depot archives are replaced by freshly
extracted ones and version updates are detected. When specified for binary archives,
FORCE=1 normally implies REBUILD=1. To prevent the superfluous rebuild of binary
archives whose source versions remain unchanged, FORCE=1 can be combined with the
argument REBUILD=.
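
For example, the following hypothetical invocation extracts the current version of the
wm package archive and its dependencies from the source tree, builds the missing
binary archives with four concurrent jobs, and updates recipe-hash files where needed:

./tool/depot/create <you>/pkg/x86_64/wm UPDATE_VERSIONS=1 -j4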

5.5.6. Accessing depot content from run scripts

The depot tools are not meant to replace the run tool but rather to complement it. When
both tools are combined, the run tool implicitly refers to “current” archive versions as
defined for the archive’s corresponding recipes. This way, the regular run-tool work
flow can be maintained while attaining a productivity boost by fetching content from
the depot instead of building it.
Run scripts can use the import_from_depot function to incorporate archive content
from the depot into a scenario. It must be called after the create_boot_directory
function and takes any number of pkg, src, or raw archives as arguments. An archive
is specified as depot-relative path of the form <user>/<type>/<name>. Run scripts may
call import_from_depot repeatedly. Each argument can refer to a specific version of
an archive or just the version-less archive name. In the latter case, the current version
(as defined by a corresponding archive recipe in the source tree) is used.
If a src archive is specified, the run tool integrates the content of the corresponding
binary archive into the scenario. The binary archives are selected according to the spec
values as defined for the build directory.
The following excerpt of a run script incorporates the content of several binary
archives into a system scenario. The base_src function is provided by the run tool and
returns the name of an archive with the kernel-specific ingredients. It depends on the
KERNEL and BOARD definition in the build directory.

import_from_depot [depot_user]/src/[base_src] \
                  [depot_user]/src/report_rom \
                  [depot_user]/src/fs_rom \
                  [depot_user]/src/vfs \
                  [depot_user]/src/init


The depot_user function returns the name of the depot sub directory from where
the archives should be obtained. It returns “genodelabs” by default. This default can be
overridden via the --depot-user argument of the run tool. For example, the following
line in the <build-dir>/etc/build.conf file instructs the import_from_depot call above to
obtain the depot content from depot/test/.

RUN_OPT += --depot-user test

Automated depot management When using the import_from_depot mechanism of
the run tool, one frequently encounters a situation where the depot lacks a particular
archive. Whenever the run tool detects such a situation, it prompts the user to man-
ually curate the depot content via the tool/depot/create tool. The need for such manual
steps negatively interferes with the development workflow. The right manual steps
are sometimes not straight-forward to find, in particular after switching between Git
branches.
To relieve the developer from this uncreative manual labor, the run tool provides the
option -depot-auto-update for managing the depot automatically according to the
needs of the executed run script. To enable this option, use the following line in the
build configuration:

RUN_OPT += --depot-auto-update

If enabled, the run tool automatically invokes the right depot-management com-
mands to populate the depot with the required archives, and to ensure the consistency
of the depot content with the current version of the source tree. The feature comes at the
price of a delay when executing the run script because the consistency check involves
the extraction of all used source archives from the source tree. In regular run scripts,
this delay is barely noticeable. Only when working with the run script of a large system
may it be better to leave the depot auto-update disabled.
Please note that the use of the automated depot update may result in version updates
of the corresponding depot recipes in the source tree (recipe hash files). It is a good
practice to review and commit those hash files once the local changes in the source tree
have reached a good shape.

Selectively overriding depot content While working on a component that is
embedded in a complex system scenario, the advantages of the run-tool’s work flow and
the depot can easily be combined. The majority of the scenario’s content may come
from the depot via the import_from_depot mechanism. Because fetching content from
the depot sidesteps the build system for those components, the system integration step
becomes very quick. It is still possible to override selected components by freshly built


ones. For example, while working on the graphical terminal component, one may com-
bine the following lines in one run script:

create_boot_directory
...
import_from_depot genodelabs/pkg/terminal
...
build { server/terminal }
build_boot_image { terminal }

Since the pkg/terminal package is imported from the depot, the scenario obtains all
ingredients needed to spawn a graphical terminal such as font and configuration data.
The package also contains the terminal binary. However, as we want to use our freshly
compiled binary instead, we override the terminal with our customized version by
specifying the binary name in the build_boot_image step.
The same approach is convenient for instrumenting low-level parts of the frame-
work while debugging a larger scenario. As the low-level parts reside within the dy-
namic linker, we can explicitly build the dynamic linker lib/ld and integrate the resulting
ld.lib.so binary as boot module:

create_boot_directory
...
import_from_depot genodelabs/src/[base_src]
...
build { lib/ld }
build_boot_image { ld.lib.so }


5.6. Static code analysis

The Clang static analyzer tool can analyze source code in C and C++ projects to find
bugs at compile time:

Clang static analyzer https://clang-analyzer.llvm.org

With this tool enabled, Genode users can check and ensure the quality of Genode com-
ponents. It can be invoked during make invocations and during the creation of pack-
ages.
For the invocation of make within a Genode build directory, the STATIC_ANALYZE
variable on the command line prompts the static analyzer to run next to the actual build
step.

STATIC_ANALYZE=1 make -C build/x86_64 KERNEL=... run/...

For analyzing packages, the wrapper tool tool/depot/static_analyze becomes handy. It
can be combined with the tool/depot/* tools to take effect:

tool/depot/static_analyze tool/depot/create <user>/pkg/...

The results of the static-analyzer tool are generated in the form of HTML pages and
can be inspected afterwards. The following example output showcases a run of the
static analyzer tool:

make: Entering directory ’../genode/build/x86_64’
checking library dependencies...
scan-build: Using ’/usr/lib/llvm-6.0/bin/clang’ for static analysis
...

LINK init
scan-build: 0 bugs found.
scan-build: The analyzer encountered problems on some source files.
scan-build: Preprocessed versions of these sources were deposited in
’/tmp/scan-build-2018-11-28-111203-20081-1/failures’.

This feature is known to work well with Clang 6.0 on Ubuntu 16.04. The steps to
provide the required tools on Linux are as follows.

sudo apt install clang-tools-6.0
cd $HOME/bin
ln -s $(which scan-build-6.0) scan-build


5.7. Git flow

The official Genode Git repository is available at the project’s GitHub site:

GitHub project
https://github.com/genodelabs/genode

5.7.1. Master and staging

The official Git repository has two branches “master” and “staging”.

Master branch The master branch is the recommended branch for users of the frame-
work. It is known to have passed quality tests. The existing history of this branch is
fixed and will never change.

Staging branch The staging branch contains the commits that are scheduled for in-
clusion into the master branch. However, before changes are merged into the master
branch, they are subjected to quality-assurance measures conducted by Genode Labs.
Those measures include the successful building of the framework for all base plat-
forms and the passing of automated tests. After changes enter the staging branch, those
quality-assurance measures may occasionally fail. If so, the changes are successively
refined by a series of fixup commits. Each fixup commit should refer to the commit it is
refining using a commit message as follows:

fixup "<commit message of the refined commit>"

If the fixup is non-trivial, change the “fixup” prefix to “squash” and add a more
elaborative description to the commit message.
Once the staging branch passes the quality-assurance measures, the Genode main-
tainers tidy up the history of the staging branch by merging all fixup commits with
their respective original commit. The resulting commits are then merged on top of the
master branch and the staging branch is reset to the new master branch.
Note that the staging branch is volatile. In contrast to the master branch, its history
is not stable. Hence, it should not be used to base developments on.

Release version The version number of a Genode release refers to the release date.
The two-digit major number corresponds to the last two digits of the year and the two-
digit minor number corresponds to the month. For example, “17.02”.
Each Genode release represents a snapshot of the master branch taken at release time.
It is complemented by the following commits:


• “Release notes for version <version>” containing the release documentation in
  the form of a text file at doc/release_notes,

• “News item for Genode <version>” containing the release announcement as pub-
  lished at the genode.org website,

• “Version: <version>” with the adaptation of the VERSION file.

The latter commit is tagged with the version number. The tag is signed by one of the
mainline developers.

5.7.2. Development practice

Each developer maintains a fork of Genode’s Git repository. To facilitate close collab-
oration with the developer community, it is recommended to host the fork on GitHub.
Open a GitHub account, use GitHub’s web interface to create a new fork, and follow
the steps given by GitHub to fetch the cloned repository to your development machine.
In the following, we refer to the official Genode repository as “genodelabs/genode”.
To conveniently follow the project’s mainline development, it is recommended to reg-
ister the official repository as a “remote” in your Git repository:

git remote add genodelabs https://github.com/genodelabs/genode.git

Once the official repository is known to your clone, you can fetch new official revi-
sions via

git fetch genodelabs

Topic branches As a rule of thumb, every line of development has a corresponding
topic in the issue tracker. This is the place where the developers discuss and review the
ongoing work. Hence, when starting a new line of development, the first step should
be the creation of a new topic.

Issue tracker
https://github.com/genodelabs/genode/issues

The new topic should be accompanied with a short description about the motivation
behind the line of work and the taken approach. The second step is the creation of a
dedicated topic branch in the developer’s fork of Genode’s Git repository.

git checkout -b issue<number> genodelabs/master


The new topic branch should be based on the most current genodelabs/master branch.
This eases the later integration of the topic branch into the mainline development.
While working on a topic branch, it is recommended to commit many small interme-
diate steps. This is useful to keep track of the line of thought during development. This
history is regarded as volatile. That is, it is not set in stone. Hence, you as developer do
not have to spend too much thought on the commits during the actual development.
Once the work on the topic is completed and the topic branch is going to get inte-
grated into the mainline development, the developer curates the topic-branch history so
that a short and well-arranged sequence of commits remains. This step is usually per-
formed by interactively editing the topic-branch history via the git rebase -i com-
mand. In many cases, the entire topic branch can be squashed into a single commit. The
goal behind this curating step is to let the mainline history document the progress at a
level of detail that is meaningful for the users of the framework. The mainline history
should satisfy the following:

• The relationship of a commit with an issue at the issue tracker should be visible.
For this reason, GitHub’s annotations “Issue #n” and “Fixed #n” are added to the
commit messages.

• Revisiting the history between Genode releases should clearly reveal the changes
that potentially interest the users. I.e., when writing the quarterly release notes,
the Genode developers go through the history and base the release-notes docu-
mentation on the information contained in the commit messages. This works best
if each topic is comprised by a few commits with meaningful descriptions. This
becomes hard if the history contains too many details.

• Each commit should represent a kind of “transaction” that can be reviewed in-
dependently without knowing too much context. This is hardly possible if inter-
mediate steps that subsequently touch the same code are present as individual
commits.

• It should be easy to selectively revert individual topics/features using git revert
  (e. g., when trouble-shooting). This is simple when each topic is represented by
  one or just a few commits.

Coding conventions Genode’s source code follows time-tested conventions regard-
ing the coding style and code patterns, which are important to follow. The coding style
is described in the following document:

Coding-style Guidelines
https://genode.org/documentation/developer-resources/coding_style


Writing a commit message Commit messages should adhere to the following conven-
tion. The first line summarizes the commit using not more than 50 characters. This line
will be displayed by various tools. So it should express the basic topic and, if applicable,
refer to an issue. For example:

Add sanity checks in tool/tool_chain, fix #62

If the patch refers to an existing issue, add a reference to the corresponding issue. If
not, please consider opening an issue first. If the patch is supposed to close an existing
issue, add this information using GitHub’s conventions: e. g., by stating “Fix #45” in
your commit message, the issue will be closed automatically; by stating “Issue #45”,
the commit will be displayed in the stream of discussion of the corresponding issue.
After a blank line, a description of the patch follows. The description should consider
the following questions:

• Why is the patch needed?

• How does the patch achieve the goal?

• What are known consequences of this patch? Will it break API compatibility, or
produce a follow-up issue?

Reconsider the documentation related to your patch: If the commit message contains
important information not present in the source code, this information is better placed
into the code or the accompanying documentation (e. g., in the form of a README
file).

6. System configuration

There are manifold principal approaches to configure different aspects of an operating
system and the applications running on top. At the lowest level, there exists the oppor-
tunity to pass configuration information to the boot loader. This information may be
evaluated directly by the boot loader or passed to the booted system. As an example
for the former, some boot loaders allow for setting up a graphics mode depending on
its configuration. Hence, the graphics mode to be used by the OS could be defined right
at this early stage of booting. More prominent, however, is the mere passing of con-
figuration information to the booted OS, e. g., in the form of a kernel command line or
as command-line arguments to boot modules. The OS interprets boot-loader-provided
data structures (i. e., multiboot info structures) to obtain such information. Most ker-
nels interpret certain configuration arguments passed via this mechanism. At the OS-
initialization level, before any drivers are functioning, the OS behavior is typically gov-
erned by configuration information provided along with the kernel image, i. e., an ini-
tial file-system image (initrd). On Linux-based systems, this information comes in the
form of configuration files and init scripts located at well-known locations within the
initial file-system image. Higher up the software stack, configuration becomes an even
more diverse topic. For instance, the runtime behavior of a GNU/Linux-based system is defined
by a conglomerate of configuration files, daemons and their respective command-line
arguments, environment variables, collections of symlinks, and plenty of heuristics.
The diversity and complexity of configuration mechanisms, however, is problematic
for high-assurance computing. To attain a high level of assurance, Genode’s architec-
ture must be complemented by a low-complexity yet scalable configuration concept.
The design of this concept takes the following considerations into account:

Uniformity across platforms To be applicable across a variety of kernels and hard-
ware platforms, the configuration mechanism must not rely on a particular kernel
or boot loader. Even though boot loaders for x86-based machines usually support
the multiboot specification and thereby the ability to supplement boot modules
with additional command lines, boot loaders on ARM-based platforms generally
lack this ability. Furthermore, even if a multiboot compliant boot loader is used,
the kernel - once started - must provide a way to reflect the boot information to
the system on top, which is not the case for most microkernels.

Low complexity The configuration mechanism is an intrinsic part of each component.
Hence, it affects the trusted computing base of every Genode-based system. For
this reason, the mechanism must be easy to understand and implementable with-
out the need for complex underlying OS infrastructure. As a negative example,
the provision of configuration files via a file system would require each Genode-
based system to support the notion of a file system and to define the naming of
configuration files.

Expressiveness Passing configuration information as command-line arguments to
components at their creation time seems like a natural way to avoid the complex-
ity of a file-based configuration mechanism. However, whereas command-line
arguments are the tried and tested way for supplying program arguments in a
concise way, the expressiveness of the approach is limited. In particular, it is
ill-suited for expressing structured information as often found in configurations.
Being a component-based system, Genode requires a way to express relationships
between components, which lends itself to the use of a structural representation.

Common syntax The requirement of a low-complexity mechanism mandates a com-
mon syntax across components. Otherwise, each component would need to come
with a custom parser. Each of those parsers would eventually inflate the com-
plexity of the trusted computing base. In contrast, a common syntax that is both
expressive and simple to parse helps to avoid such redundancies by using a single
parser implementation across all components.

Least privilege Being the guiding motive behind Genode’s architecture, the principle
of least privilege needs to be applied to the access of configuration information.
Each component needs to be able to access its own configuration but must not
observe configuration information concerning unrelated components. A system-
global registry of configurations or even a global namespace of keys for such a
database would violate this principle.

Accommodation of dynamic workloads Supplying configuration information at the
construction time of a component is not sufficient for long-living components,
whose behavior might need to be adapted at runtime. For example, the assign-
ment of resources to the clients of a resource multiplexer might change over the
lifetime of the resource multiplexer. Hence, the configuration concept should pro-
vide a means to update the configuration information of a component after it has
been constructed.


<config>
  <parent-provides> ... </parent-provides>
  <default-route> ... </default-route>
  ...
  <start name="nitpicker" caps="100">
    ...
  </start>
  <start name="launchpad" caps="2000">
    ...
    <config>
      <launcher name="Virtualbox">
        <binary name="init"/>
        <config>
          <parent-provides> ... </parent-provides>
          <default-route>
            <any-service> <any-child/> <parent/> </any-service>
          </default-route>
          <start name="virtualbox" caps="1000">
            <resource name="RAM" quantum="1G"/>
            <config vbox_file="test.vbox" vm_name="TestVM">
              ...
            </config>
          </start>
        </config>
      </launcher>
    </config>
  </start>
</config>

Figure 43: Nested system configuration

6.1. Nested configuration concept

Genode’s configuration concept is based on the ROM session interface described in Sec-
tion 4.5.1. In contrast to a file-system interface, the ROM session interface is extremely
simple. The client of a ROM service specifies the requested ROM module by its name as
known by the client. There is neither a way to query a list of available ROM modules,
nor are ROM modules organized in a hierarchic name space.
The ROM session interface is implemented by core’s ROM service to make boot mod-
ules available to other components. Those boot modules comprise the executable bina-
ries of the init component as well as those of the components created by init. Further-
more, a ROM module called “config” contains the configuration of the init process in
XML format. To obtain its configuration, init requests a ROM session for the ROM
module “config” from its parent, which is core. Figure 43 shows an example of such a
config ROM module.
The config ROM module uses XML as syntax, which supports the expression of arbi-
trary structural data while being simple to parse. Indeed, Genode’s XML parser comes in
the form of a single header file with less than 600 lines of code. Init’s configuration is
contained within a single <config> node.

Figure 44: Successive interception of “config” ROM requests

Each component started by init obtains its configuration by requesting a ROM mod-
ule named “config” from its parent. Init responds to this request by handing out a
locally-provided ROM session. Instead of handing out the “config” ROM module as
obtained from core, it creates a new dataspace that solely contains the portion of init’s
config ROM module that refers to the respective child. Analogously to init’s configu-
ration, each child’s configuration has the form of a single <config> node. This works
recursively. From each component’s perspective, including the init component, the
mechanism for obtaining its configuration is identical – it obtains a ROM session for
a ROM module named “config” from its parent. The parent interposes the ROM ses-
sion request as described in Section 4.7.3. Figure 44 shows the successive interposing
of “config” ROM requests according to the example configuration given in Figure 43.
At each level, the information structure within the <config> node can be different. Be-
sides following the convention that a configuration has the form of a single <config>
node, each component can introduce arbitrary custom tags and attributes.
Besides being simple, the use of the ROM session interface for supplying configu-
ration information has the benefit of supporting dynamic configuration updates over
the lifetime of the config ROM session. Section 4.5.1 describes the update protocol be-
tween client and server of a ROM session. This way, the configuration of long-living
components can be dynamically changed.


6.2. The init component

The init component plays a special role within Genode’s component tree. It gets started
directly by core, gets assigned all physical resources, and controls the execution of all
subsequent component nodes, which can be further instances of init. Init’s policy is
driven by an XML-based configuration, which declares a number of children, their re-
lationships, and resource assignments. The XML schema definition for init’s configura-
tion is provided at repos/os/src/init/config.xsd.

6.2.1. Session routing

At the parent-child interface, there are two operations that are subject to policy deci-
sions of the parent: the child announcing a service and the child requesting a service.
If a child announces a service, it is up to the parent to decide if and how to make this
service accessible to its other children. When a child requests a service, the parent
may deny the session request, delegate the request to its own parent, implement the
requested service locally, or open a session at one of its other children. This decision
may depend on the service requested or the session-construction arguments provided
by the child. Apart from assigning resources to children, the central element of the pol-
icy implemented in the parent is a set of rules to route session requests. Therefore, init’s
configuration concept is laid out around child components and the routing of session
requests originating from those components. The mechanism is best illustrated by an
example:


<config>
  <parent-provides>
    <service name="PD"/>
    <service name="ROM"/>
    <service name="CPU"/>
    <service name="LOG"/>
  </parent-provides>
  <start name="timer" caps="100">
    <resource name="RAM" quantum="1M"/>
    <provides> <service name="Timer"/> </provides>
    <route>
      <service name="PD">  <parent/> </service>
      <service name="ROM"> <parent/> </service>
      <service name="CPU"> <parent/> </service>
      <service name="LOG"> <parent/> </service>
    </route>
  </start>
  <start name="test-timer" caps="200">
    <resource name="RAM" quantum="1M"/>
    <route>
      <service name="Timer"> <child name="timer"/> </service>
      <service name="PD">  <parent/> </service>
      <service name="ROM"> <parent/> </service>
      <service name="CPU"> <parent/> </service>
      <service name="LOG"> <parent/> </service>
    </route>
  </start>
</config>

First, there is the declaration of services provided by the parent of the configured
init instance. In this case, we declare that the parent provides the PD, ROM, CPU, and
LOG services. For
each child to start, there is a <start> node describing the assigned RAM and capa-
bility budget, declaring services provided by the child, and holding a routing table for
session requests originating from the child. The first child is called “timer” and imple-
ments the “Timer” service. The second component called “test-timer” is a client of the
timer service. In its routing table, we see that requests for “Timer” sessions are routed
to the “timer” child whereas requests for core’s services are routed to init’s parent.
Per-child service routing rules provide a flexible way to express arbitrary client-server
relationships. For example, service requests may be transparently mediated through
special policy components acting upon session-construction arguments. There might
be multiple children implementing the same service, each targeted by different routing
tables. If there exists no valid route to a requested service, the service is denied. In the
example above, the routing tables act effectively as a white list of services the child is
allowed to use.


Routing based on session labels Access-control policies in Genode systems are
based on session labels. When a server receives a new session request, the session label
is passed along with the request.
A session label is a string that is assembled by the components that are involved with
routing the session request from the client along the branches of the component tree to
the server. The client may specify the least significant part of the label by itself. This
part gives the parent a hint for routing the request. For example, a client may create two
file-system sessions, one labeled with “home” and one labeled with “bin”. The parent
may take this information into account and route the individual requests to different
file-system servers. The label is successively superseded (prefixed) by additional parts
along the chain of components on the route of the session request. The first part of the
label is the most significant part as it is imposed by the component in the intermediate
proximity of the server. The last part is the least trusted part of the label because it
originated from the client. Once the session request arrives at the server, the server
takes the session label as the key to select a server-side policy as described in Section
4.6.2.
In most cases, routing decisions are simply based on the type of the requested ses-
sions. However, by equipping <service> nodes with the following attributes, it is
possible to take session labels as a criterion for the routing of session requests into ac-
count.

label="<string>" The session label must perfectly match the specified string.

label_prefix="<string>" The first part of the label must match the specified string.

label_suffix="<string>" The end of the label must match the specified string.

unscoped_label="<string>" The session label including the child’s name prefix
must perfectly match the specified string. In contrast to the label attribute,
which refers to the child-defined label, the unscoped_label can refer to the
child’s environment sessions, which have no client-defined label because they are
initiated by init itself.

label_last="<string>" The part after the last "→" delimiter must match the speci-
fied string. This part usually refers to a requested resource such as the name of a
ROM module. If no delimiter is present, the label must be an exact match.

If no attributes are present, the route matches. The attributes can be combined. If any of
the specified attributes mismatch, the route is neglected. If multiple <service> nodes
match in init’s routing configuration, the first matching rule is taken. So the order of
the nodes is important.
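For illustration, the “home” and “bin” sessions mentioned above might be routed to
two different file-system servers as follows; the child names home_fs and bin_fs are
placeholders:

<route>
  <service name="File_system" label="home"> <child name="home_fs"/> </service>
  <service name="File_system" label="bin">  <child name="bin_fs"/>  </service>
  ...
</route>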


Wildcards In practice, usage scenarios become more complex than the basic example,
increasing the size of routing tables. Furthermore, in many practical cases, multiple
children may use the same set of services and require duplicated routing tables within
the configuration. In particular during development, the elaborative specification of
routing tables tend to become an inconvenience. To alleviate this problem, there are
two mechanisms, namely wildcards and a default route. Instead of specifying a list of
individual service routes targeting the same destination, the wildcard <any-service>
becomes handy. For example, instead of specifying

<route>
  <service name="ROM"> <parent/> </service>
  <service name="LOG"> <parent/> </service>
  <service name="PD">  <parent/> </service>
  <service name="CPU"> <parent/> </service>
</route>

the following shortform can be used:

<route>
  <any-service> <parent/> </any-service>
</route>

The latter version is not as strict as the first one because it permits the child to create
sessions at the parent, which were not white-listed in the elaborate version. Therefore,
the use of wildcards is discouraged for configuring untrusted components. Wildcards
and explicit routes may be combined as illustrated by the following example:

<route>
  <service name="LOG"> <child name="nitlog"/> </service>
  <any-service> <parent/> </any-service>
</route>

The routing table is processed starting with the first entry. If the route matches the
service request, it is taken, otherwise the remaining routing-table entries are visited.
This way, the explicit service route of “LOG” sessions to the “nitlog” child shadows the
LOG service provided by the parent.
To allow a child to use services provided by arbitrary other children, there is a further
wildcard called <any-child>. Using this wildcard, such a policy can be expressed as
follows:

<route>
  <any-service> <parent/> </any-service>
  <any-service> <any-child/> </any-service>
</route>


This rule would delegate all session requests referring to one of the parent’s services to
the parent. If no parent service matches the session request, the request is routed to any
child providing the service. The rule can be further abbreviated to:

<route>
  <any-service> <parent/> <any-child/> </any-service>
</route>

Init detects potential ambiguities caused by multiple children providing the same ser-
vice. In this case, the ambiguity must be resolved using an explicit route preceding the
wildcards.

Default routing To reduce the need to specify the same routing table for many chil-
dren in one configuration, there is a <default-route> mechanism. The default route is
declared within the <config> node and used for each <start> entry with no <route>
node. In particular during development, the default route becomes handy to keep the
configuration tidy and neat.
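For illustration, a default route that lets all children without an explicit <route>
node use the parent’s and siblings’ services might be declared as follows:

<config>
  <default-route>
    <any-service> <parent/> <any-child/> </any-service>
  </default-route>
  ...
</config>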
The combination of explicit routes and wildcards is designed to scale well from be-
ing convenient to use during development towards being highly secure at deployment
time. If only explicit rules are present in the configuration, the permitted relationships
between all processes are explicitly defined and can be easily verified.

Using aliases for component names In complex scenarios, it is sometimes useful
to distinguish the name of a component from its role as a service provider for other
components. For example, a file-system server may be called “ahci-1.ext2.fs” to express
that the component works with an EXT2 file system on partition 1 of an AHCI block
device. Such an encoding is useful to clearly tell multiple file-system servers apart. But
for configuring the routes of multiple file-system clients, the repetition of this level of
detail becomes a burden. This situation calls for an indirection:

<alias name="wwwdata_fs" child="ahci-1.ext2.fs"/>

An <alias> node introduces a new name that can be used as session-routing target
alternatively to the name of the referred-to <start> node. With the alias in place, a
client’s session route can now be written such that the purpose of the targeted server
becomes clear at first glance.

...
<service name="File_system" label="wwwdata">
  <child name="wwwdata_fs"/> </service>
...


The alias not only eases the human interpretation of routing rules during a security
assessment. It also anticipates a flexible adaptation of the scenario for the use of another
file system, which would involve only the change of the single alias while keeping all
client-side session routes unaffected.

6.2.2. Resource assignment

Physical memory budget Each <start> node must be equipped with a declaration
of the amount of RAM assigned to the child via a <resource> sub node.

<resource name="RAM" quantum="1M"/>

If the specified amount exceeds the available resources, the available resources are
assigned almost completely to the child. This makes it possible to assign all remaining
resources to the last child by simply specifying an overly large quantum. Note that this
special case should only be used for static configurations. In dynamic scenarios, com-
ponents can temporarily disappear during the reconfiguration, which temporarily frees
their resources. Once released, however, those temporarily slack resources would end
up in the unsaturated subsystem, preventing the subsequent (re-)spawning of other
subsystems.
Init retains only a small amount of quota for itself, which is used to cover indirect
costs such as a few capabilities created on behalf of the children, or memory used for
buffering configuration data. The preserved amount can be configured as follows:

<config>
  ...
  <resource name="RAM" preserve="1M"/>
  ...
</config>

Capability budget Each component requires a certain amount of capabilities to live.
At startup, several capabilities are created along with the component’s environment
sessions, in particular its PD session. At lifetime, the component consumes capabili-
ties when creating signal handlers or RPC objects. Since the system-global amount of
capabilities is a bounded resource, which depends on the used kernel and the kernel
configuration, Genode subjects the allocation of capabilities to the same rigid regime as
for physical memory. First, the creation of capabilities is restricted by resource quotas
explicitly assigned to components. Second, capability budgets can be traded between
clients and servers such that servers are able to account capability allocations to their
clients.


Each <start> node can be equipped with a caps attribute with the amount of capa-
bilities assigned to the component. As a rule of thumb, the setup costs of a component
are 35 capabilities. Hence, for typical components, an amount of 100 is a practical value.
To alleviate the need to equip each <start> node with the same default value, the init
configuration accepts a default declaration as follows:

<default caps="100"/>

Unless a <start> node is equipped with a custom caps attribute, the default value is
used.
If a component runs out of capabilities, core’s PD service prints a warning to the log.
To observe the consumption of capabilities per component in detail, core’s PD service
is equipped with a diagnostic mode, which can be enabled via the diag attribute in the
target node of init’s routing rules. E.g., the following route enables the diagnostic mode
for the PD session:

<route>
  <service name="PD"> <parent diag="yes"/> </service>
  ...
</route>

With the diag attribute enabled, core prints a log message each time the PD con-
sumes, frees, or transfers its capability budget.

6.2.3. Multiple instantiation of a single ELF binary

Each <start> node requires a unique name attribute. By default, the value of this
attribute is used as ROM module name for obtaining the ELF binary from the parent.
If multiple instances of a component with the same ELF binary are needed, the binary
name can be explicitly specified using a <binary> sub node of the <start> node:

<binary name="filename"/>

This way, a unique child name can be defined independently from the binary name.
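For example, two file-system servers could be instantiated from one hypothetical
“vfs” binary as sketched below; the child names are illustrative:

<start name="vfs_a" caps="100">
  <binary name="vfs"/>
  ...
</start>
<start name="vfs_b" caps="100">
  <binary name="vfs"/>
  ...
</start>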

6.2.4. Session-label rewriting

As explained in section 6.2.1, init routes session requests by taking the requested service
type and the session label into account. The latter may be used by the server as a key for
selecting a policy at the server side. To simplify server-side policies, init supports the
rewriting of session labels in the target node of a matching session route. For example,
an interactive shell (“shell”) may have the following session route for the “home” file
system:


<route>
  <service name="File_system" label="home">
    <child name="vfs"/>
  </service>
  ...
</route>

At the “vfs” file-system server, the label of the file-system session will appear as
“shell → home”. This information may be evaluated by the vfs’s server-side policy.
However, when renaming the shell instance, we’d need to update this server-side pol-
icy.
With the label-rewriting mechanism, the client’s identity can be hidden from the
server. The label can instead represent the role of the client, or a name of a physical
resource. For example, the route could be changed to this:

<route>
  <service name="File_system" label="home">
    <child name="vfs" label="primary_user"/>
  </service>
  ...
</route>

Whenever the vfs receives the session request, it is presented with the label “pri-
mary_user”. The fact that the client is “shell” is not taken into account for the server-
side policy selection.

6.2.5. Nested configuration

Each <start> node can host a <config> sub node. As described in Section 6.1, the
content of this sub node is provided to the child when a ROM session for the module
name “config” is requested. Thereby, arbitrary configuration parameters can be passed
to the child. For example, the following configuration starts the timer-test within an
init instance within another init instance. To show the flexibility of init’s service routing
facility, the “Timer” session of the second-level timer-test child is routed to the timer
service started at the first-level init instance.


<config>
  <parent-provides>
    <service name="LOG"/>
    <service name="ROM"/>
    <service name="CPU"/>
    <service name="PD"/>
  </parent-provides>
  <start name="timer" caps="100">
    <resource name="RAM" quantum="1M"/>
    <provides> <service name="Timer"/> </provides>
    <route>
      <any-service> <parent/> </any-service>
    </route>
  </start>
  <start name="init" caps="1000">
    <resource name="RAM" quantum="10M"/>
    <config>
      <parent-provides>
        <service name="Timer"/>
        <service name="LOG"/>
        <service name="ROM"/>
        <service name="CPU"/>
        <service name="PD"/>
      </parent-provides>
      <start name="test-timer" caps="200">
        <resource name="RAM" quantum="1M"/>
        <route>
          <any-service> <parent/> </any-service>
        </route>
      </start>
    </config>
    <route>
      <service name="Timer"> <child name="timer"/> </service>
      <any-service> <parent/> </any-service>
    </route>
  </start>
</config>

The services ROM, LOG, CPU, and PD are required by the second-level init instance
to create the timer-test component. As illustrated by this example, the use of nested
configurations enables the construction of arbitrarily complex component trees via a
single configuration.

Figure 45: Successive virtualization of CPU affinity spaces by nested instances of init

6.2.6. Configuring components from distinct ROM modules

As an alternative to specifying the component configurations of all <start> nodes via
<config> sub nodes, component configurations may be placed in separate ROM mod-
ules by facilitating the session-label rewriting mechanism described in Section 6.2.4:

<start name="nitpicker">
  <resource name="RAM" quantum="1M"/>
  <route>
    <service name="ROM" label="config">
      <parent label="nitpicker.config"/>
    </service>
    ...
  </route>
  ...
</start>

With this routing rule in place, a ROM session request for the module “config” is
routed to the parent and appears at the parent’s ROM service under the label “nit-
picker.config”.

6.2.7. Assigning subsystems to CPUs

Most multi-processor (MP) systems have topologies that can be represented on a two-
dimensional coordinate system. CPU nodes close to each other are expected to have
closer relationship than distant nodes. In a large MP system, it is natural to assign clus-
ters of closely related nodes to a given workload. As described in Section 3.2, Genode’s
architecture is based on a strictly hierarchic organizational structure. Thereby, it lends
itself to the idea of applying this successive virtualization of resources to the problem
of clustering CPU nodes.
Each component within the component tree has a component-local view on a so-
called affinity space, which is a two-dimensional coordinate space. If the component

182
6.2 The init component

creates a new subsystem, it can assign a portion of its own affinity space to the new
subsystem by imposing a rectangular affinity location to the subsystem’s CPU session.
Figure 45 illustrates the idea.
Following from the expression of affinities as a rectangular location within a component-
local affinity space, the assignment of subsystems to CPU nodes consists of two parts:
the definition of the affinity space dimensions as used for the init instance, and the as-
sociation of subsystems with affinity locations relative to the affinity space. The affinity
space is configured as a sub node of the <config> node. For example, the following
declaration describes an affinity space of 4x2:

<config>
  ...
  <affinity-space width="4" height="2" />
  ...
</config>

Subsystems can be constrained to parts of the affinity space using the <affinity>
sub node of a <start> entry:

<config>
  ...
  <start name="loader">
    <affinity xpos="0" ypos="1" width="2" height="1" />
    ...
  </start>
  ...
</config>

As illustrated by this example, the numbers used in the declarations for this instance
of init are not directly related to physical CPUs. If the machine has merely two cores,
init’s affinity space would be mapped to the range 0,1 of physical CPUs. However, in a
machine with 16x16 CPUs, the loader would obtain 8x8 CPUs with the upper-left CPU
at position (0,8).

6.2.8. Priority support

The number of CPU priorities to be distinguished by init can be specified with the
prio_levels attribute of the <config> node. The value must be a power of two. By
default, no priorities are used. To assign a priority to a child process, a priority value can
be specified as priority attribute of the corresponding <start> node. Valid priority
values lie in the range of -prio_levels + 1 (maximum priority degradation) to 0 (no
priority degradation).
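As an illustrative sketch based on these rules, a configuration with four priority
levels might degrade the priority of a background child as follows; the component
names are placeholders:

<config prio_levels="4">
  ...
  <start name="interactive_app" priority="0">
    ...
  </start>
  <start name="batch_job" priority="-3">
    ...
  </start>
  ...
</config>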


6.2.9. Propagation of exit events

A component can notify its parent about its graceful exit via the exit RPC function of the
parent interface. By default, init responds to such a notification from one of its children
by merely printing a log message but ignores it otherwise. However, there are scenarios
where the exit of a particular child should result in the exit of the entire init component.
To propagate the exit of a child to the parent of init, start nodes can host the optional
sub node <exit> with the attribute propagate set to “yes”.

<config>
  <start name="shell">
    <exit propagate="yes"/>
    ...
  </start>
</config>

The exit value specified by the exiting child is forwarded to init’s parent.

6.2.10. State reporting

When used in a nested fashion, init can be configured to report its internal state in
the form of a “state” report by placing a <report> node into init’s configuration. The
report node accepts the following arguments (with their default values shown):

delay_ms=“100” specifies the number of milliseconds to wait before producing a new
report. This way, many consecutive state changes - as they occur during startup -
do not result in an overly large number of reports but are merged into one final
report.

buffer=“4K” the maximum size of the report in bytes. The attribute accepts the use of
K/M/G as units.

init_ram=“no” if enabled, the report will contain a <ram> node with the memory
statistics of init.

init_caps=“no” if enabled, the report will contain a <caps> node with the capability-
allocation statistics of init.

ids=“no” supplement the children in the report with unique IDs, which may be used
to infer the lifetime of children across configuration updates in the future.

requested=“no” if enabled, the report will contain information about all session re-
quests initiated by the children.


provided=“no” if enabled, the report will contain information about all sessions pro-
vided by all servers.

session_args=“no” level of detail of the session information generated via requested
or provided.

child_ram=“no” if enabled, the report will contain a <ram> node for each child based
on the information obtained from the child’s PD session.

child_caps=“no” if enabled, the report will contain a <caps> node for each child
based on the information obtained from the child’s PD session.
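Putting some of these attributes together, an illustrative report declaration within
init’s configuration might read:

<config>
  ...
  <report delay_ms="500" child_ram="yes" child_caps="yes"/>
  ...
</config>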

Note that the state reporting feature cannot be used for the initial instance of init started
by core. It depends on the “Timer” and “Report” services, which are provided by
higher-level components only.

6.2.11. Init verbosity

To ease debugging, init can be instructed to print diverse status information as LOG
output. To enable the verbose mode, assign the value “yes” to the verbose attribute of
the <config> node.
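For example:

<config verbose="yes">
  ...
</config>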

6.2.12. Service forwarding

In nested scenarios, init is able to act as a server that forwards session requests to its
children. Session requests can be routed depending on the requested service type and
the session label originating from init’s parent.
The feature is configured by one or multiple <service> nodes hosted in init’s
<config> node. The routing policy is selected via the regular server-side policy-
selection mechanism, for example:

<config>
  ...
  <service name="LOG">
    <policy label="shell">
      <child name="terminal_log" label="important"/>
    </policy>
    <default-policy> <child name="nitlog"/> </default-policy>
  </service>
  ...
</config>

Each policy node must have a <child> sub node, which denotes the name of the
server with the name attribute. The optional label attribute defines the session label
presented to the server, analogous to how the rewriting of session labels works in
session routes. If not specified, the client-provided label is presented to the server as is.

Figure 46: Init queries the responsiveness of its children

6.2.13. Component health monitoring

Scenarios where components are known to sometimes fail call for a mechanism that
continuously checks the health of components and reports anomalies. To accommo-
date such use cases, Genode provides a built-in health-monitoring mechanism. Each
component registers a heartbeat signal handler during its initialization at its parent.
This happens under the hood and is thereby transparent to application code. Each
time the signal is triggered by the parent, the default heartbeat handler invokes the
heartbeat_response function at the parent interface and thereby confirms that the
component is still able to respond to external events.
Thanks to this low-level mechanism, the init component is able to monitor its child
components as depicted in Figure 46. A global (for this init instance) heartbeat rate
can be configured via a <heartbeat rate_ms=“1000”/> node at the top level of init’s
configuration. The heartbeat rate can be specified in milliseconds. If configured, init
uses a dedicated timer session for performing health checks periodically. Each com-
ponent that hosts a <heartbeat> node inside its <start> node is monitored. In each
period, init requests heartbeat responses from all monitored children and maintains a
count of outstanding heartbeats for each component. The counter is incremented in
each period and reset whenever the child responds to init’s heartbeat request. When-
ever the number of outstanding heartbeats of a child becomes higher than 1, the child
may be in trouble. Init reports this information in its state report via the new attribute
skipped_heartbeats=“N” where N denotes the number of periods since the child be-
came unresponsive.
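Combining the two declarations, a sketch of a configuration that monitors a single
child - with “shell” as a placeholder name - might look as follows:

<config>
  <heartbeat rate_ms="1000"/>
  ...
  <start name="shell" caps="100">
    <heartbeat/>
    ...
  </start>
  ...
</config>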
Of course, the mechanism won’t deliver 100% accuracy. There may be situations like
long-running calculations where long times of unresponsiveness are expected from a
healthy component. Vice versa, in a multi-threaded application, the crash of a sec-
ondary thread may go undetected if the primary (checked) thread stays responsive.


However, in the majority of cases where a component crashes (page fault, stack over-
flow), gets stuck in a busy loop, produces a deadlock, or throws an unhandled excep-
tion (abort), the mechanism nicely reflects the troublesome situation to the outside.

7. Under the hood

This chapter gives insight into the inner workings of the Genode OS framework. In
particular, it explains how the concepts explained in Chapter 3 are realized on different
kernels and hardware platforms.


7.1. Component-local startup code and linker scripts

All Genode components including core rely on the same startup code, which is roughly
outlined at the end of Section 3.5. This section revisits the required steps in more detail
and refers to the corresponding points in the source code. Furthermore, it provides
background information about the linkage of components, which is closely related to
the startup code.

7.1.1. Linker scripts

Under the hood, the Genode build system uses three different linker scripts located at
repos/base/src/ld/:

genode.ld is used for statically linked components, including core,

genode_dyn.ld is used for dynamically linked components, i. e., components that are
linked against at least one shared library,

genode_rel.ld is used for shared libraries.

Additionally, there exists a special linker script for the dynamic linker (Section 7.6).
Each program image generated by the linker generally consists of three parts, which
appear consecutively in the component’s virtual memory.

1. A read-only “text” part contains sections for code, read-only data, and the list of
global constructors and destructors.
The startup code is placed in a dedicated section .text.crt0, which appears
right at the start of the segment. Thereby the link address of the component is
known to correspond to the ELF entrypoint (the first instruction of the assembly
startup code). This is useful when converting the ELF image of the base-hw ver-
sion of core into a raw binary. Such a raw binary can be loaded directly into the
memory of the target platform without the need for an ELF loader.
The mechanisms for generating the list of constructors and destructors differ be-
tween CPU architectures and are defined by the architecture’s ABI. On x86, the
lists are represented by .ctors.* and .dtors.*. On ARM, the information about
global constructors is represented by .init_array and there is no visible infor-
mation about global destructors.

2. A read-writeable “data” part that is pre-populated with data.

3. A read-writeable “bss” part that is not physically present in the binary but known
to be zero-initialized when the ELF image is loaded.


The link address is not defined in the linker script but specified as linker argument.
The default link address is specified in a platform-specific spec file, e. g., repos/base-
nova/mk/spec/nova.mk for the NOVA platform. Components that need to organize their
virtual address space in a special way (e. g., a virtual machine monitor that co-locates
the guest-physical address space with its virtual address space) may specify link ad-
dresses that differ from the default one by overriding the LD_TEXT_ADDR value.
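As a sketch of this mechanism - assuming the variable can be overridden in a
component’s build-description file - a custom link address might be declared as
follows, with a purely illustrative address value:

LD_TEXT_ADDR := 0x20000000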

ELF entry point As defined at the start of the linker script via the ENTRY directive,
the ELF entrypoint is the function _start. This function is located at the very beginning
of the .text.crt0 section. See the Section 7.1.2 for more details.

Symbols defined by the linker script The following symbols are defined by the
linker script and used by the base framework.

_prog_img_beg, _prog_img_data, _prog_img_end Those symbols mark the start
of the “text” part, the start of the “data” part (the end of the “text” part), and the
end of the “bss” part. They are used by core to exclude those virtual memory
ranges from the core’s virtual-memory allocator (core-region allocator).
_parent_cap, _parent_cap_thread_id, _parent_cap_local_name Those sym-
bols are located at the beginning of the “data” part. During the ELF loading of a
new component, the parent writes information about the parent capability to this
location (the start of the first read-writeable ELF segment). See the corresponding
code in the Loaded_executable constructor in base/src/lib/base/child_process.cc.
The use of the information depends on the base platform. E.g., on a platform
where a capability is represented by a tuple of a global thread ID and an object ID
such as OKL4 and L4ka::Pistachio, the information is taken as verbatim values.
On platforms that fully support capability-based security without the use of any
form of a global name to represent a capability, the information remains unused.
Here, the parent capability is represented by the same known local name in all
components.

Even though the linker scripts are used across all base platforms, they contain a few
platform-specific supplements that are needed to support the respective kernel ABIs.
For example, the definition of the symbol __l4sys_invoke_indirect is needed only
on the Fiasco.OC platform and is unused on the other base platforms. Please refer to
the comments in the linker script for further explanations.

7.1.2. Startup code

The execution of the initial thread of a new component starts at the ELF entry point,
which corresponds to the _start function. This is an assembly function defined in


repos/base/src/lib/startup/spec/<arch>/crt0.s where <arch> is the CPU architecture (x86_32,


x86_64, or ARM).

Assembly startup code The assembly startup code is position-independent code
(PIC). Because the Genode base libraries are linked against both statically-linked and
dynamically linked executables, they have to be compiled as PIC code. To be consistent
with the base libraries, the startup code needs to be position-independent, too.
The code performs the following steps:

1. Saving the initial state of certain CPU registers. Depending on the used kernel,
these registers carry information from the kernel to the core component. More
details about this information are provided by Section 7.3.2. The initial register
values are saved in global variables named _initial_<register>. The global
variables are located in the BSS segment. Note that those variables are used solely
by core.

2. Setting up the initial stack. Before the assembly code can call any higher-level
C function, the stack pointer must be initialized to point to the top of a valid
stack. The initial stack is located in the BSS section and referred to by the symbol
_stack_high. However, having a stack located within the BSS section is danger-
ous. If it overflows (e. g., by declaring large local variables, or by recursive func-
tion calls), the stack would silently overwrite parts of the BSS and DATA sections
located below the lower stack boundary. For prior known code, the stack can be
dimensioned to a reasonable size. But for arbitrary application code, no assump-
tion about the stack usage can be made. For this reason, the initial stack cannot
be used for the entire lifetime of the component. Before any component-specific
code is called, the stack needs to be relocated to another area of the virtual address
space where the lower bound of the stack is guarded by empty pages. When us-
ing such a “real” stack, a stack overflow will produce a page fault, which can be
handled or at least immediately detected. The initial stack is solely used to per-
form the steps required to set up the real stack. Because those steps are the same
for all components, the usage of the initial stack is bounded.

3. Because the startup code is used by statically linked components as well as the dynamic linker, the startup code immediately calls the init_rtld hook function. For regular components, the function does not do anything. The default implementation in init_main_thread.cc at repos/base/src/lib/startup/ is a weak function (see the sketch after this list). The dynamic linker provides a non-weak implementation, which allows the linker to perform initial relocations of itself very early at the dynamic linker’s startup.

4. By calling the init_main_thread function defined in repos/base/src/lib/startup/init_main_thread.cc, the assembly code triggers the execution of all the steps needed for the creation of the real stack. The function is implemented in C++, uses the initial stack, and returns the address of the real stack.

5. With the new stack pointer returned by init_main_thread, the assembly startup
code is able to switch the stack pointer from the initial stack to the real stack. From
this point on, stack overflows cannot easily corrupt any data.

6. With the real stack in place, the assembly code finally passes the control over to
the C++ startup code provided by the _main function.
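The weak-function mechanism used in step 3 can be illustrated with a minimal sketch (simplified, not the verbatim implementation):

  /* default hook: a weak symbol that merely does nothing */
  extern "C" __attribute__((weak)) void init_rtld()
  {
      /* no-op for statically linked components */
  }

The dynamic linker contains a regular (strong) definition of init_rtld. At link time, the strong definition takes precedence over the weak default, so the very same startup code serves both statically linked components and the dynamic linker.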

Initialization of the real stack along with the Genode environment As mentioned above, the assembly code calls the init_main_thread function (located in repos/base/src/lib/startup/init_main_thread.cc) for setting up the real stack for the program. For placing a stack in a dedicated portion of the component’s virtual address space, the function needs to overcome two principal problems:

• It needs to obtain the backing store used for the stack, i. e., allocating a dataspace
from the component’s PD session as initialized by the parent.

• It needs to preserve a portion of its virtual address space for placing the stack and
make the allocated memory visible within this portion.

In order to solve both problems, the function needs to obtain the capability for its PD
session from its parent. This comes down to the need to perform RPC calls. First,
for requesting the PD session capability from the parent, and second, for invoking the
session capability to perform the RAM allocation and region-map attach operations.
The RPC mechanism is based on C++. In particular, the mechanism supports the
propagation of C++ exceptions across RPC interfaces. Hence, before being able to per-
form RPC calls, the program must initialize the C++ runtime including the exception-
handling support. The initialization of the C++ runtime, in turn, requires support for
dynamically allocating memory. Hence, a heap must be available. This chain of depen-
dencies ultimately results in the need to construct the entire Genode environment as a
side effect of initializing the real stack of the program.
During the construction of the Genode environment, the program requests its own
CPU, PD, and LOG sessions from its parent.
With the environment constructed, the program is able to interact with its own PD session and can, in principle, realize the initialization of the real stack. However, instead
of merely allocating a new RAM dataspace and attaching the dataspace to the address
space of the PD session, a so-called stack area is used. The stack area is a secondary re-
gion map that is attached as a dataspace to the component’s address-space region map.
This way, virtual-memory allocations within the stack area can be managed manually.
I.e., the spaces between the stacks of different threads are guaranteed to remain free
from any attached dataspaces. The stack area of a component is created as part of the component’s PD session. The environment initialization code requests its region-map capability via Pd_session::stack_area and attaches it as a managed dataspace to the component’s address space.

Component-dependent startup code With the Genode environment constructed and the initial stack switched to a proper stack located in the stack area, the component-dependent startup code of the Genode::bootstrap_component function in repos/base/src/lib/startup/_main.cc can be executed, which, in turn, initializes the component’s default entrypoint before calling the application-level entry function Component::construct.


7.2. C++ runtime

Genode is implemented in C++ and relies on all C++ features required to use the lan-
guage in its idiomatic way. This includes the use of exceptions and runtime-type infor-
mation.

7.2.1. Rationale behind using exceptions

Compared to return-based error handling as prominently used in C programs, the C++ exception mechanism is much more complex. In particular, it requires the use of a
C++ runtime library that is called as a back-end by the exception handling code and
generated by the compiler. This library contains the functionality needed to unwind
the stack and a mechanism for obtaining runtime type information (RTTI). The C++
runtime libraries that come with common tool chains, in turn, rely on a C library for
performing dynamic memory allocations, string operations, and I/O operations. Con-
sequently, C++ programs that rely on exceptions and RTTI implicitly depend on a C
library. For this reason, the use of those C++ features is widely avoided in low-level operating-system code, which usually does not run in an environment where a complete C library is available.
In principle, C++ can be used without exceptions and RTTI (by passing the argu-
ments -fno-exceptions and -fno-rtti to GCC). However, without those features, it
is hardly possible to use the language as designed.
For example, when the operator new is used, it performs two steps: Allocating the
memory needed to hold the to-be-created object and calling the constructor of the object
with the return value of the allocation as this pointer. In the event that the memory allocation fails, the only way for the allocator to propagate the out-of-memory condition is to throw an exception. If no exception is thrown, the constructor would be called with a null this pointer.
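In standard C++, the two steps performed by the new operator correspond conceptually to the following sketch (the cleanup needed if the constructor itself throws is omitted for brevity):

  #include <new>

  struct T { T() { } };

  T *create()
  {
      void *mem = operator new(sizeof(T)); /* throws std::bad_alloc on failure */
      return ::new (mem) T();              /* construct on the valid memory    */
  }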
Another example is the handling of errors during the construction of an object. The
object construction may consist of several consecutive steps such as the construction of
base classes and aggregated objects. If one of those steps fails, the construction of the
overall object remains incomplete. This condition must be propagated to the code that
issued the object construction. There are two principal approaches:

1. The error condition can be kept as an attribute in the object. After constructing
the object, the user of the object may detect the error condition by requesting the
attribute value. However, this approach is plagued by the following problems.
First, the failure of one step may cause subsequent steps to fail as well. In the
worst case, if the failed step initializes a pointer that is passed to subsequent steps,
the subsequent steps may use an uninitialized pointer. Consequently, the error
condition must eventually be propagated to subsequent steps, which, in turn,
need to be implemented in a defensive way.


Second, if the construction failed, the object exists but it is inconsistent. In the worst case, if the user of the object fails to check for the successful construction, it will perform operations on an inconsistent object. But even in the good
case, where the user detects the incomplete construction and decides to immedi-
ately destruct the object, the destruction is error prone. The already performed
steps may have had side effects such as resource allocations. So it is important
to revert all the successful steps by invoking their respective destructors. How-
ever, when destructing the object, the destructors of the incomplete steps are also
called. Consequently, such destructors need to be implemented in a defensive
manner to accommodate this situation.
Third, objects cannot have references that depend on potentially failing construc-
tion steps. In contrast to a pointer that may be marked as uninitialized by being a
null pointer, a reference is, by definition, initialized once it exists. Consequently,
the result of such a step can never be passed as reference to subsequent steps.
Pointers must be used.
Fourth, the mere existence of incompletely constructed objects introduces many
variants of possible failures that need to be considered in the code. There may be
many different stages of incompleteness. Because of the third problem, every time
a construction step takes the result of a previous step as an argument, it explicitly
has to consider the error case. This, in turn, tremendously inflates the test space
of the code.
Furthermore, there needs to be a convention for how the completion of an object is indicated. All programmers have to learn and follow the convention.

2. The error condition triggers an exception. Thereby, the object construction im-
mediately stops at the erroneous step. Subsequent steps are not executed at all.
Furthermore, while unwinding the stack, the exception mechanism reverts all al-
ready completed steps by calling their respective destructors. Consequently, the
construction of an object can be considered as a transaction. If it succeeds, the
object is known to be completely constructed. If it fails, the object immediately
ceases to exist.

Thanks to the transactional semantics of the second variant, the state space for poten-
tial error conditions (and thereby the test space) remains small. Also, the second variant
facilitates the use of references as class members, which can be safely passed as argu-
ments to subsequent constructors. When receiving such a reference as argument (as
opposed to a pointer), no validity checks are needed. Consequently, by using excep-
tions, the robustness of object-oriented code (i. e., code that relies on C++ constructors)
can be greatly improved over code that avoids exceptions.
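The following sketch illustrates the transactional behavior; the types Socket and Session are hypothetical:

  struct Socket  { Socket(char const *host);  /* may throw */ };
  struct Session { Session(Socket &);         /* may throw */ };

  struct Connection
  {
      Socket  _socket;
      Session _session;  /* constructed only if _socket succeeded */

      Connection(char const *host) : _socket(host), _session(_socket) { }
  };

If the construction of _session throws, the destructor of the already constructed _socket runs automatically while the exception unwinds. No partially initialized Connection object ever becomes visible to the caller.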


7.2.2. Bare-metal C++ runtime

Acknowledging the rationale given in the previous section, there is still the problem
of the complexity added by the exception mechanism. For Genode, the complexity of
the trusted computing base is a fundamental metric. The C++ exception mechanism
with its dependency on a C library arguably adds significant complexity. The code
complexity of a C library exceeds the complexity of the fundamental components (such
as the kernel, core, and init) by an order of magnitude. Making the fundamental com-
ponents depend on such a C library would jeopardize one of Genode’s most valuable
assets, which is its low complexity.
To enable the use of C++ exceptions and runtime type information but avoid the
incorporation of an entire C library into the trusted computing base, Genode comes
with a customized C++ runtime that does not depend on a C library. The C++ runtime
libraries are provided by the tool chain, which interface with the symbols provided by
Genode’s C++ support code (repos/base/src/lib/cxx).
Unfortunately, the interface used by the C++ runtime does not reside in a specific namespace but is rather a subset of the POSIX API. When linking a real C library
to a Genode component, the symbols present in the C library would collide with the
symbols present in Genode’s C++ support code. For this reason, the C++ runtime
(of the compiler) and Genode’s C++ support code are wrapped in a single library (re-
pos/base/lib/mk/cxx.mk) in a way that all POSIX functions remain hidden. All the refer-
ences of the C++ runtime are resolved by the C++ support code, both wrapped in the
cxx library. To the outside, the cxx library solely exports the CXA ABI as required by
the compiler.


7.3. Interaction of core with the underlying kernel

Core is the root of the component tree. It is initialized and started directly by the under-
lying kernel and has two purposes. First, it makes the low-level physical resources of
the machine available to other components in the form of services. These resources are
physical memory, processing time, device resources, initial boot modules, and protec-
tion mechanisms (such as the MMU, IOMMU, and virtualization extensions). It thereby
hides the peculiarities of the used kernel behind an API that is uniform across all ker-
nels supported by Genode. Core’s second purpose is the creation of the init component
by using its own services and following the steps described in Section 3.5.
Even though core is executed in user mode, its role as the root of the component tree
makes it as critical as the kernel. It just happens to be executed in a different processor
mode. Whereas regular components solely interact with the kernel when performing
inter-component communication, core interplays with the kernel more intensely. The
following subsections go into detail about this interplay.
The description tries to be general across the various kernels supported by Genode.
Note, however, that a particular kernel may deviate from the general description.

7.3.1. System-image assembly

A Genode-based system consists of potentially many boot modules. But boot loaders -
in particular on ARM platforms - usually support the loading of a single system image
only. To unify the boot procedure across kernels and CPU architectures, on all kernels
except Linux, Genode merges boot modules together with the core component into a
single image.
The core component is actually built as a library. The library description file is specific
for each platform and located at lib/mk/spec/<pf>/core.mk where <pf> corresponds to
the hardware platform used. It includes the platform-agnostic lib/mk/core.inc file. The
library contains everything core needs (including the C++ runtime and the core code)
except the following symbols:

_boot_modules_headers_begin and _boot_modules_headers_end Between those symbols, core expects an array of boot-module header structures. A boot-module header contains the name, core-local address, and size of a boot module. This meta data is used by core’s initialization code in src/core/platform.cc to populate the ROM service with modules.
_boot_modules_binaries_begin and _boot_modules_binaries_end Between those
symbols, core expects the actual module data. This range is outside the core image
(beyond _prog_img_end). In contrast to the boot-module headers, the modules
reside in a separate section that remains unmapped within core’s virtual address
space. Only when access to a boot module is required by core (i. e., the ELF binary of init during the creation of the init component), core makes the module visible
within its virtual address space.
Making the boot modules invisible to core has two benefits. The integrity of the
boot modules does not depend on core. Even in the presence of a bug in core, the
boot modules cannot be accidentally overwritten. Second, no page-table entries
are needed to map the modules into the virtual address space of core. This is
particularly beneficial when using large boot modules such as a complete disk
image. If incorporated into the core image, page-table entries for the entire disk
image would need to be allocated at the initialization time of core.

These symbols are defined in an assembly file called boot_modules.s. When building
core stand-alone, the final linking stage combines the core library with the dummy
boot_modules.s file located at src/core/boot_modules.s. But when using the run tool (Sec-
tion 5.4.1) to integrate a bootable system image, the run tool dynamically generates a
version of boot_modules.s depending on the boot modules listed in the run script and
repeats the final linking stage of core by combining the core library with the generated
boot_modules.s file. The generated file is placed at <build-dir>/var/run/<scenario>/ and
incorporates the boot modules using the assembler’s .incbin directive. The result of
the final linking stage is an executable ELF binary that contains both core and the boot
modules.
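A boot-module header as described above roughly corresponds to the following structure (a sketch for illustration; the authoritative definition resides in core’s platform code):

  #include <base/stdint.h>

  struct Boot_module_header
  {
      Genode::addr_t name;  /* pointer to the null-terminated module name */
      Genode::addr_t base;  /* core-local address of the module data      */
      Genode::size_t size;  /* module size in bytes                       */
  };

Core’s initialization code walks the array delimited by _boot_modules_headers_begin and _boot_modules_headers_end and registers one ROM module per entry.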

7.3.2. Bootstrapping and allocator setup

At boot time, the kernel passes information about the physical resources and the initial
system state to core. Even though the mechanism and format of this information varies
from kernel to kernel, it generally covers the following aspects:

• A list of free physical memory ranges

• A list of the physical memory locations of the boot modules along with their re-
spective names

• The number of available CPUs

• All information needed to enable the initial thread to perform kernel operations

Core’s allocators Core’s kernel-specific platform initialization code (core/platform.cc) uses this information to initialize the allocators used for keeping track of physical resources. Those allocators are:

RAM allocator contains the ranges of the available physical memory

I/O memory allocator contains the physical address ranges of unused memory-mapped
I/O resources. In general, all ranges not initially present in the RAM allocator are
considered to be I/O memory.


I/O port allocator contains the I/O ports on x86-based platforms that are currently not
in use. This allocator is initialized with the entire I/O port range of 0 to 0xffff.

IRQ allocator contains the IRQs that are associated with IRQ sessions. This allocator
is initialized with the entirety of the available IRQ numbers.

Core-region allocator contains the virtual memory regions of core that are not in use.

The RAM allocator and core-region allocator are subsumed in the so-called core-
memory allocator. In addition to aggregating both allocators, the core-memory al-
locator allows for the allocation of core-local virtual-memory regions that can be used
for holding core-local objects. Each region allocated from the core-memory allocator
has to satisfy four conditions:

1. It must be backed by a physical memory range (as allocated from the RAM allo-
cator)

2. It must have assigned a core-local virtual memory range (as allocated from the
core-region allocator)

3. The physical-memory range must have the same size as the virtual-memory range

4. The virtual memory range must be mapped to the physical memory range using
the MMU

Internally, the core-memory allocator maintains a so-called mapped-memory allocator that contains ranges of ready-to-use core-local memory. If a new allocation exceeds the
available capacity, the core-memory allocator expands its capacity by allocating a new
physical memory region from the RAM allocator, allocating a new core-virtual memory
region from the core-region allocator, and installing a mapping from the virtual region
to the physical region.
All memory allocations mentioned above are performed at the granularity of physi-
cal pages, i. e., 4 KiB.
The core-memory allocator is expanded on demand but never shrunk. This makes it
unsuitable for allocating objects on behalf of core’s clients because allocations could not
be reverted when closing a session. It is solely used for dynamic memory allocations at
startup (e. g., the memory needed for keeping the information about the boot modules),
and for keeping meta data for the allocators themselves.
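The expansion step can be summarized by the following sketch. The helper functions are assumptions that stand in for core’s internal interfaces, not actual API:

  #include <base/stdint.h>

  /* hypothetical stand-ins for core's internal allocator interfaces */
  void *ram_alloc(Genode::size_t bytes);          /* RAM allocator         */
  void *core_region_alloc(Genode::size_t bytes);  /* core-region allocator */
  void  map_local(void *virt, void *phys, Genode::size_t bytes);

  void *expand_core_memory(Genode::size_t bytes)
  {
      void *phys = ram_alloc(bytes);         /* 1. physical backing store   */
      void *virt = core_region_alloc(bytes); /* 2. core-local virtual range */

      map_local(virt, phys, bytes);          /* 3. install the MMU mapping  */

      return virt; /* ready-to-use memory for the mapped-memory allocator */
  }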

7.3.3. Kernel-object creation

Kernel objects are objects maintained within the kernel and used by the kernel. The
exact notion of what a kernel object represents depends on the actual kernel as the
various kernels differ with respect to the abstractions they provide. Typical kernel ob-
jects are threads and protection domains. Some kernels have kernel objects for memory mappings while others provide page tables as kernel objects. Whereas some kernels
represent scheduling parameters as distinct kernel objects, others subsume scheduling
parameters to threads. What all kernel objects have in common, though, is that they
consume kernel memory. Most kernels of the L4 family preserve a fixed pool of mem-
ory for the allocation of kernel objects.
If an arbitrary component were able to perform a kernel operation that triggers the
creation of a kernel object, the memory consumption of the kernel would depend on the
good behavior of all components. A misbehaving component may exhaust the kernel
memory.
To counter this problem, on Genode, only core triggers the creation of kernel objects
and thereby guards the consumption of kernel memory. Note, however, that not all
kernels are able to prevent the creation of kernel objects outside of core.

7.3.4. Page-fault handling

Each time a thread within the Genode system triggers a page fault, the kernel reflects
the page fault along with the fault information as a message to the user-level page-fault
handler residing in core. The fault information comprises the identity and instruction
pointer of the faulted thread, the page-fault address, and the fault type (read, write, ex-
ecute). The page-fault handler represents each thread as a so-called pager object, which
encapsulates the subset of the thread’s interface that is needed to handle page faults.
For handling the page fault, the page-fault handler first looks up the pager object that
belongs to the faulting thread’s identity, analogously to how an RPC entrypoint looks
up the RPC object for an incoming RPC request. Given the pager object, the fault is han-
dled by calling the pager function with the fault information as argument. This func-
tion is implemented by the so-called Rm_client (repos/base/src/core/region_map_compo-
nent.cc), which represents the association of the pager object with its virtual address
space (region map). Given the context information about the region map of the thread’s
PD, the pager function looks up the region within the region map, on which the page
fault occurred. The lookup results in one of the following three cases:

Region is populated with a dataspace If a dataspace is attached at the fault address, the backing store of the dataspace is determined. Depending on the kernel, the
backing store may be a physical page, a core-local page, or another reference to a
physical memory page. The pager function then installs a memory mapping from
the virtual page where the fault occurred to the corresponding part of the backing
store.

Region is populated with a managed dataspace If the fault occurred within a re-
gion where a managed dataspace is attached, the fault handling is forwarded to
the region map that represents the managed dataspace.


Region is empty If no dataspace could be found at the fault address, the fault cannot be resolved. In this case, core submits a region-map-fault signal to the region
map where the fault occurred. This way, the region-map client has the chance to
detect and possibly respond to the fault. Once the signal handler receives a fault
signal, it is able to query the fault address from the region map. As a response
to the fault, the region-map client may attach a dataspace at this address. This
attach operation, in turn, will prompt core to wake up the thread (or multiple
threads) that faulted within the attached region. Unless a dataspace is attached at
the page-fault address, the faulting thread remains blocked. If no signal handler
for region-map faults is registered for the region map, core prints a diagnostic
message and blocks the faulting thread forever.

To optimize the TLB footprint and the use of kernel memory, region maps do not merely
operate at the granularity of memory pages but on address ranges whose size and align-
ment are arbitrary power-of-two values (at least as large as the size of the smallest phys-
ical page). The source and destinations of memory mappings may span many pages.
This way, depending on the kernel and the architecture, multiple pages may be mapped
at once, or large page-table mappings can be used.
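In schematic form, the fault-handling path described above looks as follows. All type and function names are simplified assumptions, not core’s actual interfaces:

  #include <base/stdint.h>

  struct Fault        { unsigned long thread_id; Genode::addr_t addr; };
  struct Region       { bool managed() const; };
  struct Region_map   { Region *lookup(Genode::addr_t);
                        void submit_fault_signal(Fault const &); };
  struct Pager_object { Region_map &region_map(); };

  Pager_object &lookup_pager(unsigned long thread_id);
  void install_mapping(Genode::addr_t, Region const &);
  void handle_in_nested_region_map(Region const &, Fault const &);

  void handle_page_fault(Fault const &fault)
  {
      /* find the pager object by the faulting thread's identity */
      Pager_object &pager = lookup_pager(fault.thread_id);
      Region_map   &rm    = pager.region_map();

      if (Region *region = rm.lookup(fault.addr)) {
          if (region->managed())
              handle_in_nested_region_map(*region, fault); /* recurse */
          else
              install_mapping(fault.addr, *region); /* map the backing store */
      } else {
          rm.submit_fault_signal(fault); /* faulting thread remains blocked */
      }
  }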


7.4. Asynchronous notification mechanism

Section 3.6.2 introduces asynchronous notifications (signals) as one of the fundamental inter-component communication mechanisms. The description covers the semantics of
the mechanism but the question of how the mechanism relates to core and the under-
lying kernel remains unanswered. This section complements Section 3.6.2 with those
implementation details.
Most kernels do not directly support the semantics of asynchronous notifications as
presented in Section 3.6.2. As a reminder, the mechanism has the following features:

• The authority for triggering a signal is represented by a signal-context capability, which can be delegated via the common capability-delegation mechanism described in Section 3.1.4.

• The submission of a signal is a fire-and-forget operation. The signal producer is never blocked.

• On the reception of a signal, the signal handler can obtain the context to which
the signal refers. This way, it is able to distinguish different sources of events.

• A signal receiver can wait or poll for potentially many signal contexts. The num-
ber of signal contexts associated with a single signal receiver is not limited.

The gap between this feature set and the mechanisms provided by the underlying ker-
nel is bridged by core as part of the PD service. This service plays the role of a proxy
between the producers and receivers of signals. Each component that interacts with
signals has a session to this service.
Within core, a signal context is represented as an RPC object. The RPC object main-
tains a counter of signals pending for this context. Signal contexts can be created and
destroyed by the clients of the PD service using the alloc_context and free_context RPC
functions. Upon the creation of a signal context, the PD client can specify an integer
value called imprint with a client-local meaning. Later, on the reception of signals, the
imprint value is delivered along with the signal to enable the client to tell the contexts
of the incoming signals apart. As a result of the allocation of a new signal context,
the client obtains a signal-context capability. This capability can be delegated to other
components using the regular capability-delegation mechanism.
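Expressed in code, the allocation step looks roughly like this (simplified; pd and signal_source_cap are assumed to be in scope, and the actual Pd_session interface should be consulted for the authoritative signatures):

  /* receiver side: allocate a context tagged with a client-local imprint */
  unsigned long const imprint = 42; /* any value meaningful to the client */

  Genode::Signal_context_capability ctx_cap =
      pd.alloc_context(signal_source_cap, imprint);

  /* ctx_cap may now be handed to a signal producer via capability delegation */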

Signal submission A component in possession of a signal-context capability is able to trigger signals using the submit function of its PD session. The submit function takes
the signal context capability of the targeted context and a counter value as arguments.
The capability as supplied to the submit function does not need to originate from the
called session. It may have been created and delegated by another component. Note
that even though a signal context is an RPC object, the submission of a signal is not
realized as an invocation of this object. The signal-context capability is merely used as an RPC function argument. This design accounts for the fact that signal-context
capabilities may originate from untrusted peers as is the case for servers that deliver
asynchronous notifications to their clients. A client of such a server supplies a signal-
context capability as argument to one of the server’s RPC functions. An example is
the input stream of the GUI session (Section 4.5.6) that allows the client to get notified
when new user input becomes available. A malicious client may specify a capability
that was not created via core’s PD service but that instead refers to an RPC object local
to the client. If the submit function was an RPC function of the signal context, the
server’s call of the submit RPC function would eventually invoke the RPC object of the
client. This would put the client in a position where it may block the server indefinitely
and thereby make the server unavailable to all clients. In contrast to the untrusted
signal-context capability, the PD session of a signal producer is by definition trusted.
So it is safe to invoke the submit RPC function with the signal-context capability as
argument. In the case where an invalid signal-context capability is delegated to the
signal producer, core will fail to look up a signal context for the given capability and
discard the signal.

Signal reception For receiving signals, a component needs a way to obtain informa-
tion about pending signals from core. This involves two steps: First, the component
needs a way to block until signals are available. Second, if a signal is pending, the com-
ponent needs a way to determine the signal context and the signal receiver associated
with the signal and wake up the thread that blocks in the Signal_receiver::block_for_signal
API function.
Both problems are solved by a dedicated thread that is spawned during component
startup. This signal thread blocks at core’s PD service for incoming signals. The block-
ing operation is not directly performed on the PD session but on a decoupled RPC
object called signal source. In contrast to the PD session interface that is kernel agnos-
tic, the underlying kernel mechanism used for blocking the signal thread at the signal
source depends on the used base platform.
The signal-source RPC object implements an RPC interface, on which the PD client
issues a blocking wait_for_signal RPC function. This function blocks as long as no sig-
nal that refers to the session’s signal contexts is pending. If the function returns, the
return value contains the imprint that was assigned to the signal context at its creation
and the number of signals pending for this context. On most base platforms, the im-
plementation of the blocking RPC interface is realized by processing RPC requests and
responses out of order to enable one entrypoint in core to serve all signal sources. Core
uses a dedicated entrypoint for the signal-source handling to decouple the delivery of
signals from potentially long-taking operations of the other core services.
Given the imprint value returned by the signal source, the signal thread determines
the signal context and signal receiver that belongs to the pending signal (using a
data structure called Signal_context_registry) and locally submits the signal to the signal-receiver object. This, in turn, unblocks the Signal_receiver::block_for_signal function at the API level.
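Taken together, the signal thread conceptually executes the following loop (a simplified sketch with assumed helper names and in-scope objects, not the actual implementation):

  for (;;) {

      /* block at core's signal source until a signal arrives */
      Signal const signal = signal_source.wait_for_signal();

      /* map the imprint back to the component-local signal context */
      Signal_context &context = context_registry.lookup(signal.imprint());

      /* local submission unblocks Signal_receiver::block_for_signal() */
      context.receiver().local_submit(signal.num());
  }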

[Figure 47 (sequence diagram): the client requests a session from its parent via RPC (passing an id, the service name, and the session arguments); the parent updates the “session_requests” ROM with a corresponding <create> entry and notifies the server; the server creates the session and delivers the session capability to the parent; the parent notifies the client, which finally fetches the capability by calling session_cap(id) at the parent.]

Figure 47: Parent-child interplay during the creation of a new session. The dotted lines are asynchronous notifications, which have fire-and-forget semantics. A component that triggers a signal does not block.

7.5. Parent-child interaction in detail

On a conceptual level, the session-creation procedure as described in Section 3.2.3 appears as a synchronous interaction between the parent and its child components. The
interaction serves three purposes. First, it is used to communicate information between
different protection domains, in this case the parent, the client, and the server. Second,
it implicitly dictates the flow of control between the involved parties because the caller
blocks until the callee replies. Third, the interplay delegates authority (in particular
authority to access the server’s session object) between protection domains. The latter
is realized with the kernel’s ability to carry capabilities as IPC message payload.
On the surface, the interaction looks like a sequence of synchronous RPC calls. How-
ever, under the hood, the interplay between the parent and its children is based on
a combination of asynchronous notifications from the parent to the children and syn-
chronous RPC from the children to the parent. The protocol is designed such that the
parent’s liveness remains independent of the behavior of its children, which must generally be regarded as untrusted from the parent’s perspective. The sequence of creating a session is depicted in Figure 47. The following points are worth noting:

• Sessions are identified via IDs, which are plain numbers as opposed to capabili-
ties. The IDs as seen by the client and server belong to different ID name spaces.
IDs of sessions requested by the client are allocated by the client. IDs of sessions
requested at the server are allocated by the parent.

• The parent does not issue RPC calls to any of its children.

• Each activation of the parent merely applies a state change of the session’s meta
data structures maintained at the parent, which capture the entire state of session
requests.

• The information about pending session requests is communicated from the parent
to the server via a ROM session. At startup, the server requests a ROM session for
the ROM module “session_requests” from its parent. The parent implements this
ROM session locally. Since ROM sessions support versions, the parent can post
version updates of the “session_requests” ROM with the regular mechanisms al-
ready present in Genode.

• The parties involved can potentially run in parallel.


7.6. Dynamic linker

The dynamic linker is a mechanism for loading ELF binaries that are dynamically linked against shared libraries.

7.6.1. Building dynamically-linked programs

The build system automatically decides whether a program is linked statically or dy-
namically depending on the use of shared libraries. If the target is linked against at
least one shared library, the resulting ELF image is a dynamically-linked program. Al-
most all Genode components are linked against the Genode application binary inter-
face (ABI), which is a shared library. Therefore, components are dynamically-linked
programs unless a kernel-specific base library is explicitly used.
The entrypoint of a dynamically-linked program is the Component::construct func-
tion.

7.6.2. Startup of dynamically-linked programs

When creating a new component, the parent first detects whether the to-be-loaded ELF
binary represents a statically-linked program or a dynamically-linked program by in-
specting the ELF binary’s program-header information (see repos/base/src/lib/base/elf_bi-
nary.cc). If the program is statically linked, the parent follows the procedure as de-
scribed in Section 3.5. If the program is dynamically linked, the parent remembers the
dataspace of the program’s ELF image but starts the ELF image of the dynamic linker
instead.
The dynamic linker is a regular Genode component that follows the startup proce-
dure described in Section 7.1.2. However, because of its hybrid nature, it needs to take
special precautions before using any data that contains relocations. Because the dy-
namic linker is a shared library, it contains data relocations. Even though the linker’s
code is position independent and can principally be loaded to an arbitrary address,
global data objects may contain pointers to other global data objects or code. For exam-
ple, vtable entries contain pointers to code. Those pointers must be relocated depend-
ing on the load address of the binary. This step is performed by the init_rtld hook
function, which was already mentioned in Section 7.1.2. Global data objects must not
be used before calling this function. For this reason, init_rtld is called at the earliest
possible time directly from the assembly startup code. Apart from the call of this hook
function, the startup of the dynamic linker is the same as for statically-linked programs.
The main function of the dynamic linker obtains the binary of the actual dynamically-
linked program by requesting a ROM session for the module “binary”. The parent re-
sponds to this request by handing out a locally-provided ROM session that contains the
dataspace of the actual program. Once the linker has obtained the dataspace containing
the dynamically-linked program, it loads the program and all required shared libraries.
The dynamic linker requests each shared library as a ROM session from its parent.


After completing the loading of all ELF objects, the dynamic linker determines the
entry point of the loaded binary by looking up the Component::construct symbol
and calls it as a function. Note that this particular symbol is ambiguous as both the
dynamic linker and the loaded program have such a function. Hence, the lookup is
performed explicitly on the loaded program.
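The hand-off can be pictured as follows; lookup_symbol, loaded_program, and env are hypothetical stand-ins for the linker’s internals, and the string is the C++-mangled name of Component::construct(Genode::Env &):

  /* resolve Component::construct explicitly within the loaded program */
  typedef void (*Construct_fn)(Genode::Env &);

  Construct_fn const construct = (Construct_fn)
      loaded_program.lookup_symbol("_ZN9Component9constructERN6Genode3EnvE");

  construct(env); /* enter the application code */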

7.6.3. Address-space management

To load the binary and the associated shared libraries, the linker does not directly at-
tach dataspaces to its address space. Instead, it manages a dedicated part of the com-
ponent’s virtual address space called linker area manually. The linker area is a region
map that is created as part of a PD session. The dynamic linker attaches the linker
area as a managed dataspace to its address space. This way, the linker can precisely
control the layout within the virtual-address range covered by the managed dataspace.
This control is needed because the loading of an ELF object does not correspond to an
atomic attachment of a single dataspace but it involves consecutive attach operations
for multiple dataspaces, one for each ELF segment. When attaching one segment, the
linker must make sure that there is enough space beyond the segment to host the next
segment. The use of a managed dataspace allows the linker to manually allocate large-
enough portions of virtual memory and populate them in multiple steps.


7.7. Execution on bare hardware (base-hw)

The code specific to the base-hw platform is located within the repos/base-hw/ directory.
In the following description, unless explicitly stated otherwise, all paths are relative to
this directory.
In contrast to classical L4 microkernels where Genode’s core process runs as user-
level roottask on top of the kernel, base-hw executes Genode’s core directly on the
hardware with no distinct kernel underneath. Core and the kernel are melted into one
hybrid component. Although all threads of core are running in privileged processor
mode, they call a kernel library to synchronize hardware interaction. However, most
work is done outside of that library. This design has several benefits. First, the kernel
part becomes much simpler. For example, there are no allocators needed within the
kernel. Second, base-hw side-steps long-standing difficult kernel-level problems, in
particular the management of kernel resources. For the allocation of kernel objects,
the hybrid core/kernel can employ Genode’s user-level resource trading concepts as
described in Section 3.3. Finally and most importantly, merging the kernel with roottask
removes a lot of redundancies between both programs. Traditionally, both kernel and
roottask perform the bookkeeping of physical-resource allocations and the existence
of kernel objects such as address spaces and threads. In base-hw, those data structures
exist only once. The complexity of the combined kernel/core is significantly lower than
the sum of the complexities of a traditional self-sufficient kernel and a distinct roottask
on top. This way, base-hw helps to make Genode’s TCB less complex.
The following subsections detail the problems that base-hw had to address to become
a self-sufficient base platform for Genode.

7.7.1. Bootstrapping of base-hw

Early bootstrap After the boot loader has loaded the kernel image into memory, it
calls the kernel’s entry point. At this stage, the MMU is still switched off and no CPU
other than the primary boot CPU is initialized. The first job of the loaded kernel is
the initialization of all CPUs and their transition from the use of physical memory to
virtual memory. This one-time code path is called bootstrap. The corresponding code is
located at src/bootstrap/. Besides enabling the MMU, this code performs system-global
static hardware configurations such as setting up an ARM TrustZone policy. Once com-
pleted, bootstrap ELF-loads the actual core/kernel executable, which is designated to
run entirely in virtual memory. After this stage is complete, bootstrap is no longer part
of the picture.

Startup of the base-hw kernel Core on base-hw uses Genode’s regular linker script.
Like any regular Genode component, its execution starts at the _start symbol. But
unlike a regular component, core is started by the bootstrap component as a kernel run-
ning in privileged mode. Instead of directly following the startup procedure described in Section 7.1.2, base-hw uses custom startup code that initializes the kernel part of core first. For example, the startup code for the ARM architecture is located at src/core/spec/arm/crt0.s. It calls the kernel initialization code in src/core/kernel/main.cc. Core’s
regular C++ startup code (the _main function) is executed by the first thread created
by the kernel (see the thread setup in the Core_main_thread::Core_main_thread()
constructor).

7.7.2. Kernel entry and exit

The execution model of the kernel can be roughly characterized as a single-stack ker-
nel. In contrast to traditional L4 kernels that maintain one kernel thread per user thread,
the base-hw kernel is a mere state machine that never blocks in the kernel. State tran-
sitions are triggered by core or user-level threads that enter the kernel via a system
call, by device interrupts, or by a CPU exception. Once entered, the kernel applies the
state change depending on the event that caused the kernel entry, and leaves the ker-
nel again. The transition between normal threads and kernel execution depends on the
concrete architecture. For ARM, the corresponding code is located at src/core/spec/arm/exception_vector.s.

7.7.3. Interrupt handling and preemptive multi-threading

In order to respond to interrupts, base-hw has to contain a driver for the interrupt
controller. The interrupt-controller driver is named Board::Pic. The implementation
depends on the used board. For each board supported by the base-hw kernel, there
exists a board.h file that ties together all board-specific definitions. For example, the file
core/board/pbxa9/board.h defines the properties of the PBX-A9 platform. In this case, the
header maps the Board::Pic type to Hw::Gicv2 as defined in hw/spec/arm/gicv2.h. Each of the Board::Pic drivers implements the same interface.
To support preemptive multi-threading, base-hw requires a hardware timer. The
timer is programmed with the time slice length of the currently executed thread. Once
the programmed timeout elapses, the timer device generates an interrupt that is han-
dled by the kernel. Similarly to interrupt controllers, there exists a variety of different
timer devices for different hardware platforms. Therefore, base-hw features a variety
of timer drivers. The selection of the timer driver for a given board follows the same
pattern as the definition of the Board::Pic type. The board-specific board.h file defines
a Board::Timer type that is mapped to one of the available drivers. For example, the
pbxa9/board.h file includes spec/arm/cortex_a9_global_timer.h, which contains the defini-
tion of Board::Timer.
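The pattern of tying the board-specific types together can be sketched like this (illustrative, not the verbatim board.h content):

  /* board.h (sketch): select concrete drivers for this board */
  #include <hw/spec/arm/gicv2.h>

  namespace Board {

      using Pic = Hw::Gicv2;  /* interrupt controller of the PBX-A9 board */

      /* Board::Timer is defined analogously by the included timer header */
  }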
The in-kernel handler of the timer interrupt invokes the thread scheduler (src/core/kernel/cpu_scheduler.h). The scheduler maintains a list of so-called scheduling contexts
where each context refers to a thread. Each time the kernel is entered, the scheduler is
updated with the passed duration. When updated, it takes a scheduling decision by making the next to-be-executed thread the head of the list. At kernel exit, the control is
passed to the user-level thread that corresponds to the head of the scheduler list.

7.7.4. Split kernel interface

The system-call interface of the base-hw kernel is split into two parts. One part is
usable by all components and solely contains system calls for inter-component com-
munication and thread synchronization. The definition of this interface is located at
include/kernel/interface.h. The second part is exposed only to core. It supplements the
public interface with operations for the creation, the management, and the destruction
of kernel objects. The definition of the core-private interface is located at src/core/kernel/core_interface.h.
The distinction between both parts of the kernel interface is enforced by the function
Thread::_call in src/core/kernel/thread.cc.

7.7.5. Public part of the kernel interface

Threads do not run independently but interact with each other via synchronous inter-
component communication as detailed in Section 3.6. Within base-hw, this mechanism
is referred to as IPC (for inter-process communication). To allow threads to perform
calls to other threads or to receive RPC requests, the kernel interface is equipped with
system calls for performing IPC (send_request_msg, await_request_msg, send_reply_msg).
To keep the kernel as simple as possible, IPC is performed using so-called user-level
thread-control blocks (UTCB). Each thread has a corresponding memory page that is
always mapped in the kernel. This UTCB page is used to carry IPC payload. The
largely simplified procedure of transferring a message is as follows. (In reality, the state space is more complex because the receiver may not be in a blocking state when the sender issues the message.)

1. The sender marshals its payload into its UTCB and invokes the kernel.

2. The kernel transfers the payload from the sender’s UTCB to the receiver’s UTCB and schedules the receiver.

3. The receiver retrieves the incoming message from its UTCB.

Because all UTCBs are always mapped in the kernel, no page faults can occur during
the second step. This way, the flow of execution within the kernel becomes predictable
and no kernel exception handling code is needed.
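Because both UTCBs are guaranteed to be mapped, the kernel-side part of step 2 boils down to a plain memory copy followed by a scheduling decision. A simplified sketch with assumed names:

  #include <util/string.h> /* Genode::memcpy */

  /* hypothetical kernel-internal thread representation */
  struct Thread
  {
      void           *utcb_msg();
      Genode::size_t  utcb_msg_size();
  };

  void make_ready(Thread &); /* scheduler hook, assumed */

  void transfer_ipc_message(Thread &sender, Thread &receiver)
  {
      /* copy between always-mapped UTCB pages - cannot page-fault */
      Genode::memcpy(receiver.utcb_msg(), sender.utcb_msg(),
                     sender.utcb_msg_size());

      make_ready(receiver); /* let the receiver pick up the message */
  }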
In addition to IPC, threads interact via the synchronization primitives provided by
the Genode API. To implement these portions of the API, the kernel provides sys-
tem calls for managing the execution control of threads (stop_thread, restart_thread,
yield_thread).


To support asynchronous notifications as described in Section 3.6.2, the kernel provides system calls for the submission and reception of signals (await_signal, cancel_next_await_signal, submit_signal, pending_signal, and ack_signal) as well as the life-
time management of signal contexts (kill_signal_context). In contrast to other base
platforms, Genode’s signal API is directly supported by the kernel so that the propa-
gation of signals does not require any interaction with core’s PD service. However, the
creation of signal contexts is arbitrated by the PD service. This way, the kernel objects
needed for the signalling mechanism are accounted to the corresponding clients of the
PD service.
The kernel provides an interface to make the kernel’s scheduling timer available as
time source to the user land. Using this interface, components can bind signal contexts
to timeouts (timeout) and follow the progress of time (time and timeout_max_us).

7.7.6. Core-private part of the kernel interface

The core-private part of the kernel interface allows core to perform privileged opera-
tions. Note that even though the kernel and core provide different interfaces, both are
executed in privileged CPU mode, share the same address space and ultimately trust
each other. The kernel is regarded a mere support library of core that executes those
functions that shall be synchronized between different CPU cores and core’s threads.
In particular, the kernel does not perform any allocation. Instead, the allocation of ker-
nel objects is performed as an interplay of core and the kernel.

1. Core allocates physical memory from its physical-memory allocator. Most kernel-
object allocations are performed in the context of one of core’s services. Hence,
those allocations can be properly accounted to a session quota (Section 3.3). This
way, kernel objects allocated on behalf of core’s clients are “paid for” by those
clients.

2. Core allocates virtual memory to make the allocated physical memory visible
within core and the kernel.

3. Core invokes the kernel to construct the kernel object at the location specified by
core. This kernel invocation is actually a system call that enters the kernel via the
kernel-entry path.

4. The kernel initializes the kernel object at the virtual address specified by core and
returns to core via the kernel-exit path.
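In code, the four steps map onto a pattern like the following. The names session_alloc, map_into_core, and syscall_create_kobj are assumptions used for illustration, not core’s actual interfaces:

  #include <base/stdint.h>

  /* hypothetical helpers standing in for core/kernel internals */
  void *session_alloc(Genode::size_t bytes);  /* quota-accounted allocation */
  void *map_into_core(void *phys, Genode::size_t bytes);
  void  syscall_create_kobj(void *virt);      /* core-private system call   */

  template <typename KOBJ>
  KOBJ *create_kernel_object()
  {
      void *phys = session_alloc(sizeof(KOBJ));       /* step 1 */
      void *virt = map_into_core(phys, sizeof(KOBJ)); /* step 2 */

      syscall_create_kobj(virt);                      /* steps 3 and 4 */

      return static_cast<KOBJ *>(virt);
  }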

The core-private kernel interface consists of the following operations:

• The creation and destruction of protection domains (new_pd, delete_pd), invoked by the PD service


• The creation, manipulation, and destruction of threads (new_thread, start_thread, resume_thread, thread_quota, pause_thread, delete_thread, thread_pager, and _cancel_thread_blocking), used by the CPU service and the core-specific back end of the Genode::Thread API

• The creation and destruction of signal receivers and signal contexts (new_signal_receiver, delete_signal_receiver, new_signal_context, and delete_signal_context), invoked by the PD service

• The creation and destruction of kernel-protected object identities (new_obj, delete_obj)

• The creation, manipulation, and destruction of interrupt kernel objects (new_irq, ack_irq, and delete_irq)

• The mechanisms needed to transfer the flow of control between virtual machines
and virtual-machine monitors (new_vm, delete_vm, run_vm, pause_vm)

7.7.7. Scheduler of the base-hw kernel

CPU scheduling in traditional L4 microkernels is based on static priorities. The scheduler always picks the runnable thread with the highest priority for execution. If multiple
threads share one priority, the kernel schedules those threads in a round-robin fashion.
While fast and easy to implement, this scheme has disadvantages: First,
there is no way to prevent high-prioritized threads from starving lower-prioritized
ones. Second, CPU time cannot be granted to threads and passed between them by
the means of quota. To cope with these problems without much loss of performance,
base-hw employs a custom scheduler that deviates from the traditional approach.
The base-hw scheduler introduces the distinction between high-throughput-oriented
scheduling contexts - called fills - and low-latency-oriented scheduling contexts - called
claims. Examples for typical fills would be the processing of a compiler job or the ren-
dering computations of a sophisticated graphics program. They shall obtain as much
CPU time as the system can spare but there is no demand for a high responsiveness.
In contrast, an example for the claim category would be a typical GUI-software stack
covering the control flow from user-input drivers through a chain of GUI components
to the drivers of the graphical output. Another example is a user-level device driver
that must quickly respond to sporadic interrupts but is otherwise untrusted. The low
latency of such components is a key factor for usability and quality of service. Besides
introducing the distinction between claim and fill scheduling contexts, base-hw intro-
duces the notion of a so-called super period, which is a multiple of typical scheduling
time slices, e. g., one second. The entire super period corresponds to 100% of the CPU
time of one CPU. Portions of it can be assigned to scheduling contexts. A CPU quota
thereby corresponds to a percentage of the super period.
At the beginning of a super period, each claim has its full amount of assigned CPU
quota. The priority defines the absolute scheduling order within the super period among those claims that are active and have quota left. As long as there exist such
claims, the scheduler stays in the claim mode and the quota of the scheduled claims
decreases. At the end of a super period, the quota of all claims is replenished to the
initial value. Every time the scheduler cannot find an active claim with CPU quota left, it switches to the fill mode. Fills are scheduled in a simple round-robin fashion with identical time slices. The progression of the super period does not affect the scheduling order and time slices of this mode. The concept of quota and priority that is implemented
through the claim mode aligns nicely with Genode’s way of hierarchical resource man-
agement: Through CPU sessions, each process becomes able to assign portions of its
CPU time and subranges of its priority band to its children without knowing the global
meaning of CPU time or priority.
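A conceptual picture of the per-update scheduling decision (not the actual implementation in cpu_scheduler.h):

  struct Context; /* a scheduling context (claim or fill) */

  Context *highest_prio_claim_with_quota(); /* nullptr if none left */
  Context &next_fill_round_robin();

  Context &next_context()
  {
      /* claim mode: highest-priority active claim that still has quota */
      if (Context *claim = highest_prio_claim_with_quota())
          return *claim;

      /* fill mode: plain round-robin with identical time slices */
      return next_fill_round_robin();
  }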

7.7.8. Sparsely populated core address space

Even though core has the authority over all physical memory, it has no immediate ac-
cess to the physical pages. Whenever core requires access to a physical memory page,
it first has to explicitly map the physical page into its own virtual memory space. This
way, the virtual address space of core stays clean from any data of other components.
Even in the presence of a bug in core (e. g., a dangling pointer), information cannot
accidentally leak between different protection domains because the virtual memory of
the other components is not necessarily visible to core.

7.7.9. Multi-processor support of base-hw

On uniprocessor systems, the base-hw kernel is single-threaded. Its execution model corresponds to a mere state machine. On SMP systems, it maintains one kernel thread
and one scheduler per CPU core. Access to kernel objects gets fully serialized by one
global spin lock that is acquired when entering the kernel and released when leaving
the kernel. This keeps the use of multiple cores transparent to the kernel model, which
greatly simplifies the code compared to traditional L4 microkernels. Given that the
kernel is a simple state machine providing lightweight non-blocking operations, there
is little contention for the global kernel lock. Even though this claim may not hold up
when scaling to a large number of cores, current platforms can be accommodated well.

Cross-CPU inter-component communication Regarding synchronous and asynchronous inter-processor communication - thanks to the global kernel lock - there
is no semantic difference to the uniprocessor case. The only difference is that on a
multiprocessor system, one processor may change the schedule of another proces-
sor by unblocking one of its threads (e. g., when an RPC call is received by a server
that resides on a different CPU than the client). This condition may rescind the current
scheduling choice of the other processor. To avoid lags in this case, the kernel lets
the unaware target processor trap into an inter-processor interrupt (IPI). The targeted processor can react to the IPI by taking the decision to schedule the receiving thread.
As the IPI sender does not have to wait for an answer, the sending and receiving CPUs
remain largely decoupled. There is no need for a complex IPI protocol between sender
and receiver.

TLB shootdown With respect to the synchronization of core-local hardware, there are two different situations to deal with. Some hardware components like most ARM
caches and branch predictors implement their own coherence protocol and thus need
adaption in terms of configuration only. Others, like the TLBs lack this feature. When
for instance a page table entry gets invalid, the TLB invalidation of the affected entries
must be performed locally by each core. To signal the necessity of TLB maintenance
work, an IPI is sent to all other cores. Once all cores have completed the cleaning, the
thread that invoked the TLB invalidation resumes its execution.

7.7.10. Asynchronous notifications on base-hw

The base-hw platform improves the mechanism described in Section 7.4 by introduc-
ing signal receivers and signal contexts as first-class kernel objects. Core’s PD service is
merely used to arbitrate the creation and destruction of those kernel objects but it does
not play the role of a signal-delivery proxy. Instead, signals are communicated directly
by using the public kernel operations await_signal, cancel_next_await_signal, submit_sig-
nal, and ack_signal.


7.8. Execution on the NOVA microhypervisor (base-nova)

NOVA is a so-called microhypervisor, denoting the combination of microkernel and
a virtualization platform (hypervisor). It is a high-performance microkernel for the
x86 architecture. In contrast to other microkernels, it has been designed for hardware-
based virtualization via user-level virtual-machine monitors. In line with Genode’s
architecture, NOVA’s kernel interface is based on capability-based security. Hence, the
kernel fully supports the model of a Genode kernel as described in Section 3.1.

NOVA website
https://hypervisor.org

NOVA kernel-interface specification
https://github.com/udosteinberg/NOVA/raw/master/doc/specification.pdf

7.8.1. Integration of NOVA with Genode

The NOVA kernel is available via Genode’s ports mechanism detailed in Section 5.2.
The port description is located at repos/base-nova/ports/nova.port.

Building the NOVA kernel Even though NOVA is a third-party kernel with a custom
build system, the kernel is built directly by the Genode build system. NOVA’s build
system remains unused.
From within a Genode build directory configured for one of the nova_x86_32 or
nova_x86_64 platforms, the kernel can be built via

make kernel

The build description for the kernel is located at repos/base-nova/src/kernel/target.mk.

System-call bindings NOVA is not accompanied by bindings to its kernel interface.
There is only a description of the kernel interface in the form of the kernel
specification. For this reason, Genode maintains the kernel bindings for NOVA
within the Genode source tree. The bindings are located at repos/base-nova/include/ in
the subdirectories nova/, spec/32bit/nova/, and spec/64bit/nova/.

7.8.2. Bootstrapping of a NOVA-based system

After finishing its initialization, the kernel starts the second boot module, the first being
the kernel itself, as root task. The root task is Genode’s core. The virtual address space
of core contains the text and data segments of core, the UTCB of the initial execution
context (EC), and the hypervisor info page (HIP). Details about the HIP are provided in
Section 6 of the NOVA specification.


BSS section of core The kernel’s ELF loader does not support the concept of a
BSS segment. It simply maps the physical pages of core’s text and data segments
into the virtual memory of core but does not allocate any additional physical pages
for backing the BSS. For this reason, the NOVA version of core does not use the
genode.ld linker script as described in Section 7.1.1 but the linker script located at
repos/base-nova/src/core/core.ld. This version hosts the BSS section within the data seg-
ment. Thereby, the BSS is physically present in the core binary in the form of zero-
initialized data.

Initial information provided by NOVA to core The kernel passes a pointer to the HIP
to core as the initial value of the ESP register. Genode’s startup code saves this value in
the global variable _initial_sp (Section 7.1.2).

7.8.3. Log output on modern PC hardware

Because transmitting information over legacy comports does not require complex de-
vice drivers, serial output over comports is still the predominant way to output low-
level system logs like kernel messages or the output of core’s LOG service.
Unfortunately, most modern PCs lack dedicated comports. This leaves two options
to obtain low-level system logs.

1. The use of vendor-specific platform-management features such as Intel vPro /
Intel Active Management Technology (AMT) or the Intelligent Platform Management
Interface (IPMI). These platform features are able to emulate a legacy comport and
provide the serial output over the network. Unfortunately, those solutions are not
uniform across different vendors, difficult to use, and tend to be unreliable.

2. The use of a PCI card or an Express Card that provides a physical comport. When
using such a device, the added comport appears as PCI I/O resource. Because
the device interface is compatible with the legacy comports, no special drivers are
needed.

The latter option allows the retrieval of low-level system logs on hardware that lacks
special management features. In contrast to the legacy comports, however, it has the
minor disadvantage that the location of the device’s I/O resources is not known be-
forehand. The I/O port range of the comport depends on the device-enumeration pro-
cedure of the BIOS. To enable the kernel to output information over this comport, the
kernel must be configured with the I/O port range as assigned by the BIOS on the
specific machine. One kernel binary cannot simply be used across different machines.

The Bender chain boot loader To alleviate the need to adapt the kernel configura-
tion to the used comport hardware, the bender chain boot loader can be used.


Bender is part of the MORBO tools
https://github.com/TUD-OS/morbo

Instead of starting the NOVA hypervisor directly, the multi-boot-compliant boot loader
(such as GRUB) starts bender as the kernel. All remaining boot modules including the
real kernel have already been loaded into memory by the original boot loader. Bender
scans the PCI bus for a comport device. If such a device is found (e. g., an Express Card),
it writes the information about the device’s I/O port range to a known offset within the
BIOS data area (BDA).
After the comport-device probing is finished, bender passes control to the next boot
module, which is the real kernel. The comport device driver of the kernel does not use
a hard-coded I/O port range for the comport but looks up the comport location in the
BDA. The use of bender is optional. When not used, the BDA always contains the I/O
port range of the legacy comport 1.
The Genode source tree contains a pre-compiled binary of bender at tool/boot/bender.
This binary is automatically incorporated into boot images for the NOVA base platform
when the run tool (Section 5.4.1) is used.

7.8.4. Relation of NOVA’s kernel objects to Genode’s core services

For the terminology of NOVA’s kernel objects, refer to the NOVA specification men-
tioned in the introduction of Section 7.8. A brief glossary for the terminology used in
the remainder of this section is given in table 2.

NOVA term  Meaning
PD         Protection domain
EC         Execution context (thread)
SC         Scheduling context
HIP        Hypervisor information page
IDC        Inter-domain call (RPC call)
portal     Communication endpoint

Table 2: Glossary of NOVA’s terminology

NOVA capabilities are not Genode capabilities Both NOVA and Genode use the
term “capability”. However, the term does not have the same meaning in both contexts.
A Genode capability refers to an RPC object or a signal context. In the context of NOVA,
a capability refers to a NOVA kernel object. To avoid confusing both meanings of the
term, Genode refers to NOVA’s term as “capability selector”, or simply “selector”. A
Genode signal context capability corresponds to a NOVA semaphore; all other Genode
capabilities correspond to NOVA portals.


PD service A PD session corresponds to a NOVA PD.
A Genode capability being a NOVA portal has a defined IP and an associated local
EC (the Genode entrypoint). The invocation of such a Genode capability is an IDC
call to a portal. A Genode capability is delegated by passing its corresponding portal
or semaphore selector as IDC argument.
Page faults are handled as explained in Section 7.8.5. Each memory mapping in-
stalled in a component implicitly triggers the allocation of a node in the kernel’s map-
ping database.

CPU service NOVA distinguishes between so-called global ECs and local ECs. A
global EC can be equipped with CPU time by associating it with an SC. It can perform
IDC calls but it cannot receive IDC calls. In contrast to a global EC, a local EC is able to
receive IDC calls but it has no CPU time. A local EC is not executed before it is called
by another EC.
A regular Genode thread is a global EC. A Genode entrypoint is a local EC. Core
distinguishes both cases based on the instruction-pointer (IP) argument of the CPU
session’s start function. For a local EC, the IP is set to zero.
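The distinction surfaces directly at Genode's API level. The following sketch is
illustrative only (the Worker class is made up; the Thread and Entrypoint APIs are
covered in Chapter 8): a regular thread is started with a non-zero IP and thereby
becomes a global EC on NOVA, whereas an entrypoint becomes a local EC.

  #include <base/component.h>
  #include <base/entrypoint.h>
  #include <base/thread.h>

  /* hypothetical worker thread - on NOVA, a global EC with an SC */
  struct Worker : Genode::Thread
  {
    Worker(Genode::Env &env) : Genode::Thread(env, "worker", 16*1024) { }

    void entry() override { /* executes on its own CPU time */ }
  };

  void Component::construct(Genode::Env &env)
  {
    static Worker worker(env);
    worker.start();  /* created via the CPU session with a non-zero IP */

    /* an additional entrypoint - on NOVA, a local EC without CPU time,
       created with an IP of zero - runs only when serving RPC requests */
    static Genode::Entrypoint ep(env, 16*1024, "ep",
                                 Genode::Affinity::Location());
  }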

RAM and IO_MEM services Core's RAM and IO_MEM allocators are initialized based on the
information found in NOVA’s HIP.

ROM service Core’s ROM service provides all boot modules as ROM modules. Ad-
ditionally, a copy of NOVA’s HIP is provided as a ROM module named “hypervi-
sor_info_page”.

IRQ service NOVA represents each interrupt as a semaphore created by the kernel.
By registration of a Genode signal context capability via the sigh method of the Irq_ses-
sion interface, the semaphore of the signal context capability is bound to the interrupt
semaphore. Genode signals and NOVA semaphores are handled as described in Section 7.8.6.
Upon the initial IRQ session’s ack_irq call, a NOVA semaphore-down operation is is-
sued within core on the interrupt semaphore, which implicitly unmasks the interrupt
at the CPU. When the interrupt occurs, the kernel masks the interrupt at the CPU and
performs the semaphore-up operation on the IRQ’s semaphore. Thereby, the chained
semaphore, which is the beforehand registered Genode signal context, is triggered and
the interrupt is delivered as Genode signal. The interrupt gets acknowledged and un-
masked by calling the IRQ session’s ack_irq method.
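From a device driver's point of view, this interplay is hidden behind the regular
IRQ-session interface. The following is a minimal sketch, not a verbatim driver
(the interrupt number 9 and the Main structure are made up for illustration):

  #include <base/component.h>
  #include <irq_session/connection.h>

  struct Main
  {
    Genode::Env &_env;

    Genode::Irq_connection _irq { _env, 9 };  /* hypothetical IRQ number */

    Genode::Signal_handler<Main> _irq_handler {
      _env.ep(), *this, &Main::_handle_irq };

    void _handle_irq()
    {
      /* ... operate the device ... */

      _irq.ack_irq();  /* acknowledge and thereby unmask the interrupt */
    }

    Main(Genode::Env &env) : _env(env)
    {
      _irq.sigh(_irq_handler);
      _irq.ack_irq();  /* the initial ack implicitly unmasks the interrupt */
    }
  };

  void Component::construct(Genode::Env &env) { static Main main(env); }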

7.8.5. Page-fault handling on NOVA

On NOVA, each EC has a pre-defined range of portal selectors. For each type of excep-
tion, the range has a dedicated portal that is entered in the event of an exception. The
page-fault portal of a Genode thread is defined at the creation time of the thread and
points to a pager EC per CPU within core. Hence, for each CPU, a pager EC in core
pages all Genode threads running on the same CPU.

The operation of pager ECs When an EC triggers a page fault, the faulting EC im-
plicitly performs an IDC call to its pager. The IDC message contains the fault infor-
mation. For resolving the page fault, core follows the procedure described in Section 7.3.4. If
the lookup for a dataspace within the faulter’s region map succeeds, core establishes
a memory mapping into the EC’s PD by invoking the asynchronous map operation of
the kernel and replies to the IDC message. In the case where the region lookup within
the thread’s corresponding region map fails, the faulted thread is retained in a blocked
state via a kernel semaphore. In the event that the fault is later resolved by a region-map
client as described in the paragraph “Region is empty” of Section 7.3.4, the semaphore
gets released, thus resuming the execution of the faulted thread. The faulter will imme-
diately trigger another fault at the same address. This time, however, the region lookup
succeeds.

Mapping database NOVA tracks memory mappings in a data structure called map-
ping database and has the notion of the delegation of memory mappings (rather than the
delegation of memory access). Memory access can be delegated only if the originator of
the delegation has a mapping. Core is the only exception because it can establish map-
pings originating from the physical memory space. Because mappings can be delegated
transitively between PDs, the mapping database is a tree where each node denotes the
delegation of a mapping. The tree is maintained in order to enable the kernel to rescind
the authority. When a mapping is revoked, the kernel implicitly cancels all transitive
mappings that originated from the revoked node.

7.8.6. Asynchronous notifications on NOVA

To support asynchronous notifications as described in Section 3.6.2, we extended the
NOVA kernel semaphores to support signalling via chained NOVA semaphores. This
extension enables the creation of kernel semaphores with a per-semaphore value, which
can be bound to another kernel semaphore. Each bound semaphore corresponds to
a Genode signal context. The per-semaphore value is used to distinguish different
sources of signals.
On this base platform, the blocking of the signal thread at the signal source is re-
alized by using a kernel semaphore shared by the PD session and the PD client. All
chained semaphores (signal contexts) are bound to this semaphore. When first issuing
a wait-for-signal operation at the signal source, the client requests a capability selector
for the shared semaphore (repos/base-nova/include/signal_session/source_client.h). It then
performs a down operation on this semaphore to block.


If a signal sender issues a submit operation on a Genode signal capability, then a
regular NOVA kernel semaphore-up syscall is used. If the kernel detects that the used
semaphore is chained to another semaphore, the up operation is delegated to the one
received during the initial wait-for-signal operation of the signal receiving thread.
In contrast to other base platforms, Genode’s signal API is supported by the kernel so
that the propagation of signals does not require any interaction with core’s PD service.
However, the creation of signal contexts is arbitrated by the PD service.

7.8.7. IOMMU support

As discussed in Section 4.1.3, misbehaving device drivers may exploit DMA trans-
actions to circumvent their component boundaries. When executing Genode on the
NOVA microhypervisor, however, bus-master DMA is subjected to the IOMMU.
The NOVA kernel applies a subset of the (MMU) address space of a protection do-
main to the (IOMMU) address space of a device. So the device’s address space can be
managed in the same way as one normally manages the address space of a PD. The
only missing link is the assignment of device address spaces to PDs. This link is pro-
vided by the dedicated system call assign_pci that takes a PD capability selector and a
device identifier as arguments. The PD capability selector represents the authorization
over the protection domain, which is going to be targeted by DMA transactions. The
device identifier is a virtual address where the extended PCI configuration space of the
device is mapped in the specified PD. Only if a user-level device driver has access to
the extended PCI configuration space of the device is it able to get the assignment in
place.
To make NOVA's IOMMU support available to Genode, the ACPI driver has the
ability to look up the extended PCI configuration space region for all devices and re-
ports it via a Genode ROM. The platform driver on x86 evaluates the reported ROM
and uses the information to obtain the extended PCI configuration space of each device
transparently for platform clients (device drivers). The platform driver uses a NOVA-
specific extension (assign_pci) to the PD session interface to associate a PCI device with
a protection domain.
Even though these mechanisms combined should in theory suffice to let drivers op-
erate with the IOMMU enabled, in practice, the situation is a bit more complicated.
Because NOVA uses the same virtual-to-physical mappings for the device as it uses
for the process, the DMA addresses the driver needs to supply to the device must be
virtual addresses rather than physical addresses. Consequently, to be able to make a
device driver usable on systems without IOMMU as well as on systems with IOMMU,
the driver needs to become IOMMU-aware and distinguish both cases. This is an un-
fortunate consequence of the otherwise elegant mechanism provided by NOVA. To re-
lieve the device drivers from worrying about both cases, Genode decouples the virtual
address space of the device from the virtual address space of the driver. The former
address space is represented by a dedicated protection domain called device PD, in-
dependent of the driver. Its sole purpose is to hold mappings of DMA buffers that are
accessible by the associated device. By using one-to-one physical-to-virtual mappings
for those buffers within the device PD, each device PD contains a subset of the physical
address space. The platform driver performs the assignment of device PDs to PCI de-
vices. If a device driver intends to use DMA, it allocates a new DMA buffer for a specific
PCI device at the platform driver. The platform driver responds to such a request by
allocating a RAM dataspace at core, attaching it to the device PD using the dataspace’s
physical address as virtual address, and by handing out the dataspace capability to the
client. If the driver requests the physical address of the dataspace, the address returned
will be a valid virtual address in the associated device PD. This design implies that
a device driver must allocate DMA buffers at the platform driver (specifying the PCI
device the buffer is intended for) instead of using core’s PD service to allocate buffers
anonymously.
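From the driver's perspective, the procedure boils down to two calls to the platform
session. The following sketch assumes the current platform-session interface
(alloc_dma_buffer, dma_addr); the buffer size and cache attribute are arbitrary
examples:

  #include <base/component.h>
  #include <platform_session/connection.h>

  void Component::construct(Genode::Env &env)
  {
    static Platform::Connection platform { env };

    /* the platform driver attaches the dataspace to the device PD */
    Genode::Ram_dataspace_capability ds =
      platform.alloc_dma_buffer(4096, Genode::UNCACHED);

    /* the address to program into the device's DMA engine - within
       the device PD, it maps one-to-one to the allocated buffer */
    Genode::addr_t const dma = platform.dma_addr(ds);
    (void)dma;
  }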

7.8.8. Genode-specific modifications of the NOVA kernel

NOVA is not ready to be used as a Genode base platform as is. This section compiles the
modifications that were needed to meet the functional requirements of the framework.
All modifications are maintained at the following repository:

Genode’s version of NOVA
https://github.com/alex-ab/NOVA.git

The repository contains a separate branch for each version of NOVA that has been used
by Genode. When preparing the NOVA port using the port description at repos/base-
nova/ports/nova.port, the NOVA branch that matches the used Genode version is checked
out automatically. The port description refers to a specific commit ID. The commit his-
tory of each branch within the NOVA repository corresponds to the history of the origi-
nal NOVA kernel followed by a series of Genode-specific commits. Each time NOVA is
updated, a new branch is created and all Genode-specific commits are rebased on top
of the history of the new NOVA version. This way, the differences between the orig-
inal NOVA kernel and the Genode version remain clearly documented. The Genode-
specific modifications solve the following problems:

Destruction of kernel objects
NOVA does not support the destruction of kernel objects. That is, PDs and ECs can
be created but not destroyed. With Genode being a dynamic system, kernel-object
destruction is a mandatory feature.

Inter-processor IDC
On NOVA, only local ECs can receive IDC calls. Furthermore each local EC
is bound to a particular CPU (hence the name “local EC”). Consequently, syn-
chronous inter-component communication via IDC calls is possible only between
ECs that both reside on the same CPU but can never cross CPU boundaries. Un-
fortunately, IDC is the only mechanism for the delegation of capabilities. Conse-
quently, authority cannot be delegated between subsystems that reside on differ-
ent CPUs. For Genode, this scheme is too rigid.
Therefore, the Genode version of NOVA introduces inter-CPU IDC calls. When
calling an EC on another CPU, the kernel creates a temporary EC and SC on the
target CPU as a representative of the caller. The calling EC is blocked. The tempo-
rary EC uses the same UTCB as the calling EC. Thereby, the original IDC message
is effectively transferred from one CPU to the other. The temporary EC then per-
forms a local IDC to the destination EC using NOVA’s existing IDC mechanism.
Once the temporary EC receives the reply (with the reply message contained in
the caller’s UTCB), the kernel destroys the temporary EC and SC and unblocks
the caller EC.

Support for priority-inheriting spinlocks
Genode’s lock mechanism relies on a yielding spinlock for protecting the lock
meta data. On most base platforms, there exists the invariant that all threads
of one component share the same CPU priority. So priority inversion within a
component cannot occur. NOVA breaks this invariant because the scheduling pa-
rameters (SC) are passed along IDC call chains. Consequently, when a client calls
a server, the SCs of both client and server reside within the server. These SCs may
have different priorities. The use of a naive spinlock for synchronization will pro-
duce priority inversion problems. The kernel has been extended with the mech-
anisms needed to support the implementation of priority-inheriting spinlocks in
userland.

Combination of capability delegation and translation
As described in Section 3.1.4, there are two cases when a capability is specified as
an RPC argument. The callee may already have a capability referring to the spec-
ified object identity. In this case, the callee expects to receive the corresponding
local name of the object identity. In the other case, when the callee does not yet
have a capability for the object identity, it obtains a new local name that refers to
the delegated capability.
NOVA does not support this mechanism per se. When specifying a capability
selector as map item for an IDC call, the caller has to specify whether a new map-
ping should be created or the translation of the local names should be performed
by the kernel. However, in the general case, this question is not decidable by the
caller. Hence, NOVA had to be changed to take the decision depending on the
existence of a valid translation for the specified capability selector.

Support for deferred page-fault resolution
With the original version of NOVA, the maximum number of threads is limited
by core’s stack area: NOVA’s page-fault handling protocol works completely syn-
chronously. When a page fault occurs, the faulting EC enters its page-fault portal
and thereby activates the corresponding pager EC in core. If the pager’s lookup
for a matching dataspace within the faulter’s region map succeeds, the page fault
is resolved by delegating a memory mapping as the reply to the page-fault IDC
call. However, if a page fault occurs on a managed dataspace, the pager cannot
resolve it immediately. The resolution must be delayed until the region-map fault
handler (outside of core) responds to the fault signal. In order to enable core to
serve page faults of other threads in the meantime, each thread has its dedicated
pager EC in core.
Each pager EC, in turn, consumes a slot in the stack area within core. Since core’s
stack area is limited, the maximum number of ECs within core is limited too. Be-
cause one core EC is needed as pager for each thread outside of core, the available
stacks within core become a limited resource shared by all CPU-session clients.
Because each Genode component is a client of core’s CPU service, this bounded
resource is effectively shared among all components. Consequently, the alloca-
tion of threads on NOVA’s version of core represents a possible covert storage
channel.
To avoid the downsides described above, we extended the NOVA IPC reply sys-
tem call to specify an optional semaphore capability selector. The NOVA kernel
validates the capability selector and blocks the faulting thread in the semaphore.
The faulted thread remains blocked even after the pager has replied to the fault
message. But the pager immediately becomes available for other page-fault re-
quests. With this change, it suffices to maintain only one pager thread per CPU
for all client threads.
The benefits are manifold. First, the base-nova implementation converges more
closely to other Genode base platforms. Second, core cannot run out of threads
anymore as the number of threads in core is fixed for a given setup. And the third
benefit is that the helping mechanism of NOVA can be leveraged for concurrently
faulting threads.

Remote revocation of memory mappings In the original version of NOVA, roottask
must retain mappings to all memory used throughout the system. In order to
be able to delegate a mapping to another PD as response of a page fault, it must
possess a local mapping of the physical page. Otherwise, it would not be able to
revoke the mapping later on because the kernel expects roottask’s mapping node
as a proof of the authorization for the revocation of the mapping. Consequently,
even though roottask never touches memory handed out to other components, it
needs to have memory mappings with full access rights installed within its virtual
address space.


To relieve Genode’s roottask (core) from the need to keep local mappings for all
memory handed out to other components and thereby let core benefit from a
sparsely populated address space as described in Section 7.7.8 for base-hw, we
changed the kernel’s revoke operation to take a PD selector and a virtual address
within the targeted PD as argument. By presenting the PD selector as a token
of authorization over the entire PD, we no longer need core-locally installed
mappings as the proof of authorization. Hence, memory mappings can always be
installed directly from the physical address space to the target PD.

Support for write-combined access to memory-mapped I/O resources The origi-
nal version of NOVA is not able to benefit from write combining because the
kernel interface does not allow the userland to specify cacheability attributes for
memory mappings. To achieve good throughput to the framebuffer, write com-
bining is crucial. Hence, we extended the kernel interface to allow the userland to
propagate cacheability attributes to the page-table entries of memory mappings
and set up the x86 page attribute table (PAT) with a configuration for write com-
bining.

Support for the virtualization of 64-bit guest operating systems The original ver-
sion of NOVA supports 32-bit guest operating systems only. We enhanced the kernel to
also support 64-bit guests.

Resource quotas for kernel resources The NOVA kernel lacks the ability to adapt
the kernel memory pool to the behavior of the userland. The kernel memory
pool has a fixed size, which cannot be changed at runtime. Even though we have
not removed this principal limitation, we extended the kernel with the ability
to subject kernel-memory allocations to a user-level policy at the granularity of
PDs. Each kernel operation that consumes kernel memory is accounted to a PD,
whereas each PD has a limited quota of kernel memory. This measure prevents
arbitrary userland programs from bringing down the entire system by exhausting the
kernel memory. The reach of damage is limited to the respective PD.

Asynchronous notification mechanism We extended the NOVA kernel semaphores
to support signalling via chained NOVA semaphores. This extension enables the
creation of kernel semaphores with a per-semaphore value, which can be bound
to another kernel semaphore. Each bound semaphore corresponds to a Genode
signal context. The per-semaphore value is used to distinguish different sources
of signals. Now, a signal sender issues a submit operation on a Genode signal
capability via a regular NOVA semaphore-up syscall. If the kernel detects that the
used semaphore is chained to another semaphore, the up operation is delegated
to the chained one. If a thread is blocked, it gets woken up directly and the per-
semaphore value of the bound semaphore gets delivered. In case no thread is
currently blocked, the signal is stored and delivered as soon as a thread issues the
next semaphore-down operation.
Chaining semaphores is an operation that is limited to a single level, which avoids
attacks targeting endless loops in the kernel. The creation of such signals can
solely be performed if the issuer has a NOVA PD capability with the semaphore-
create permission set. On Genode, this effectively reserves the operation to core.
Furthermore, our solution preserves the invariant of the original NOVA kernel
that a thread may be blocked in only one semaphore at a time.

Interrupt delivery We applied the same principle of the asynchronous notification ex-
tension to the delivery of interrupts by the NOVA kernel. Interrupts are delivered
as ordinary Genode signals, which alleviates the need for one thread per inter-
rupt as required by the original NOVA kernel. The interrupt gets directly deliv-
ered to the address space of the driver in case of a Message Signalled Interrupt
(MSI), or in case of a shared interrupt, to the x86 platform driver.

7.8.9. Known limitations of NOVA

This section summarizes the known limitations of NOVA and the NOVA version of
core.

Fixed amount of kernel memory NOVA allocates kernel objects out of a memory
pool of a fixed size. The pool is dimensioned in the kernel's linker script
nova/src/hypervisor.ld (at the symbol _mempool_f).

Bounded number of object capabilities within core For each capability created
via core’s PD service, core allocates the corresponding NOVA portal or NOVA
semaphore and maintains the capability selector during the lifetime of the associ-
ated object identity. Each allocation of a capability via core’s PD service consumes
one entry in core’s capability space. Because the space is bounded, clients of the
service could misuse core's capability space as a covert storage channel.

Part II.
Reference

8. API

This chapter complements the architectural information of Chapters 3 and 4 with a
thorough description of the framework's C++ programming interface. The material is
partially generated from the source code, specifically the public header files of the base
and os source-code repositories. The location of a header in either of the two reposito-
ries depends on the role of the respective header. Only if the interface is fundamentally
required by the core component or by the base framework itself is it contained in the
base repository. Otherwise, the header is found in the os repository.

Scope of the API The Genode API covers everything needed by Genode to be self-
sufficient without any dependency on 3rd-party code. It does not even depend on a C
runtime. It is possible to create components that solely use the raw Genode API, as is
the case for the ones hosted within the repos/os/ repository. Because such components
do not rely on any library code except for the low-complexity Genode base libraries,
they are relatively easy to evaluate. That said, the API must not be mistaken as an
application-level API. It does not present an alternative to established application-level
libraries like the standard C++ library. For example, even though the framework con-
tains a few utilities for processing strings, those utilities merely exist to support the
use cases present in Genode.

General conventions The functional specification adheres to the following general
conventions:

• Static member functions are called class functions. All other member functions are
called methods.

• Both structs and classes are referred to as classes. There is no distinction between
both at the API level.

• Constant functions with no argument and a return value are called accessors. Tech-
nically, they are “getters” but since Genode does not have any “setters”, accessors
are synonymous to “getters”. Because those functions are so simple, they are
listed in the class overview and are not described separately.

• Classes that merely inherit other classes are called sub types. In contrast to the
use of a typedef, such classes are not mere name aliases but represent a new
type. The type is compatible with the base class but not vice versa. For example,
exception types are sub types of the Exception class.

• Namespace-wide information is presented with a light yellow background.

• Each non-trivial class is presented in the form of a class diagram on a blue-shaded
background. The diagram shows the public base classes and the list of public
class functions and methods. If the class is a template, the template arguments
are annotated at the top-right corner of the class. If the class represents an RPC
interface, an information box is attached. The methods listed in the diagram can
be used as hyperlinks to further descriptions if available.

• Method descriptions are presented on a green-shaded background; global func-
tions are presented on an orange-shaded background.

• The directory path presented under “Header” sections is a hyperlink to the corre-
sponding source code at GitHub.


8.1. API primitives

8.1.1. Capability types

As described in Section 3.1, inter-component communication is based on capabilities. A
capability refers to a system-wide unique object identity and can be delegated among
components. At API level, each capability is associated with the type of the RPC in-
terface the capability refers to - similar to how a C++ reference refers to the type of a
specific C++ object.

Genode Capability
Capability referring to a specific RPC interface class template

Untyped_capability

RPC_INTERFACE
Capability

Capability(. . .)
Capability()

Template argument

RPC_INTERFACE typename
Class containing the RPC interface declaration

Header
repos/base/include/base/capability.h

Genode Capability Capability


constructor template
Template argument

FROM_RPC_INTERFACE typename

Argument

cap Capability<FROM_RPC_INTERFACE> const &

This implicit constructor checks at compile time for the compatibility of the source and
target capability types. The construction is performed only if the target capability type
is identical to or a base type of the source capability type.
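For illustration, consider the following sketch with two made-up RPC interface
types. The implicit constructor permits the upcast but rejects the downcast at
compile time:

  #include <base/capability.h>
  #include <util/interface.h>

  struct Device       : Genode::Interface { };  /* hypothetical interfaces */
  struct Block_device : Device { };

  static void sketch(Genode::Capability<Block_device> block_cap)
  {
    /* compiles: Device is a base type of Block_device */
    Genode::Capability<Device> device_cap = block_cap;

    /* would not compile: Block_device is not a base type of Device */
    // Genode::Capability<Block_device> bad = device_cap;

    (void)device_cap;
  }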


Genode Capability Capability

Default constructor creates invalid capability constructor


8.1.2. Sessions and connections

Servers provide their services over session-based communication channels. A Session
type is defined as an abstract interface inherited from the Session base class.

Genode Session
Base class of session interfaces class

Session

~Session()

Header
repos/base/include/session/session.h

Each session interface has to provide an implementation of the following class func-
tion that returns the name of the service as constant string.

static const char *service_name();

This function is used by the framework for the announcement of the service’s root
interface at the component’s parent. The string returned by this function corresponds
to the service name as used in the system configuration (Section 6).
The interaction of a client with a server involves the definition of session-construction
arguments, the request of the session creation via its parent, the initialization of the
matching RPC-client stub code with the received session capability, the actual use of
the session interface, and the closure of the session. The Connection template class
provides a way to greatly simplify the handling of session arguments, session creation,
and destruction on the client side. By implementing a service-specific connection class
inherited from Connection, session arguments become plain constructor arguments,
session functions can be called directly on the Connection object, and the session gets
properly closed when destructing the Connection.
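For example, when using the timer service via its Timer::Connection utility
(provided by the os repository), the pattern looks as follows:

  #include <base/component.h>
  #include <timer_session/connection.h>

  void Component::construct(Genode::Env &env)
  {
    /* session arguments are reduced to constructor arguments */
    Timer::Connection timer { env };

    /* session functions are called directly on the connection object */
    timer.msleep(250);

    /* the session is closed once 'timer' goes out of scope */
  }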


Genode Connection
Representation of an open connection to a service class template

Connection_base

SESSION_TYPE
Connection

Connection(. . .)
Connection(. . .)
~Connection()
cap() : Capability<SESSION_TYPE>

Template argument

SESSION_TYPE typename

Header
repos/base/include/base/connection.h

Genode Connection Connection


constructor
Arguments

env Env &

label Session_label const &

ram_quota Ram_quota const &

affinity Affinity const &

args Args const &

Genode Connection Connection


constructor
Arguments

env Env &

label Session_label const &

ram_quota Ram_quota const &

args Args const &

Shortcut for the common case where the affinity is not specified.


Genode Connection cap


const method
Return value
Capability<SESSION_TYPE> Session capability


8.1.3. Dataspace interface

The dataspace abstraction described in Section 3.4.1 is the fundamental API primitive
for representing a container of memory as provided by core’s PD, IO_MEM, or ROM
services. Each dataspace is referenced by a capability that can be passed among com-
ponents. Each component with the capability to a dataspace can access the dataspace’s
content by attaching the dataspace to the region map of its PD session. In addition
to the use as arguments for region-map operations, dataspaces provide the following
interface.
Genode Dataspace
Dataspace interface class

Interface

Dataspace

~Dataspace() RPC interface


size() : size_t
writeable() : bool

Header
repos/base/include/dataspace/dataspace.h

Genode Dataspace size

Request size of dataspace pure virtual method

Return value
size_t

Genode Dataspace writeable


pure virtual method
Return value
bool True if dataspace is writeable

Attached dataspace As a utility for handling the common case where a dataspace
is attached to the component’s address space as long as a certain object (like a session)
exists, an instance of an Attached_dataspace can be hosted as a member variable.
When destructed, the dataspace will be automatically detached from the component’s
address space.

Genode Attached_dataspace
Utility to attach a dataspace to the local address space class

Attached_dataspace

Attached_dataspace(. . .)
~Attached_dataspace()
cap() : Dataspace_capability
local_addr() : T *
local_addr() : T const *
size() : size_t
invalidate()

Header
repos/base/include/base/attached_dataspace.h

Genode Attached_dataspace Attached_dataspace


constructor
Arguments

rm Region_map &

ds Dataspace_capability

Exceptions

Region_map::Region_conflict
Region_map::Invalid_dataspace
Out_of_caps
Out_of_ram

Genode Attached_dataspace cap


const method
Return value
Dataspace_capability Capability of the used dataspace


Genode Attached_dataspace local_addr

Request local address method template

Template argument

T typename

Return value
T *

This is a template to avoid inconvenient casts at the caller. A newly attached dataspace
is untyped memory anyway.
Genode Attached_dataspace local_addr
const method template
Template argument

T typename

Return value
T const *

Genode Attached_dataspace size


const method
Return value
size_t Size

Genode Attached_dataspace invalidate

Forget dataspace, thereby skipping the detachment on destruc- method


tion

This method can be called if the dataspace is known to be physically destroyed,
e. g., because the session where the dataspace originated from was closed. In this case,
core will already have removed the memory mappings of the dataspace. So we have to
omit the detach operation in ~Attached_dataspace.
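A usage sketch (the surrounding Buffer_user class is made up for illustration):

  #include <base/attached_dataspace.h>

  struct Buffer_user
  {
    /* attached on construction, automatically detached on destruction */
    Genode::Attached_dataspace _ds;

    Buffer_user(Genode::Region_map &rm, Genode::Dataspace_capability cap)
    : _ds(rm, cap)
    {
      char *content = _ds.local_addr<char>();

      /* ... access up to _ds.size() bytes at 'content' ... */
      (void)content;
    }
  };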


8.2. Component execution environment

Each component has to provide an implementation of the Component interface as illus-
trated by the example given in Section 2.5.2.

Component
namespace
Hook functions for bootstrapping the component
Functions
• stack_size() : Genode::size_t
• construct(. . .)
Header
repos/base/include/base/component.h

Component stack_size
global function
Return value
Genode::size_t Stack size of the component’s initial entrypoint

Component construct

Construct component global function

Argument

env Env &


Genode
namespace
Interface to the component's execution environment
Header
repos/base/include/base/component.h
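Taken together, a minimal component merely implements these hook functions,
for example:

  #include <base/component.h>
  #include <base/log.h>

  /* dimension the stack of the component's initial entrypoint */
  Genode::size_t Component::stack_size() { return 64*1024; }

  void Component::construct(Genode::Env &)
  {
    Genode::log("component constructed");
  }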

8.2.1. Interface to the component’s environment

As described in Section 3.5, each component consists of a protection domain (PD ses-
sion), a LOG session, a ROM session with the component’s executable binary, and a
CPU session, from which the main thread is created. These sessions form the envi-
ronment of the component, which is represented by the Env interface class. The en-
vironment is provided to the component as argument to the Component::construct
function.
Genode Env
Component environment class

Interface

Env

parent() : Parent &


cpu() : Cpu_session &
rm() : Region_map &
pd() : Pd_session &
ram() : Ram_allocator &
ep() : Entrypoint &
cpu_session_cap() : Cpu_session_capability
pd_session_cap() : Pd_session_capability
id_space() : Id_space<Parent::Client> &
session(. . .) : Session_capability
session(. . .) : Capability<SESSION_TYPE>
upgrade(. . .)
close(. . .)
exec_static_constructors()
try_session(. . .) : Session_capability

Header
repos/base/include/base/env.h

Genode Env parent


pure virtual method
Return value
Parent &

Genode Env cpu

CPU session of the component pure virtual method

Return value
Cpu_session &

This session is used to create the threads of the component.


Genode Env rm

Region map of the component’s address space pure virtual method

Return value
Region_map &

Genode Env pd

PD session of the component as created by the parent pure virtual method

Return value
Pd_session &

Genode Env ram

Memory allocator method

Return value
Ram_allocator &

Genode Env ep

Entrypoint for handling RPC requests and signals pure virtual method

Return value
Entrypoint &

Genode Env cpu_session_cap


pure virtual method
Return value
Cpu_session_capability The CPU-session capability of the component

Genode Env pd_session_cap


pure virtual method
Return value
Pd_session_capability The PD-session capability of the component

Genode Env id_space

ID space of sessions obtained from the parent pure virtual method

Return value
Id_space<Parent::Client> &


Genode Env session

Create session with quota upgrades as needed pure virtual method

Arguments

- Service_name const &

- Id

- Session_args const &

- Affinity const &

Exception

Service_denied

Return value
Session_capability

In contrast to try_session, this method implicitly handles Insufficient_ram_quota
and Insufficient_cap_quota by successively increasing the session quota. On the
occurrence of an Out_of_ram or Out_of_caps exception, a resource request is issued
to the parent.


Genode Env session

Create session to a service method template


Template argument

SESSION_TYPE typename
Session interface type

Arguments

id Id
Session ID of new session
args Session_args const &
Session constructor arguments
affinity Affinity const &
Preferred CPU affinity for the session

Exception

Service_denied

Return value
Capability<SESSION_TYPE>

See the documentation of Parent::session.


This method blocks until the session is available or an error occurred.
Genode Env upgrade

Upgrade session quota pure virtual method

Arguments

id Id
ID of recipient session
args Upgrade_args const &
Description of the amount of quota to transfer

Exceptions

Out_of_ram
Out_of_caps

See the documentation of Parent::upgrade.


The args argument has the same principle format as the args argument of the session
operation.


Genode Env close

Close session and block until the session is gone pure virtual method

Argument

- Id

Genode Env exec_static_constructors

Execute pending static constructors pure virtual method

On component startup, the dynamic linker does not call possible static constructors in
the binary and shared libraries the binary depends on. If the component requires static
construction it needs to call this function at construction time explicitly. For example,
the libc implementation executes this function before constructing libc components.
Genode Env try_session

Attempt the creation of a session pure virtual method

Arguments

- Service_name const &

- Id

- Session_args const &

- Affinity const &

Exceptions

Service_denied
Insufficient_cap_quota
Insufficient_ram_quota
Out_of_caps
Out_of_ram

Return value
Session_capability


8.2.2. Parent interface

At its creation time, the only communication partner of a component is its immediate
parent. The parent can be reached via the interface returned by Env::parent().

Genode Parent
Parent interface class

Parent

~Parent()
exit(. . .)
announce(. . .)
announce(. . .)
announce(. . .)
session_sigh(. . .)
session(. . .) : Session_capability
session_cap(. . .) : Session_capability
upgrade(. . .) : Upgrade_result
RPC interface
close(. . .) : Close_result
session_response(. . .)
deliver_session_cap(. . .)
main_thread_cap() : Thread_capability
resource_avail_sigh(. . .)
resource_request(. . .)
yield_sigh(. . .)
yield_request() : Resource_args
yield_response()
heartbeat_sigh(. . .)
heartbeat_response()

Header
repos/base/include/parent/parent.h


Genode Parent exit

Tell parent to exit the program pure virtual method

Argument

exit_value int

Genode Parent announce

Announce service to the parent pure virtual method

Argument

service_name Service_name const &

Genode Parent announce

Emulation of the original synchronous root interface class function

Arguments

service_name Service_name const &

service_root Root_capability

This method transparently spawns a proxy “root” entrypoint that dispatches asyn-
chronous session-management operations (as issued by the parent) to the local root
interfaces via component-local RPC calls.
The method solely exists for API compatibility.
Genode Parent announce

Announce service to the parent method template

Template argument

ROOT_INTERFACE typename

Argument

service_root Capability<ROOT_INTERFACE> const &


Root capability

The type of the specified service_root capability must match an interface that pro-
vides a Session_type type (i. e., a Typed_root interface). This Session_type is ex-
pected to host a class function called service_name returning the name of the provided
interface as null-terminated string.


Genode Parent session_sigh

Register signal handler for session notifications pure virtual method

Argument

- Signal_context_capability

Genode Parent session

Create session to a service pure virtual method


Arguments

id Id
Client-side ID of new session
service_name Service_name const &
Name of the requested interface
args Session_args const &
Session constructor arguments
affinity Affinity const &
Preferred CPU affinity for the session
Default is Affinity()

Exceptions

Service_denied Parent denies session request


Insufficient_cap_quota Donated cap quota does not suffice
Insufficient_ram_quota Donated RAM quota does not suffice
Out_of_caps Session CAP quota exceeds our resources
Out_of_ram Session RAM quota exceeds our resources

Return value
Session_capability Session capability if the new session is immediately available,
or an invalid capability

If the returned capability is invalid, the request is pending at the server. The parent de-
livers a signal to the handler as registered via session_sigh once the server responded
to the request. Now the session capability can be picked up by calling session_cap.


Genode Parent session_cap

Request session capability pure virtual method

Argument

id Id

Exceptions

Service_denied
Insufficient_cap_quota
Insufficient_ram_quota

Return value
Session_capability

See session for more documentation.


In the exception case, the parent implicitly closes the session.
Genode Parent upgrade

Transfer our quota to the server that provides the specified ses- pure virtual method
sion
Arguments

to_session Id

args Upgrade_args const &


Description of the amount of quota to transfer

Exceptions

Out_of_caps
Out_of_ram

Return value
Upgrade_result

The args argument has the same principle format as the args argument of the session
operation.


Genode Parent close

Close session pure virtual method


Argument

- Id

Return value
Close_result

Genode Parent session_response

Set state of a session provided by the child service pure virtual method

Arguments

- Id

- Session_response

Genode Parent deliver_session_cap

Deliver capability for a new session provided by the child service pure virtual method

Arguments

- Id

- Session_capability

Genode Parent main_thread_cap

Provide thread_cap of main thread pure virtual const method

Return value
Thread_capability

Genode Parent resource_avail_sigh

Register signal handler for resource notifications pure virtual method

Argument

sigh Signal_context_capability


Genode Parent resource_request

Request additional resources pure virtual method

Argument

args Resource_args const &

By invoking this method, a component is able to inform its parent about the need for
additional resources. The argument string contains a resource description in the same
format as used for session-construction arguments. In particular, for requesting ad-
ditional RAM quota, the argument looks like “ram_quota=<amount>” where amount
is the amount of additional resources expected from the parent. If the parent com-
plies with the request, it submits a resource-available signal to the handler registered
via resource_avail_sigh(). On the reception of such a signal, the component can
re-evaluate its resource quota and resume execution.
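For instance, a component that detects an out-of-memory situation might issue a
request like the following sketch (the amount is an arbitrary example):

  #include <base/component.h>

  void Component::construct(Genode::Env &env)
  {
    /* ask the parent for one megabyte of additional RAM quota */
    env.parent().resource_request("ram_quota=1M");
  }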
Genode Parent yield_sigh

Register signal handler for resource yield notifications pure virtual method

Argument

sigh Signal_context_capability

Using the yield signal, the parent is able to inform the component about its wish to re-
gain resources.
Genode Parent yield_request

Obtain information about the amount of resources to free pure virtual method
Return value
Resource_args

The amount of resources returned by this method is the goal set by the parent. It
is not commanded but merely meant as a friendly request to cooperate. The compo-
nent is not obligated to comply. If the component decides to take action to free re-
sources, it can inform its parent about the availability of freed up resources by calling
yield_response().


Genode Parent yield_response

Notify the parent about a response to a yield request pure virtual method

Genode Parent heartbeat_sigh

Register heartbeat handler pure virtual method

Argument

sigh Signal_context_capability

The parent may issue heartbeat signals to the child at any time and expects a call of
the heartbeat_response RPC function as response. When observing the RPC call, the
parent infers that the child is still able to respond to external events.
Genode Parent heartbeat_response

Deliver response to a heartbeat signal pure virtual method


8.3. Entrypoint

An entrypoint is a thread that is able to respond to RPC requests and signals. Each
component has at least one initial entrypoint that is created as part of the component’s
environment. It can be accessed via the Env::ep() method.
The Entrypoint::manage and Entrypoint::dissolve methods are used to asso-
ciate respectively disassociate signal handlers and RPC objects with the entrypoint.
Under the hood, those operations interact with the component’s PD session in order
to bind the component’s signal handlers and RPC objects to capabilities.
Note that the current version of the Entrypoint utilizes the former Rpc_entrypoint and
Signal_receiver APIs. The entrypoint is designated to eventually replace those APIs.
Hence, application code should no longer use the Rpc_entrypoint and Signal_receiver
directly.
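The following sketch shows the typical pattern of a signal handler registered at
the component's initial entrypoint. The periodic timeout is merely an example use
case; the Main structure is made up for illustration:

  #include <base/component.h>
  #include <base/log.h>
  #include <timer_session/connection.h>

  struct Main
  {
    Genode::Env &_env;

    Timer::Connection _timer { _env };

    /* construction implicitly 'manage's the handler at the entrypoint,
       destruction 'dissolve's it */
    Genode::Signal_handler<Main> _timeout_handler {
      _env.ep(), *this, &Main::_handle_timeout };

    void _handle_timeout() { Genode::log("timeout triggered"); }

    Main(Genode::Env &env) : _env(env)
    {
      _timer.sigh(_timeout_handler);
      _timer.trigger_periodic(1000*1000);  /* period in microseconds */
    }
  };

  void Component::construct(Genode::Env &env) { static Main main(env); }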

Genode Entrypoint
Entrypoint for serving RPC requests and dispatching signals class

Entrypoint

Entrypoint(. . .)
~Entrypoint()
manage(. . .) : Capability<RPC_INTERFACE>
dissolve(. . .)
manage(. . .) : Signal_context_capability
dissolve(. . .)
dispatch_pending_io_signal() : bool
rpc_ep() : Rpc_entrypoint &
register_io_progress_handler(. . .)

Header
repos/base/include/base/entrypoint.h


Genode Entrypoint Entrypoint


constructor
Arguments

env Env &

stack_size size_t

name char const *

- Location

Genode Entrypoint manage

Associate RPC object with the entry point method template

Template arguments

RPC_INTERFACE typename

RPC_SERVER typename

Argument

obj Rpc_object<RPC_INTERFACE, RPC_SERVER> &

Return value
Capability<RPC_INTERFACE>

Genode Entrypoint dissolve

Dissolve RPC object from entry point method template

Template arguments

RPC_INTERFACE typename

RPC_SERVER typename

Argument

obj Rpc_object<RPC_INTERFACE, RPC_SERVER> &


Genode Entrypoint manage

Associate signal dispatcher with entry point method

Argument

- Signal_dispatcher_base &

Return value
Signal_context_capability

Genode Entrypoint dissolve

Disassociate signal dispatcher from entry point method

Argument

- Signal_dispatcher_base &

Genode Entrypoint dispatch_pending_io_signal

Dispatch single pending I/O-level signal (non-blocking) method

Return value
bool True if a pending signal was dispatched, false if no signal was pending

Genode Entrypoint rpc_ep


method
Return value
Rpc_entrypoint & RPC entrypoint

Genode Entrypoint register_io_progress_handler

Register hook functor to be called after I/O signals are dis- method
patched
Argument

handler Io_progress_handler &


8.4. Region-map interface

A region map represents a (portion of) a virtual memory address space (Section 3.4.2).
By attaching dataspaces to a region map, the content of those dataspaces becomes vis-
ible in the component’s virtual address space. Furthermore, a region map can be used
as a dataspace. Thereby, the nesting of region maps becomes possible. This allows a
component to manage portions of its own virtual address space manually as done for
the stack area and linker area (Section 8.5.1). The use of region maps as so-called man-
aged dataspaces makes it even possible to manage portions of the virtual address space
of other components (Section 8.5.3).
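In the common case, a component operates on the region map of its own address
space as returned by Env::rm(). A sketch, assuming the dataspace stems from a
ROM session (the ROM module name "config" is an arbitrary example):

  #include <base/component.h>
  #include <rom_session/connection.h>

  void Component::construct(Genode::Env &env)
  {
    static Genode::Rom_connection rom { env, "config" };

    /* make the dataspace content visible in the local address space */
    char const * const ptr = env.rm().attach(rom.dataspace());

    /* ... read the content via 'ptr' ... */

    env.rm().detach(ptr);
  }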

Genode Region_map
Region map interface class

Interface

Region_map

attach(. . .) : Local_addr
attach_at(. . .) : Local_addr
attach_executable(. . .) : Local_addr
RPC interface
attach_rwx(. . .) : Local_addr
detach(. . .)
fault_handler(. . .)
state() : State
dataspace() : Dataspace_capability

Header
repos/base/include/region_map/region_map.h


Genode Region_map attach

Map dataspace into region map pure virtual method

Arguments

ds Dataspace_capability
Capability of dataspace to map
size size_t
Size of the locally mapped region; the default (0) denotes the whole dataspace
Default is 0
offset off_t
Start at offset in dataspace (page-aligned)
Default is 0
use_local_addr bool
If set to true, attach the dataspace at the specified local_addr
Default is false
local_addr Local_addr
Local destination address
Default is (void *)0
executable bool
If the mapping should be executable
Default is false
writeable bool
If the mapping should be writeable
Default is true

Exceptions

Invalid_dataspace
Region_conflict
Out_of_ram RAM quota of meta-data backing store is exhausted
Out_of_caps Cap quota of meta-data backing store is exhausted

Return value
Local_addr Address of mapped dataspace within region map


Genode Region_map attach_at

Shortcut for attaching a dataspace at a predefined local address method

Arguments

ds Dataspace_capability

local_addr addr_t

size size_t
Default is 0
offset off_t
Default is 0

Return value
Local_addr

Genode Region_map attach_executable

Shortcut for attaching a dataspace executable at local address method

Arguments

ds Dataspace_capability

local_addr addr_t

size size_t
Default is 0
offset off_t
Default is 0

Return value
Local_addr


Genode Region_map attach_rwx

Shortcut for attaching a dataspace with full rights at local address method

Arguments

ds Dataspace_capability

local_addr addr_t

size size_t
Default is 0
offset off_t
Default is 0

Return value
Local_addr

Genode Region_map detach

Remove region from local address space pure virtual method

Argument

local_addr Local_addr

Genode Region_map fault_handler

Register signal handler for region-manager faults pure virtual method

Argument

handler Signal_context_capability

On Linux, this signal is never delivered because page-fault handling is performed by
the Linux kernel. On microkernel platforms, unresolvable page faults (traditionally
called segmentation faults) will result in the delivery of the signal.
Genode Region_map state

Request current state of region map pure virtual method

Return value
State

Genode Region_map dataspace


pure virtual method
Return value
Dataspace_capability Dataspace representation of region map


8.5. Session interfaces of the base API

8.5.1. PD session interface

The protection-domain (PD) service (Section 3.4.4) enables the creation of address
spaces that are isolated from each other. Each PD session corresponds to a protection
domain. Each PD session is equipped with three region maps, one representing the
PD’s virtual address space, one for the component’s stack area, and one for the compo-
nent’s linker area designated for shared objects. Of those region maps, only the virtual
address space is needed by application code directly. The stack area and linker area
are used by the framework internally. The PD service is rarely needed by applications
directly but is used by the framework on their behalf.
The PD session also represents a resource partition with a budget of physical memory
and capabilities. Within the bounds of its RAM budget, it allows the client to allocate
physical memory in the form of dataspaces.
Analogously, within the bounds of its capability budget, capabilities can be allocated
and associated with a local object or a signal handler. Such a capability can be passed
to another protection domain via an RPC call whereby the receiving protection domain
obtains the right to send messages to the associated object or to trigger the associated
signal handler respectively. The capability-management operations are not used di-
rectly by components at the API level but are used indirectly via the RPC mechanism
described in Section 8.15 and the signalling API described in Section 8.14.
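
To illustrate the resource-accounting role of the PD session, the following sketch allocates a dataspace from the component's own RAM budget and inspects the remaining quota via the Env accessors. Error handling is omitted; the printed value fields are raw byte and capability counts.

   #include <base/component.h>
   #include <base/log.h>

   void Component::construct(Genode::Env &env)
   {
      using namespace Genode;

      /* allocate 16 KiB of RAM, accounted to this component's PD session */
      Ram_dataspace_capability ds = env.ram().alloc(16*1024);

      /* query the budgets of the PD session */
      log("available RAM quota: ", env.pd().avail_ram().value);
      log("available cap quota: ", env.pd().avail_caps().value);

      /* hand the memory back to the PD session */
      env.ram().free(ds);
   }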


Genode Pd_session
Protection domain (PD) session interface class

Session Ram_allocator

Pd_session

~Pd_session()
assign_parent(. . .)
assign_pci(. . .) : bool
map(. . .)
alloc_signal_source() : Signal_source_capability
free_signal_source(. . .)
alloc_context(. . .) : Capability<Signal_context>
free_context(. . .)
submit(. . .)
alloc_rpc_cap(. . .) : Native_capability
free_rpc_cap(. . .)
address_space() : Capability<Region_map>
stack_area() : Capability<Region_map> RPC interface
linker_area() : Capability<Region_map>
ref_account(. . .)
transfer_quota(. . .)
cap_quota() : Cap_quota
used_caps() : Cap_quota
avail_caps() : Cap_quota
transfer_quota(. . .)
ram_quota() : Ram_quota
used_ram() : Ram_quota
avail_ram() : Ram_quota
native_pd() : Capability<Native_pd>
system_control_cap(. . .) : Capability<System_control>
dma_addr(. . .) : addr_t
attach_dma(. . .) : Attach_dma_result

Header
repos/base/include/pd_session/pd_session.h


Genode Pd_session assign_parent

Assign parent to protection domain pure virtual method

Argument

parent Capability<Parent>
Capability of parent interface

Genode Pd_session assign_pci

Assign PCI device to PD pure virtual method

Arguments

pci_config_memory_address addr_t

bdf uint16_t

Return value
bool

The specified address has to refer to the locally mapped PCI configuration space of the
device.
This function is solely used on the NOVA kernel.
Genode Pd_session map

Trigger eager insertion of page frames to page table within specified virtual range pure virtual method
Arguments

virt addr_t
Virtual address within the address space to start
size addr_t
The virtual size of the region

Exceptions

Out_of_ram
Out_of_caps

If the used kernel does not support this feature, the operation silently ignores the
request.


Genode Pd_session alloc_signal_source

Create a new signal source pure virtual method

Exceptions

Out_of_ram
Out_of_caps

Return value
Signal_source_capability A cap that acts as reference to the created source

The signal source provides an interface to wait for incoming signals.


Genode Pd_session free_signal_source

Free a signal source pure virtual method

Argument

cap Signal_source_capability
Capability of the signal source to destroy

Genode Pd_session alloc_context

Allocate signal context pure virtual method

Arguments

source Signal_source_capability
Signal source that shall provide the new context
imprint unsigned long
Opaque value that gets delivered with signals originating from the allocated
signal-context capability

Exceptions

Out_of_ram
Out_of_caps
Invalid_signal_source

Return value
Capability<Signal_context> New signal-context capability


Genode Pd_session free_context

Free signal-context pure virtual method

Argument

cap Capability<Signal_context>
Capability of signal-context to release

Genode Pd_session submit

Submit signals to the specified signal context pure virtual method

Arguments

context Capability<Signal_context>
Signal destination
cnt unsigned
Number of signals to submit at once
Default is 1

The context argument does not necessarily belong to this PD session. Normally, it is a
capability obtained from a potentially untrusted component. Because we cannot trust
this capability, signals are not submitted by invoking cap directly but by using it as
argument to our trusted PD-session interface. Otherwise, a potential signal receiver
could supply a capability with a blocking interface to compromise the nonblocking
behaviour of the signal submission.


Genode Pd_session alloc_rpc_cap

Allocate new RPC-object capability pure virtual method

Argument

ep Native_capability
Entry point that will use this capability

Exceptions

Out_of_ram If meta-data backing store is exhausted


Out_of_caps If cap_quota is exceeded

Return value
Native_capability New RPC capability

Genode Pd_session free_rpc_cap

Free RPC-object capability pure virtual method

Argument

cap Native_capability
Capability to free

Genode Pd_session address_space


pure virtual method
Return value
Capability<Region_map> Region map of the PD’s virtual address space

Genode Pd_session stack_area


pure virtual method
Return value
Capability<Region_map> Region map of the PD’s stack area

Genode Pd_session linker_area


pure virtual method
Return value
Capability<Region_map> Region map of the PD’s linker area


Genode Pd_session ref_account

Define reference account for the PD session pure virtual method


Argument

- Capability<Pd_session>

Exception

Invalid_session

Genode Pd_session transfer_quota

Transfer capability quota to another PD session pure virtual method

Arguments

to Capability<Pd_session>
Receiver of quota donation
amount Cap_quota
Amount of quota to donate

Exceptions

Out_of_caps
Invalid_session
Undefined_ref_account

Quota can only be transferred if the specified PD session is either the reference account
for this session or vice versa.
Genode Pd_session cap_quota
pure virtual const method
Return value
Cap_quota Current capability-quota limit

Genode Pd_session used_caps


pure virtual const method
Return value
Cap_quota Number of capabilities allocated from the session

Genode Pd_session avail_caps


const method
Return value
Cap_quota Amount of available capabilities


Genode Pd_session transfer_quota

Transfer RAM quota to another PD session pure virtual method

Arguments

to Capability<Pd_session>
Receiver of quota donation
amount Ram_quota
Amount of quota to donate

Exceptions

Out_of_ram
Invalid_session
Undefined_ref_account

Quota can only be transferred if the specified PD session is either the reference account
for this session or vice versa.
Genode Pd_session ram_quota
pure virtual const method
Return value
Ram_quota Current quota limit

Genode Pd_session used_ram


pure virtual const method
Return value
Ram_quota Used quota

Genode Pd_session avail_ram


const method
Return value
Ram_quota Amount of available quota

Genode Pd_session native_pd


pure virtual method
Return value
Capability<Native_pd> Capability to kernel-specific PD operations


Genode Pd_session system_control_cap


pure virtual method
Argument

- Location const

Return value
Capability<System_control>

Genode Pd_session dma_addr


pure virtual method
Argument

- Ram_dataspace_capability

Return value
addr_t DMA address, or 0 if the dataspace is invalid or the PD lacks the permission to
obtain the information

The intended use of this function is the use of RAM dataspaces as DMA buffers. On
systems without IOMMU, device drivers need to know the physical address of DMA
buffers for issuing DMA transfers.
Genode Pd_session attach_dma

Attach dataspace to I/O page table at the address specified by at pure virtual method

Arguments

- Dataspace_capability

at addr_t

Return value
Attach_dma_result

This operation is reserved for privileged system-management components like the
platform driver to assign DMA buffers to device protection domains. The attach can be
reverted by using address_space().detach().


Genode Pd_connection
Connection to PD service class

Connection <Pd_session> Pd_session_client

Pd_connection

Pd_connection(. . .)
Pd_connection(. . .)

Header
repos/base/include/pd_session/connection.h

Genode Pd_connection Pd_connection


constructor
Arguments

env Env &

label Label const &


Default is Label()
space Virt_space
Default is CONSTRAIN

Genode Pd_connection Pd_connection

Constructor used for creating device protection domains constructor

Arguments

env Env &

- Device_pd


Attached RAM dataspace The instantiation of an Attached_ram_dataspace object
subsumes the tasks of allocating a dataspace from the component’s PD session and
attaching the dataspace to the component’s region map. Furthermore, the reverse op-
erations are performed during the destruction of an Attached_ram_dataspace object.
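
For example, a scratch buffer can be created and accessed as follows. This is a minimal sketch; the buffer size is arbitrary.

   #include <base/attached_ram_dataspace.h>

   static void example(Genode::Env &env)
   {
      /* allocate and locally attach a 4-KiB RAM dataspace in one step */
      Genode::Attached_ram_dataspace buffer { env.ram(), env.rm(), 4096 };

      /* obtain a typed pointer to the locally mapped content */
      char *p = buffer.local_addr<char>();
      p[0] = 'x';

      /* the dataspace is detached and freed when 'buffer' leaves scope */
   }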

Genode Attached_ram_dataspace
Utility to allocate and locally attach a RAM dataspace class

Attached_ram_dataspace

Attached_ram_dataspace(. . .)
~Attached_ram_dataspace()
cap() : Ram_dataspace_capability
local_addr() : T *
size() : size_t
swap(. . .)
realloc(. . .)

Header
repos/base/include/base/attached_ram_dataspace.h

Genode Attached_ram_dataspace Attached_ram_dataspace


constructor
Arguments

ram Ram_allocator &

rm Region_map &

size size_t

cache Cache
Default is CACHED

Exceptions

Out_of_ram
Out_of_caps
Region_map::Region_conflict
Region_map::Invalid_dataspace

Genode Attached_ram_dataspace cap


const method
Return value
Ram_dataspace_capability Capability of the used RAM dataspace


Genode Attached_ram_dataspace local_addr

Request local address const method template

Template argument

T typename

Return value
T *

This is a template to avoid inconvenient casts at the caller. A newly allocated RAM
dataspace is untyped memory anyway.
Genode Attached_ram_dataspace size
const method
Return value
size_t Size

Genode Attached_ram_dataspace swap


method
Argument

other Attached_ram_dataspace &

Genode Attached_ram_dataspace realloc

Re-allocate dataspace with a new size method

Arguments

ram_allocator Ram_allocator *

new_size size_t

The content of the original dataspace is not retained.


8.5.2. ROM session interface

The ROM service (Section 4.5.1) provides access to ROM modules, e. g., binary data
loaded by the boot loader (core’s ROM service described in Section 3.4.3). Each session
refers to one ROM module. The module’s data is provided to the client in the form of a
dataspace (Section 3.4.1).

Genode
namespace
ROM session interface
Types

Rom_dataspace is a subtype of Dataspace


Rom_dataspace_capability is defined as Capability<Rom_dataspace>

Header
repos/base/include/rom_session/rom_session.h

Genode Rom_session
class
Session

Rom_session

~Rom_session()
RPC interface
dataspace() : Rom_dataspace_capability
update() : bool
sigh(. . .)

Header
repos/base/include/rom_session/rom_session.h

Genode Rom_session dataspace

Request dataspace containing the ROM session data pure virtual method

Return value
Rom_dataspace_capability Capability to ROM dataspace

The capability may be invalid.


Consecutive calls of this method are not guaranteed to return the same dataspace as
dynamic ROM sessions may update the ROM data during the lifetime of the session.


When calling the method, the server may destroy the old dataspace and replace it with
a new one containing the updated data. Hence, prior to calling this method, the client
should make sure to detach the previously requested dataspace from its local address
space.
Genode Rom_session update

Update ROM dataspace content virtual method

Return value
bool True if the existing dataspace contains up-to-date content, or false if a new datas-
pace must be requested via the dataspace method

This method is an optimization for use cases where ROM dataspaces are updated at a
high rate. In such cases, requesting a new dataspace for each update induces a large
overhead because memory mappings must be revoked and updated (e. g., handling
the page faults referring to the dataspace). If the updated content fits in the existing
dataspace, those costly operations can be omitted.
When this method is called, the server may replace the dataspace content with new
data.
Genode Rom_session sigh

Register signal handler to be notified of ROM data changes pure virtual method

Argument

sigh Signal_context_capability

The ROM session interface allows for the implementation of ROM services that dynam-
ically update the data exported as ROM dataspace during the lifetime of the session.
This is useful in scenarios where this data is generated rather than originating from a
static file, for example to update a program’s configuration at runtime.
By installing a signal handler using the sigh() method, the client will receive a no-
tification each time the data changes at the server. From the client’s perspective, the
original data contained in the currently used dataspace remains unchanged until the
client calls dataspace() the next time.


Genode Rom_connection
Connection to ROM file service class

Connection <Rom_session> Rom_session_client

Rom_connection

Header
repos/base/include/rom_session/connection.h

Attached ROM dataspace By instantiating an Attached_rom_dataspace object, a
ROM module is requested and made visible within the component’s address space in a
single step.
To accommodate the common use of a ROM session as provider of configuration
data (Section 4.6) or an XML-formatted data model in a publisher-subscriber scenario
(Section 4.7.5), the Attached_rom_dataspace provides a convenient way to retrieve its content
as an XML node via the xml method. The method always returns a valid Xml_node
even in the event where the dataspace is invalid or contains no XML. In such cases, the
returned XML node is <empty/>. This relieves the caller from handling exceptions that
may occur during XML parsing.


Genode Attached_rom_dataspace
Utility to open a ROM session and locally attach its content class

Attached_rom_dataspace

Attached_rom_dataspace(. . .)
cap() : Dataspace_capability
local_addr() : T *
local_addr() : T const *
size() : size_t
sigh(. . .)
update()
valid() : bool
xml() : Xml_node

Accessor
size size_t

Header
repos/base/include/base/attached_rom_dataspace.h

Genode Attached_rom_dataspace Attached_rom_dataspace


constructor
Arguments

env Env &

name char const *

Exceptions

Rom_connection::Rom_connection_failed
Region_map::Region_conflict
Region_map::Invalid_dataspace
Out_of_ram
Out_of_caps

Genode Attached_rom_dataspace cap


const method
Return value
Dataspace_capability Capability of the used dataspace


Genode Attached_rom_dataspace local_addr


method template
Template argument

T typename

Return value
T *

Genode Attached_rom_dataspace local_addr


const method template
Template argument

T typename

Return value
T const *

Genode Attached_rom_dataspace sigh

Register signal handler for ROM module changes method

Argument

sigh Signal_context_capability

Genode Attached_rom_dataspace update

Update ROM module content, re-attach if needed method

Genode Attached_rom_dataspace valid


const method
Return value
bool True if content is present

Genode Attached_rom_dataspace xml


const method
Return value
Xml_node Dataspace content as XML node

This method always returns a valid XML node. It never throws an exception. If the
dataspace is invalid or does not contain properly formatted XML, the returned XML
node has the form "<empty/>".
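
The following sketch shows the typical pattern of responding to ROM-module changes, here using the component's config ROM. The surrounding Main structure and the handler name are illustrative assumptions, not part of the API.

   #include <base/component.h>
   #include <base/signal.h>
   #include <base/attached_rom_dataspace.h>

   struct Main
   {
      Genode::Env &_env;

      Genode::Attached_rom_dataspace _config { _env, "config" };

      Genode::Signal_handler<Main> _config_handler {
         _env.ep(), *this, &Main::_handle_config };

      void _handle_config()
      {
         _config.update();               /* re-attach dataspace if replaced */
         Genode::Xml_node const config = _config.xml();
         /* ... evaluate 'config' ... */
      }

      Main(Genode::Env &env) : _env(env)
      {
         _config.sigh(_config_handler);  /* watch for ROM updates */
         _handle_config();               /* process the initial version */
      }
   };

   void Component::construct(Genode::Env &env) { static Main inst(env); }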


8.5.3. RM session interface

The region-map (RM) service (Section 3.4.5) allows for the manual creation of region
maps that represent (portions of) virtual address spaces. Components can use this ser-
vice to manage the virtual memory layout of so-called managed dataspaces manually.
For example, it allows a dataspace provider to populate the content of dataspaces on
demand, depending on the access patterns produced by the dataspace user.
Note that the RM service is not available on all base platforms. In particular, it is not
supported on Linux.
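
A minimal sketch of creating a managed dataspace follows: a 16-MiB region map is created via the RM service, used as a dataspace, and attached to the component's own address space. Dataspaces can subsequently be attached to region_map to populate the range on demand.

   #include <rm_session/connection.h>
   #include <region_map/client.h>

   static void example(Genode::Env &env)
   {
      Genode::Rm_connection rm { env };

      /* create a region map of 16 MiB and a client stub to operate on it */
      Genode::Region_map_client region_map { rm.create(16*1024*1024) };

      /* attach the managed dataspace to the component's address space */
      void *base = env.rm().attach(region_map.dataspace());

      /* accesses within the attached range are now backed by 'region_map' */
      (void)base;
   }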

Genode Rm_session
Region-map session interface class

Session

Rm_session

RPC interface
create(. . .) : Capability<Region_map>
destroy(. . .)

Header
repos/base/include/rm_session/rm_session.h


Genode Rm_session create

Create region map pure virtual method

Argument

size size_t
Upper bound of region map

Exceptions

Out_of_ram
Out_of_caps

Return value
Capability<Region_map> Region-map capability

Genode Rm_session destroy

Destroy region map pure virtual method

Argument

- Capability<Region_map>

Genode Rm_connection
Connection to RM service class

Connection <Rm_session> Rm_session_client

Rm_connection

Rm_connection(. . .)
create(. . .) : Capability<Region_map>

Header
repos/base/include/rm_session/connection.h


Genode Rm_connection Rm_connection


constructor
Argument

env Env &

Genode Rm_connection create

Wrapper over create that handles resource requests from the server method
Argument

size size_t

Return value
Capability<Region_map>


8.5.4. CPU session interface

The CPU service (Section 3.4.6) provides a facility for creating and managing threads.
A CPU session corresponds to a CPU-time allocator, from which multiple threads can
be allocated.
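
Application code does not normally invoke create_thread directly but uses the Genode::Thread API of base/thread.h, which allocates the thread from a CPU session under the hood. A minimal sketch:

   #include <base/component.h>
   #include <base/thread.h>
   #include <base/log.h>

   struct Worker : Genode::Thread
   {
      Worker(Genode::Env &env)
      : Genode::Thread(env, "worker", 16*1024 /* stack size */) { }

      void entry() override { Genode::log("worker thread running"); }
   };

   void Component::construct(Genode::Env &env)
   {
      static Worker worker { env };
      worker.start();   /* ultimately triggers Cpu_session::create_thread */
   }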
Genode Cpu_session
class
Session

Cpu_session

~Cpu_session()
create_thread(. . .) : Thread_capability
kill_thread(. . .)
exception_sigh(. . .)
affinity_space() : Affinity::Space
scale_priority(. . .) : unsigned RPC interface
trace_control() : Dataspace_capability
ref_account(. . .) : int
transfer_quota(. . .) : int
quota() : Quota
quota_lim_upscale(. . .) : size_t
quota_lim_downscale(. . .) : size_t
native_cpu() : Capability<Native_cpu>

Header
repos/base/include/cpu_session/cpu_session.h


Genode Cpu_session create_thread

Create a new thread pure virtual method


Arguments

pd Capability<Pd_session>
Protection domain where the thread will be executed
name Name const &
Name for the thread
affinity Location
CPU affinity, referring to the session-local affinity space
weight Weight
CPU quota that shall be granted to the thread
utcb addr_t
Base of the UTCB that will be used by the thread
Default is 0

Exceptions

Thread_creation_failed
Out_of_ram
Out_of_caps

Return value
Thread_capability Capability representing the new thread

Genode Cpu_session kill_thread

Kill an existing thread pure virtual method

Argument

thread Thread_capability
Capability of the thread to kill

Genode Cpu_session exception_sigh

Register default signal handler for exceptions pure virtual method

Argument

- Signal_context_capability

This handler is used for all threads that have no explicitly installed exception handler.
On Linux, this exception is delivered when the process triggers a SIGCHLD. On other
platforms, this exception is delivered on the occurrence of CPU exceptions such as
division by zero.


Genode Cpu_session affinity_space


pure virtual const method
Return value
Affinity::Space Affinity space of CPU nodes available to the CPU session

The dimensions of the affinity space as returned by this method represent the physical
CPUs that are available.
Genode Cpu_session scale_priority

Translate generic priority value to kernel-specific priority levels class function

Arguments

pf_prio_limit unsigned
Maximum priority used for the kernel, must be power of 2
prio unsigned
Generic priority value as used by the CPU session interface
inverse bool
Order of platform priorities, if true pf_prio_limit corresponds to the
highest priority, otherwise it refers to the lowest priority.
Default is true

Return value
unsigned Platform-specific priority value

Genode Cpu_session trace_control

Request trace control dataspace pure virtual method

Return value
Dataspace_capability

The trace-control dataspace is used to propagate tracing control information from core
to the threads of a CPU session.
The trace-control dataspace is accounted to the CPU session.


Genode Cpu_session ref_account

Define reference account for the CPU session pure virtual method
Argument

cpu_session Cpu_session_capability
Reference account

Return value
int 0 on success

Each CPU session requires another CPU session as reference account to transfer quota
to and from. The reference account can be defined only once.
Genode Cpu_session transfer_quota

Transfer quota to another CPU session pure virtual method

Arguments

cpu_session Cpu_session_capability
Receiver of quota donation
amount size_t
Percentage of the session quota scaled up to the QUOTA_LIMIT space

Return value
int 0 on success

Quota can only be transferred if the specified CPU session is either the reference account
for this session or vice versa.


Genode Cpu_session quota


pure virtual method
Return value
Quota Quota configuration of the session

Genode Cpu_session quota_lim_upscale

Scale up value from its space with limit to the QUOTA_LIMIT space class function template
Template argument

T typename
Default is size_t

Arguments

value size_t const

limit size_t const

Return value
size_t

Genode Cpu_session quota_lim_downscale

Scale down value from the QUOTA_LIMIT space to a space with limit class function template
Template argument

T typename
Default is size_t

Arguments

value size_t const

limit size_t const

Return value
size_t

Genode Cpu_session native_cpu


pure virtual method
Return value
Capability<Native_cpu> Capability to kernel-specific CPU operations


Genode Cpu_connection
Connection to CPU service class

Connection <Cpu_session> Cpu_session_client

Cpu_connection

Cpu_connection(. . .)
create_thread(. . .) : Thread_capability

Header
repos/base/include/cpu_session/connection.h

Genode Cpu_connection Cpu_connection


constructor
Arguments

env Env &

label Label const &


Default is Label()
priority long
Designated priority of all threads created with this CPU session
Default is DEFAULT_PRIORITY
affinity Affinity const &
Default is Affinity()


Genode Affinity
Affinity to CPU nodes class

Affinity

Affinity(. . .)
Affinity()
space() : Space
location() : Location
valid() : bool
from_xml(. . .) : Affinity
unrestricted() : Affinity
scale_to(. . .) : Location

Accessors
space Space
location Location
valid bool

Header
repos/base/include/base/affinity.h

The set of CPU nodes is expected to form a grid where the Euclidean distance
between nodes roughly correlates to the locality of their respective resources. Closely
interacting processes are supposed to perform best when using nodes close to each
other. To allow a relatively simple specification of such constraints, the affinity of a
subsystem (e. g., a process) to CPU nodes is expressed as a rectangle within the grid of
available CPU nodes. The dimensions of the grid are represented by Affinity::Space.
The rectangle within the grid is represented by Affinity::Location.
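
For illustration, the following sketch constrains a CPU connection to the two leftmost nodes of a 4x1 affinity space. The grid dimensions and the session label are assumptions made for the sake of the example.

   #include <cpu_session/connection.h>

   static void example(Genode::Env &env)
   {
      using namespace Genode;

      Affinity::Space    space    { 4, 1 };        /* e.g., a quad-core CPU */
      Affinity::Location location { 0, 0, 2, 1 };  /* two leftmost nodes */

      Cpu_connection cpu { env, "constrained",
                           Cpu_session::DEFAULT_PRIORITY,
                           Affinity { space, location } };
      (void)cpu;
   }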


Genode Affinity Affinity


constructor
Arguments

space Space const &

location Location const &

Genode Affinity from_xml


class function
Argument

node Xml_node const &

Return value
Affinity

Genode Affinity unrestricted


class function
Return value
Affinity

Genode Affinity scale_to


const method
Argument

space Space const &

Return value
Location Location scaled to specified affinity space


Once created, a thread is referred to via a thread capability. This capability allows for
the destruction of the thread via the CPU session, and provides the Cpu_thread RPC
interface to operate on the thread.

Genode Cpu_thread
CPU thread interface class

Interface

Cpu_thread

utcb() : Dataspace_capability
start(. . .)
pause()
resume()
state() : Thread_state
RPC interface
state(. . .)
exception_sigh(. . .)
single_step(. . .)
affinity(. . .)
trace_control_index() : unsigned
trace_buffer() : Dataspace_capability
trace_policy() : Dataspace_capability

Header
repos/base/include/cpu_thread/cpu_thread.h


Genode Cpu_thread utcb

Get dataspace of the thread’s user-level thread-control block (UTCB) pure virtual method
Return value
Dataspace_capability

Genode Cpu_thread start

Modify instruction and stack pointer of thread - start the thread pure virtual method

Arguments

ip addr_t
Initial instruction pointer
sp addr_t
Initial stack pointer

Genode Cpu_thread pause

Pause the thread pure virtual method

After calling this method, the execution of the thread can be continued by calling
resume.
Genode Cpu_thread resume

Resume the thread pure virtual method

Genode Cpu_thread state

Get the current thread state pure virtual method


Exception

State_access_failed

Return value
Thread_state State of the targeted thread


Genode Cpu_thread state

Override the current thread state pure virtual method


Argument

state Thread_state const &


State that shall be applied

Exception

State_access_failed

Genode Cpu_thread exception_sigh

Register signal handler for exceptions of the thread pure virtual method

Argument

handler Signal_context_capability

On Linux, this exception is delivered when the process triggers a SIGCHLD. On other
platforms, this exception is delivered on the occurrence of CPU exceptions such as
division by zero.
Genode Cpu_thread single_step

Enable/disable single stepping pure virtual method

Argument

enabled bool
True = enable single-step mode; false = disable

Since this method is currently supported by a small number of platforms, we provide
a default implementation.
Genode Cpu_thread affinity

Define affinity of thread to one or multiple CPU nodes pure virtual method

Argument

location Location
Location within the affinity space of the thread’s CPU session

In the normal case, a thread is assigned to a single CPU. Specifying more than one CPU
node is supposed to principally allow a CPU service to balance the load of threads
among multiple CPUs.


Genode Cpu_thread trace_control_index

Request index within the trace control block of the thread’s CPU session pure virtual method
Return value
unsigned

The trace control dataspace contains the control blocks for all threads of the CPU ses-
sion. Each thread gets assigned a different index by the CPU service.
Genode Cpu_thread trace_buffer

Request trace buffer for the thread pure virtual method

Return value
Dataspace_capability

The trace buffer is not accounted to the CPU session. It is owned by a TRACE session.
Genode Cpu_thread trace_policy

Request trace policy pure virtual method

Return value
Dataspace_capability

The trace policy buffer is not accounted to the CPU session. It is owned by a TRACE
session.


8.5.5. IO_MEM session interface

The IO_MEM service (Section 3.4.7) enables user-level device drivers to obtain memory-
mapped device resources in the form of dataspaces. Each IO_MEM session corresponds
to the reservation of a physical address range, for which a dataspace is provided to the
client. The user-level device driver can make the device resource visible in its address
space by attaching the dataspace to the region map of its own PD session.

Genode
namespace
Memory-mapped I/O session interface
Types

Io_mem_dataspace is a subtype of Dataspace


Io_mem_dataspace_capability is defined as Capability<Io_mem_dataspace>

Header
repos/base/include/io_mem_session/io_mem_session.h

Genode Io_mem_session
class
Session

Io_mem_session

RPC interface
~Io_mem_session()
dataspace() : Io_mem_dataspace_capability

Header
repos/base/include/io_mem_session/io_mem_session.h

Genode Io_mem_session dataspace

Request dataspace containing the IO_MEM session data pure virtual method

Return value
Io_mem_dataspace_capability Capability to IO_MEM dataspace (may be invalid)


Genode Io_mem_connection
Connection to I/O-memory service class

Connection <Io_mem_session> Io_mem_session_client

Io_mem_connection

Io_mem_connection(. . .)

Header
repos/base/include/io_mem_session/connection.h

Genode Io_mem_connection Io_mem_connection


constructor
Arguments

env Env &

base addr_t
Physical base address of memory-mapped I/O resource
size size_t
Size of memory-mapped I/O resource
write_combined bool
Enable write-combined access to I/O memory
Default is false

Attached IO_MEM dataspace An instance of an Attached_io_mem_dataspace
represents a locally mapped memory-mapped I/O range.


Genode Attached_io_mem_dataspace
Request and locally attach a memory-mapped I/O resource class

Attached_io_mem_dataspace

Attached_io_mem_dataspace(. . .)
~Attached_io_mem_dataspace()
cap() : Io_mem_dataspace_capability
local_addr() : T *

Header
repos/base/include/base/attached_io_mem_dataspace.h

This class is a wrapper for a typical sequence of operations performed by device
drivers to access memory-mapped device resources. Its sole purpose is to avoid
duplicated code.


Genode Attached_io_mem_dataspace Attached_io_mem_dataspace


constructor
Arguments

env Env &

base addr_t
Base address of memory-mapped I/O resource
size size_t
Size of resource
write_combined bool
Enable write combining for the resource
Default is false

Exceptions

Service_denied
Insufficient_ram_quota
Insufficient_cap_quota
Out_of_ram
Out_of_caps
Region_map::Region_conflict
Region_map::Invalid_dataspace

Genode Attached_io_mem_dataspace cap


method
Return value
Io_mem_dataspace_capability Capability of the used RAM dataspace

Genode Attached_io_mem_dataspace local_addr

Request local address method template

Template argument

T typename

Return value
T *

This is a template to avoid inconvenient casts at the caller. A newly allocated I/O MEM
dataspace is untyped memory anyway.
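
Taken together, a driver might access a memory-mapped device register as follows. The physical address and the register layout are placeholders, not a real device.

   #include <base/attached_io_mem_dataspace.h>

   static void example(Genode::Env &env)
   {
      /* map one page of device registers at a placeholder physical address */
      Genode::Attached_io_mem_dataspace mmio { env, 0xf0000000, 0x1000 };

      /* read a 32-bit device register at offset 0 */
      volatile Genode::uint32_t *reg = mmio.local_addr<Genode::uint32_t>();
      Genode::uint32_t const status = *reg;
      (void)status;
   }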


8.5.6. IO_PORT session interface

On the x86 architecture, the IO_PORT service (Section 3.4.7) provides access to device
I/O ports via an RPC interface. Each IO_PORT session corresponds to the access right
to a port range.
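
For example, a driver for a legacy PC UART at the conventional port base 0x3f8 might access its registers as sketched below. No UART protocol handling is implied.

   #include <io_port_session/connection.h>

   static void example(Genode::Env &env)
   {
      /* request access to the 8 ports of COM1 */
      Genode::Io_port_connection ports { env, 0x3f8, 8 };

      /* write one character to the data register */
      ports.outb(0x3f8, 'x');

      /* read the line-status register at offset 5 */
      unsigned char const status = ports.inb(0x3f8 + 5);
      (void)status;
   }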

Genode Io_port_session
I/O-port session interface class

Session

Io_port_session

~Io_port_session()
inb(. . .) : unsigned char
inw(. . .) : unsigned short RPC interface
inl(. . .) : unsigned
outb(. . .)
outw(. . .)
outl(. . .)

Header
repos/base/include/io_port_session/io_port_session.h


Genode Io_port_session inb

Read byte (8 bit) pure virtual method

Argument

address unsigned short


Physical I/O port address

Return value
unsigned char Value read from port

Genode Io_port_session inw

Read word (16 bit) pure virtual method

Argument

address unsigned short


Physical I/O port address

Return value
unsigned short Value read from port

Genode Io_port_session inl

Read double word (32 bit) pure virtual method

Argument

address unsigned short


Physical I/O port address

Return value
unsigned Value read from port

Genode Io_port_session outb

Write byte (8 bit) pure virtual method

Arguments

address unsigned short


Physical I/O port address
value unsigned char
Value to write to port


Genode Io_port_session outw

Write word (16 bit) pure virtual method

Arguments

address unsigned short


Physical I/O port address
value unsigned short
Value to write to port

Genode Io_port_session outl

Write double word (32 bit) pure virtual method

Arguments

address unsigned short


Physical I/O port address
value unsigned
Value to write to port

Genode Io_port_connection
Connection to I/O-port service class

Connection <Io_port_session> Io_port_session_client

Io_port_connection

Io_port_connection(. . .)

Header
repos/base/include/io_port_session/connection.h


Genode Io_port_connection Io_port_connection


constructor
Arguments

env Env &

base unsigned
Base address of port range
size unsigned
Size of port range


8.5.7. IRQ session interface

The IRQ service (Section 3.4.7) enables user-level device drivers to serve device inter-
rupts. Each IRQ session corresponds to an associated interrupt line.

Genode Irq_connection
Connection to IRQ service class

Connection <Irq_session> Irq_session_client

Irq_connection

Irq_connection(. . .)
Irq_connection(. . .)

Header
repos/base/include/irq_session/connection.h

Genode Irq_connection Irq_connection


constructor
Arguments

env Env &

label Label const &


Physical interrupt number
- TRIGGER_UNCHANGED

- POLARITY_UNCHANGED

Genode Irq_connection Irq_connection


constructor
Arguments

env Env &

label Label const &


(virtual) interrupt number
device_config_phys addr_t
Config-space physical address
- TYPE_MSI


8.5.8. LOG session interface

For low-level debugging, core provides a simple LOG service (Section 3.4.8), which
enables clients to print textual messages. In the LOG output, each message is tagged
with the label of the corresponding client.
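
Components rarely use this interface directly because the Genode::log, Genode::warning, and Genode::error functions of base/log.h are the customary front end. For completeness, a direct use of a LOG connection may look as follows; the session label is arbitrary.

   #include <log_session/connection.h>

   static void example(Genode::Env &env)
   {
      Genode::Log_connection log { env, "example" };
      log.write("hello from a LOG session client");
   }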

Genode Log_session
Log text output session interface class

Session

Log_session

RPC interface
~Log_session()
write(. . .)

Header
repos/base/include/log_session/log_session.h

Genode Log_session write

Output null-terminated string pure virtual method

Argument

string String const &

Genode Log_connection
Connection to LOG service class

Connection <Log_session> Log_session_client

Log_connection

Log_connection(. . .)

Header
repos/base/include/log_session/connection.h


Genode Log_connection Log_connection


constructor
Arguments

env Env &

label Session_label const &


Default is Session_label()


8.6. OS-level session interfaces

8.6.1. Report session interface

Report sessions (Section 4.5.2) allow a component to propagate internal state to the
outside. The most prominent use case is the realization of publisher-subscriber com-
munication schemes as discussed in Section 4.7.5.
Report Session
Report session interface class

Session

Report::Session

dataspace() : Dataspace_capability
RPC interface
submit(. . .)
response_sigh(. . .)
obtain_response() : size_t

Header
repos/os/include/report_session/report_session.h

Report Session dataspace

Request the dataspace used to carry reports and responses pure virtual method

Return value
Dataspace_capability

Report Session submit

Submit data that is currently contained in the dataspace as report pure virtual method

Argument

length size_t
Length of report in bytes

While this method is called, the information in the dataspace must not be modified by
the client.


Report Session response_sigh

Install signal handler for response notifications pure virtual method

Argument

- Signal_context_capability

Report Session obtain_response

Request a response from the recipient of reports pure virtual method

Return value
size_t Length of response in bytes

By calling this method, the client expects that the server will replace the content of the
dataspace with new information.

Report Connection
Connection to Report service class

Connection <Session> Session_client

Report::Connection

Connection(. . .)

Header
repos/os/include/report_session/connection.h

Report Connection Connection


constructor
Arguments

env Env &

label Label const &

buffer_size size_t
Default is 4096


The client-side Reporter is a convenient front end for the use of a report connection
to propagate XML-formatted data.

Genode Reporter
class
Reporter

Reporter(. . .)
enabled(. . .)
enabled() : bool
name() : Name
clear()
report(. . .)

Accessor
name Name

Header
repos/os/include/os/reporter.h

Genode Reporter Reporter


constructor
Arguments

env Env &

xml_name char const *

label char const *


Default is nullptr
buffer_size size_t
Default is 4096

Genode Reporter enabled

Enable or disable reporting method

Argument

enabled bool

Genode Reporter enabled


const method
Return value
bool True if reporter is enabled


Genode Reporter clear

Clear report buffer method

Genode Reporter report

Report data buffer method

Arguments

data void const *


Data buffer
length size_t
Number of bytes to report

The Expanding_reporter further streamlines the generation of reports by eliminating
the burden of handling Buffer_exceeded exceptions as thrown by the Xml_generator
from components that generate reports. Such exceptions are easy to miss because
reports are often small at testing time but become larger in complex scenarios. Whenever
the report exceeds the current buffer size, the expanding reporter automatically
upgrades the report session as needed. Note that such an upgrade consumes RAM
quota. For components that strictly account RAM consumption to clients, the regular
Reporter is preferable. However, in most cases - where reports are not accounted per
client but to the component itself - the Expanding_reporter is the better choice.
Besides the builtin support for growing the report buffer, the expanding reporter alle-
viates the need to explicitly enable reports. In contrast to the Reporter, it is implicitly
enabled at construction time.
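
A sketch of generating a report with an expanding reporter follows. The node type and label "state" as well as the attribute names are arbitrary choices for the example.

   #include <os/reporter.h>

   static void example(Genode::Env &env)
   {
      Genode::Expanding_reporter reporter { env, "state", "state" };

      reporter.generate([&] (Genode::Xml_generator &xml) {
         xml.attribute("version", "1");
         xml.node("entry", [&] {
            xml.attribute("name", "example"); });
      });
   }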
Genode Expanding_reporter
Reporter that increases the report buffer capacity on demand class

Expanding_reporter

Expanding_reporter(. . .)
generate(. . .)
generate(. . .)

Header
repos/os/include/os/reporter.h

This convenience wrapper of the Reporter alleviates the need to handle
Xml_generator::Buffer_exceeded exceptions manually. In most cases, the only
reasonable way to handle such an exception is upgrading the report buffer as done by
this class. Furthermore, in contrast to the
regular Reporter, which needs to be enabled, the Expanding_reporter is implicitly
enabled at construction time.
Genode Expanding_reporter Expanding_reporter
constructor
Arguments

env Env &

type Node_type const &

label Label const &

Genode Expanding_reporter generate


method
Argument

fn auto const &

Genode Expanding_reporter generate


method
Argument

node Xml_node


8.6.2. Terminal and UART session interfaces

A terminal session (Section 4.5.3) is a bi-directional communication channel. The
UART session interface supplements the terminal session interface with a facility to
parametrize UART configuration parameters.

Terminal Session
Terminal session interface class

Session

Terminal::Session

size() : Size
avail() : bool
read(. . .) : Genode::size_t RPC interface
write(. . .) : Genode::size_t
connected_sigh(. . .)
read_avail_sigh(. . .)
size_changed_sigh(. . .)

Header
repos/os/include/terminal_session/terminal_session.h

Terminal Session size


pure virtual method
Return value
Size Terminal size

Terminal Session avail


pure virtual method
Return value
bool True if one or more characters are available for reading


Terminal Session read

Read characters from terminal pure virtual method


Arguments

buf void *

buf_size size_t

Return value
Genode::size_t

Terminal Session write

Write characters to terminal pure virtual method


Arguments

buf void const *

num_bytes size_t

Return value
Genode::size_t

Terminal Session connected_sigh

Register signal handler to be informed about the established connection pure virtual method
Argument

cap Signal_context_capability

This method is used for a simple startup protocol of terminal sessions. At session-
creation time, the terminal session may not be ready to use. For example, a TCP
terminal session needs an established TCP connection first. However, we do not want
to let the session creation block on the server side because this would render the server's
entrypoint unavailable for all other clients until the TCP connection is ready. Instead,
a connected signal is delivered to the client when the session becomes ready to
use. The Terminal::Connection waits for this signal at construction time.
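
Once constructed, a terminal connection can be used for plain read and write operations. The following sketch echoes available input back to the terminal once; in a real component, the read would be driven by the read-avail signal handler.

   #include <terminal_session/connection.h>

   static void echo_once(Genode::Env &env)
   {
      /* blocks until the connected signal arrives */
      Terminal::Connection terminal { env };

      char buf[64];
      if (terminal.avail()) {
         Genode::size_t const n = terminal.read(buf, sizeof(buf));
         terminal.write(buf, n);
      }
   }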


Terminal Session read_avail_sigh

Register signal handler to be informed about ready-to-read characters pure virtual method
Argument

cap Signal_context_capability

Terminal Session size_changed_sigh

Register signal handler to be notified on terminal-size changes pure virtual method

Argument

cap Signal_context_capability

Terminal Connection
Connection to Terminal service class

Connection <Session> Session_client

Terminal::Connection

Connection(. . .)

Header
repos/os/include/terminal_session/connection.h

Terminal Connection Connection


constructor
Arguments

env Env &

label Label const &


Default is Label()


Uart Session
UART session interface class

Terminal::Session

Uart::Session
RPC interface
baud_rate(. . .)

Header
repos/os/include/uart_session/uart_session.h

Uart Session baud_rate

Set baud rate pure virtual method


Argument

bits_per_second size_t

Uart Connection
Connection to UART service class

Connection <Session> Session_client

Uart::Connection

Connection(. . .)

Header
repos/os/include/uart_session/connection.h

Uart Connection Connection


constructor
Argument

env Env &


8.6.3. Event session interface

An event session (Section 4.5.4) represents a stream of user-input events from client to
server.
Event Session
Event session interface class

Session

Event::Session RPC interface

Header
repos/os/include/event_session/event_session.h

Event Connection
Connection to event service class

Connection <Session> Session_client

Event::Connection

Connection(. . .)

Header
repos/os/include/event_session/connection.h

Event Connection Connection


constructor
Arguments

env Env &

label Label const &


Default is Label()


8.6.4. Capture session interface

Framebuffer Mode
Framebuffer mode info as returned by Framebuffer::Session::mode() class

Framebuffer::Mode

bytes_per_pixel() : Genode::size_t
print(. . .)

Accessor
bytes_per_pixel Genode::size_t

Header
repos/os/include/framebuffer_session/framebuffer_session.h

Framebuffer Mode print


const method
Argument

out Output &

Capture Session
Capture session interface class

Session

Capture::Session

buffer_bytes(. . .) : size_t
screen_size() : Area
RPC interface
screen_size_sigh(. . .)
buffer(. . .)
dataspace() : Dataspace_capability
capture_at(. . .) : Affected_rects

Header
repos/os/include/capture_session/capture_session.h


Capture Session buffer_bytes


class function
Argument

size Area

Return value
size_t Number of bytes needed for pixel buffer of specified size

Capture Session screen_size

Request current screen size pure virtual const method

Return value
Area

Capture Session screen_size_sigh

Register signal handler to be notified whenever the screen size changes pure virtual method
Argument

- Signal_context_capability

Capture Session buffer

Define dimensions of the shared pixel buffer pure virtual method

Argument

size Area

Exceptions

Out_of_ram Session quota does not suffice for specified buffer dimensions
Out_of_caps

The size controls the server-side allocation of the shared pixel buffer and may affect
the screen size of the GUI server. The screen size is the bounding box of the pixel buffers
of all capture clients.


Capture Session dataspace

Request dataspace of the shared pixel buffer defined via buffer pure virtual method

Return value
Dataspace_capability

Capture Session capture_at

Update the pixel-buffer with content at the specified screen position pure virtual method
Argument

- Point

Return value
Affected_rects Geometry information about the content that changed since the pre-
vious call of capture_at

Capture Connection
class
Connection <Session> Session_client

Capture::Connection

Connection(. . .)
buffer(. . .)

Header
repos/os/include/capture_session/connection.h

Capture Connection Connection


constructor
Arguments

env Env &

label Label const &


Default is Label()


8.6.5. GUI session interface

The GUI session described in Section 4.5.6 comprises a virtual framebuffer, a stream of
user-input events, and a so-called view stack, which defines how portions of the virtual
framebuffer are composed on screen.
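
The following sketch obtains the framebuffer sub-session of a GUI session, defines a virtual framebuffer, and maps its pixel buffer. It uses only the RPC functions documented below together with the generic Framebuffer::Session_client stub; the mode initialization assumes the 24.05 definition of Framebuffer::Mode as a wrapper of an Area. View handling and error handling are omitted.

   #include <gui_session/connection.h>
   #include <framebuffer_session/client.h>

   static void example(Genode::Env &env)
   {
      Gui::Connection gui { env };

      /* access the framebuffer sub-session via its capability */
      Framebuffer::Session_client framebuffer { gui.framebuffer_session() };

      /* define a 640x480 virtual framebuffer without alpha channel */
      gui.buffer(Framebuffer::Mode { .area = { 640, 480 } }, false);

      /* make the pixel buffer locally accessible */
      void *pixels = env.rm().attach(framebuffer.dataspace());
      (void)pixels;
   }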

Gui Session
GUI session interface class

Session

Gui::Session

~Session()
framebuffer_session() : Framebuffer::Session_capability
input_session() : Input::Session_capability
create_view(. . .) : View_handle
destroy_view(. . .)
view_handle(. . .) : View_handle
view_capability(. . .) : View_capability
RPC interface
release_view_handle(. . .)
command_dataspace() : Genode::Dataspace_capability
execute()
mode() : Framebuffer::Mode
mode_sigh(. . .)
buffer(. . .)
focus(. . .)
session_control(. . .)
ram_quota(. . .) : size_t

Header
repos/os/include/gui_session/gui_session.h


Gui Session framebuffer_session

Request framebuffer sub-session pure virtual method

Return value
Framebuffer::Session_capability

Gui Session input_session

Request input sub-session pure virtual method

Return value
Input::Session_capability

Gui Session create_view

Create a new view at the buffer pure virtual method


Argument

parent View_handle
Parent view
Default is View_handle()

Exception

Invalid_handle

Return value
View_handle Handle for new view

The parent argument allows the client to use the location of an existing view as the
coordinate origin for the to-be-created view. If an invalid handle is specified (default),
the view will be a top-level view.


Gui Session destroy_view

Destroy view pure virtual method

Argument

- View_handle

Gui Session view_handle


pure virtual method
Arguments

- View_capability

handle View_handle
Designated view handle to be assigned to the imported view. By default, a new
handle will be allocated.
Default is View_handle()

Exceptions

Out_of_ram
Out_of_caps

Return value
View_handle Session-local handle for the specified view

The handle returned by this method can be used to issue commands via the execute
method.
Gui Session view_capability

Request view capability for a given handle pure virtual method

Argument

- View_handle

Return value
View_capability

The returned view capability can be used to share the view with another session.


Gui Session release_view_handle

Release session-local view handle pure virtual method


Argument

- View_handle

Gui Session command_dataspace

Request dataspace used for issuing view commands to nitpicker pure virtual method

Return value
Genode::Dataspace_capability

Gui Session execute

Execute batch of commands contained in the command dataspace pure virtual method

Gui Session mode


pure virtual method
Return value
Framebuffer::Mode Physical screen mode

Gui Session mode_sigh

Register signal handler to be notified about mode changes pure virtual method

Argument

- Signal_context_capability

Gui Session buffer

Define dimensions of virtual framebuffer pure virtual method


Arguments

mode Mode

use_alpha bool

Exceptions

Out_of_ram Session quota does not suffice for specified buffer dimensions
Out_of_caps


Gui Session focus

Set focused session pure virtual method


Argument

focused Capability<Session>

Normally, the focused session is defined by the focus ROM, which is driven by an ex-
ternal policy component. However, in cases where one application consists of multiple
nitpicker sessions, it is desirable to let the application manage the focus among its ses-
sions by applying an application-local policy. The focus RPC function allows a client
to specify another client that should receive the focus whenever the session becomes
focused. As the designated receiver of the focus is referred to by its session capability,
a common parent can manage the focus among its children. But unrelated sessions
cannot interfere.
Gui Session session_control

Perform control operation on one or multiple sessions virtual method

Arguments

- Label

- Session_control

The label is used to select the sessions, on which the operation is performed. The
GUI server creates a selector string by concatenating the caller’s session label with the
supplied label argument. A session is selected if its label starts with the selector string.
Thereby, the operation is limited to the caller session or any child session of the caller.
Gui Session ram_quota
class function
Arguments

mode Mode

use_alpha bool

Return value
size_t Number of bytes needed for virtual framebuffer of specified size


Gui Connection
Connection to GUI service class

Connection <Session> Session_client

Gui::Connection

Header
repos/os/include/gui_session/connection.h

Framebuffer Session
class
Session

Framebuffer::Session

~Session()
dataspace() : Genode::Dataspace_capability
RPC interface
mode() : Mode
mode_sigh(. . .)
refresh(. . .)
sync_sigh(. . .)

Header
repos/os/include/framebuffer_session/framebuffer_session.h

Framebuffer Session dataspace

Request dataspace representing the logical frame buffer pure virtual method

Return value
Genode::Dataspace_capability

By calling this method, the framebuffer client enables the server to reallocate the frame-
buffer dataspace (e. g., on mode changes). Hence, prior to calling this method, the client
should make sure to have detached the previously requested dataspace from its local
address space.


Framebuffer Session mode


Request display-mode properties of the framebuffer ready to be obtained via the dataspace() method pure virtual const method
Return value
Mode

Framebuffer Session mode_sigh

Register signal handler to be notified on mode changes pure virtual method

Argument

sigh Signal_context_capability

The framebuffer server may support changing the display mode on the fly. For exam-
ple, a virtual framebuffer presented in a window may get resized according to the
window dimensions. By installing a signal handler for mode changes, the framebuffer
client can respond to such changes. The new mode can be obtained using the mode()
method. However, from the client’s perspective, the original mode stays in effect until
it calls dataspace() again.
Framebuffer Session refresh

Flush specified pixel region pure virtual method

Arguments

x int

y int

w int

h int

Framebuffer Session sync_sigh

Register signal handler for refresh synchronization pure virtual method

Argument

- Signal_context_capability


Input Session
Input session interface class

Session

Input::Session

~Session()
dataspace() : Genode::Dataspace_capability RPC interface
pending() : bool
flush() : int
sigh(. . .)

Header
repos/os/include/input_session/input_session.h

Input Session dataspace


pure virtual method
Return value
Genode::Dataspace_capability Capability to event buffer dataspace

Input Session pending

Request input state pure virtual const method

Return value
bool True if new events are available

Input Session flush

Flush pending events to event buffer pure virtual method

Return value
int Number of flushed events

Input Session sigh

Register signal handler to be notified on arrival of new input pure virtual method

Argument

- Signal_context_capability


8.6.6. Platform session interface

Platform Device
class
Platform::Device

Device(. . .)
Device(. . .)
Device(. . .)
~Device()

Header
repos/os/include/platform_session/device.h

Platform Device Device


constructor
Argument

platform Connection &

Platform Device Device


constructor
Arguments

platform Connection &

type Type

Platform Device Device


constructor
Arguments

platform Connection &

name Name


Platform Session
class
Session

Platform::Session

~Session()
devices_rom() : Rom_session_capability
acquire_device(. . .) : Capability<Device_interface>
RPC interface
acquire_single_device() : Capability<Device_interface>
release_device(. . .)
alloc_dma_buffer(. . .) : Ram_dataspace_capability
free_dma_buffer(. . .)
dma_addr(. . .) : addr_t

Header
repos/os/include/platform_session/platform_session.h

Platform Session devices_rom

Request ROM session containing the information about available devices pure virtual method
Return value
Rom_session_capability Capability to ROM dataspace

Platform Session acquire_device

Acquire device known by unique name pure virtual method

Argument

name Device_name const &

Return value
Capability<Device_interface>

Platform Session acquire_single_device

Acquire the first or single device of this session pure virtual method

Return value
Capability<Device_interface>


Platform Session release_device

Free server-internal data structures representing the device pure virtual method

Argument

device Capability<Device_interface>

Use this method to relax the resource-allocation of the Platform session.


Platform Session alloc_dma_buffer

Allocate memory suitable for DMA pure virtual method

Arguments

- size_t

- Cache

Return value
Ram_dataspace_capability

Platform Session free_dma_buffer

Free previously allocated DMA memory pure virtual method

Argument

- Ram_dataspace_capability

Platform Session dma_addr


pure virtual method
Argument

- Ram_dataspace_capability

Return value
addr_t The bus address of the previously allocated DMA memory


8.6.8. Block session interface

The block-session interface provides block-device-level access to persistent storage
(Section 4.5.9). At the client side, the Block::Connection provides a job interface to
issue arbitrary block requests to a server in an asynchronous fashion. At the server
side, the Block::Request_stream API facilitates the implementation of asynchronous
block services such as block-device drivers. Both client and server share the same
notion of a Block::Request.
Block Operation
class
Block::Operation

valid() : bool
has_payload(. . .) : bool
type_name(. . .) : char const *
print(. . .)

Accessor
valid bool

Header
repos/os/include/block/request.h

Block Operation has_payload


class function
Argument

type Type

Return value
bool

Block Operation type_name


class function
Argument

type Type

Return value
char const *


Block Operation print


const method
Argument

out Output &

Block Request
class
Block::Request

Header
repos/os/include/block/request.h

Client-side block connection

Block Connection
class template
Connection <Session> Session_client

JOB
Block::Connection

Connection(. . .)
sigh(. . .)
update_jobs(. . .) : bool
dissolve_all_jobs(. . .)

Template argument

JOB typename
Default is void

Header
repos/os/include/block_session/connection.h


Block Connection Connection


constructor
Arguments

env Env &

tx_block_alloc Range_allocator *

tx_buf_size size_t
Size of transmission buffer in bytes
Default is 128*1024
label Label const &
Default is Label()

Block Connection sigh

Register handler for data-flow signals method

Argument

sigh Signal_context_capability

The handler is triggered on the arrival of new acknowledgements or when the server
becomes ready for new requests. It is thereby able to execute update_jobs on these
conditions.
Block Connection update_jobs

Handle the submission and completion of block-operation jobs method

Argument

policy auto &

Return value
bool True if progress was made

Block Connection dissolve_all_jobs

Call fn with each job as argument method

Argument

fn auto const &

This method is intended for the destruction of the jobs associated with the connection
before destructing the Connection object.

Server-side block-request stream  To simplify the implementation of robust and
scalable block servers, there exists a server-side block-session front end called
block-request stream. It is designed with the following considerations:

• It anticipates the asynchronous operation of block servers by default. Using the
API, such servers - in particular block-device drivers - can be implemented as
state machines triggered by client requests and device interrupts.

• It reinforces the memory safety of the server code by not returning any pointers
or references.

• It relieves server developers of handling special cases (like a congested
acknowledgement queue) while being flexible enough to accommodate different
categories of components like drivers, resource multiplexers (part_block), and
bump-in-the-wire components in a natural way.

• It naturally supports the batching of requests as well as zero-copy (device DMA
directly into the client’s communication buffer).

The use of the block-request stream API is illustrated by the example at
os/src/test/block_request_stream/.

Block Request_stream
Stream of block-operation requests class

Block::Request_stream

Request_stream(. . .)
~Request_stream()
tx_cap() : Genode::Capability<Block::Session::Tx>
info() : Block::Session::Info
with_payload(. . .)
with_content(. . .)

Accessor
info Block::Session::Info

Header
repos/os/include/block/request_stream.h


Block Request_stream Request_stream


constructor
Arguments

rm Region_map &

ds Dataspace_capability

ep Entrypoint &

sigh Signal_context_capability

info Info const

Block Request_stream tx_cap


method
Return value
Genode::Capability<Block::Session::Tx>

Block Request_stream with_payload

Call functor fn with Payload interface as argument const method

Argument

fn auto const &

The Payload interface allows the functor to access the content of a request by calling
Payload::with_content.

Block Request_stream with_content

Call functor fn with the pointer and size to the request content const method

Arguments

request Request const &

fn auto const &

This is a wrapper for Payload::with_content. It is convenient in situations where the


Payload interface does not need to be propagated as argument.


8.6.9. Timer session interface

Timer Session
Timer session interface class

Session

Timer::Session

~Session()
trigger_once(. . .)
trigger_periodic(. . .)
RPC interface
sigh(. . .)
elapsed_ms() : uint64_t
elapsed_us() : uint64_t
msleep(. . .)
usleep(. . .)

Accessor
elapsed_us uint64_t

Header
repos/base/include/timer_session/timer_session.h

Timer Session trigger_once

Program single timeout (relative from now in microseconds) pure virtual method

Argument

us uint64_t

Timer Session trigger_periodic

Program periodic timeout (in microseconds) pure virtual method

Argument

us uint64_t

The first period will be triggered after us at the latest, but it might be triggered earlier
as well. The us value 0 disables periodic timeouts.


Timer Session sigh

Register timeout signal handler pure virtual method

Argument

sigh Signal_context_capability

Timer Session elapsed_ms


pure virtual const method
Return value
uint64_t Number of elapsed milliseconds since session creation

Timer Session msleep

Client-side convenience method for sleeping the specified number of milliseconds pure virtual method
Argument

ms uint64_t

Timer Session usleep

Client-side convenience method for sleeping the specified number of microseconds pure virtual method
Argument

us uint64_t


Timer Connection
Connection to timer service and timeout scheduler class

Connection <Session> Session_client

Timer::Connection

Connection(. . .)
Connection(. . .)
~Connection()
sigh(. . .)
usleep(. . .)
msleep(. . .)
curr_time() : Duration

Header
repos/base/include/timer_session/connection.h

Multiplexes a timer session amongst different timeouts.


Timer Connection Connection
constructor
Arguments

env Env &


Environment used for construction (e. g. quota trading)
ep Entrypoint &
Entrypoint used as timeout handler execution context
label Label const &
Optional label used in session routing
Default is Label()

Timer Connection Connection

Convenience constructor wrapper using the environment’s entrypoint as timeout handler execution context constructor
Arguments

env Env &

label Label const &


Default is Label()


Timer Connection sigh


method
Argument

sigh Signal_context_capability

Timer Connection usleep


method
Argument

us uint64_t

Timer Connection msleep


method
Argument

ms uint64_t

The Periodic_timeout and One_shot_timeout classes provide a convenient API
for implementing timeout handlers, following the same pattern as used for signal
handlers (Section 8.14).
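The following sketch shows the typical structure of a component that drives a state
machine from a periodic timeout. The Main type and the chosen period are
assumptions of the example, not part of the interface.

  struct Main
  {
      Genode::Env &_env;

      Timer::Connection _timer { _env };

      /* called in the context of the entrypoint whenever the period expires */
      void _handle_timeout(Genode::Duration curr_time)
      {
          Genode::log("elapsed: ", curr_time.trunc_to_plain_ms().value, " ms");
      }

      Timer::Periodic_timeout<Main> _timeout {
          _timer, *this, &Main::_handle_timeout,
          Genode::Microseconds { 250*1000 } };

      Main(Genode::Env &env) : _env(env) { }
  };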
Timer Periodic_timeout
class template
Periodic timeout that is linked to a custom handler, scheduled when constructed

HANDLER
Timer::Periodic_timeout

handle_timeout(. . .)

Template argument

HANDLER typename

Header
repos/base/include/timer_session/connection.h


Timer One_shot_timeout
One-shot timeout that is linked to a custom handler, scheduled manually class template

HANDLER
Timer::One_shot_timeout

Template argument

HANDLER typename

Header
repos/base/include/timer_session/connection.h


8.6.10. NIC and uplink session interfaces

Nic Session
NIC session interface class

Session

Nic::Session

~Session()
mac_address() : Mac_address
tx_channel() : Tx *
RPC interface
rx_channel() : Rx *
tx() : Tx::Source *
rx() : Rx::Sink *
link_state() : bool
link_state_sigh(. . .)

Header
repos/os/include/nic_session/nic_session.h

Nic Session mac_address

Request MAC address of network adapter pure virtual method

Return value
Mac_address

Nic Session tx_channel

Request packet-transmission channel virtual method

Return value
Tx *

Nic Session rx_channel

Request packet-reception channel virtual method

Return value
Rx *


Nic Session tx

Request client-side packet-stream interface of tx channel virtual method

Return value
Tx::Source *

Nic Session rx

Request client-side packet-stream interface of rx channel virtual method

Return value
Rx::Sink *

Nic Session link_state

Request current link state of network adapter (true means link detected) pure virtual method
Return value
bool

Nic Session link_state_sigh

Register signal handler for link state changes pure virtual method

Argument

sigh Signal_context_capability

Nic Connection
Connection to NIC service class

Connection <Session> Session_client

Nic::Connection

Connection(. . .)

Header
repos/os/include/nic_session/connection.h


Nic Connection Connection


constructor
Arguments

env Env &

tx_block_alloc Range_allocator *

tx_buf_size size_t
Size of transmission buffer in bytes
rx_buf_size size_t
Size of reception buffer in bytes
label Label const &
Default is Label()

Uplink Session
Uplink session interface class

Session

Uplink::Session

~Session()
tx_channel() : Tx * RPC interface
rx_channel() : Rx *
tx() : Tx::Source *
rx() : Rx::Sink *

Header
repos/os/include/uplink_session/uplink_session.h


Uplink Session tx_channel

Request packet-transmission channel virtual method

Return value
Tx *

Uplink Session rx_channel

Request packet-reception channel virtual method

Return value
Rx *

Uplink Session tx

Request client-side packet-stream interface of tx channel virtual method

Return value
Tx::Source *

Uplink Session rx

Request client-side packet-stream interface of rx channel virtual method

Return value
Rx::Sink *

Uplink Connection
Connection to Uplink service class

Connection <Session> Session_client

Uplink::Connection

Connection(. . .)

Header
repos/os/include/uplink_session/connection.h


Uplink Connection Connection


constructor
Arguments

env Env &

tx_block_alloc Range_allocator *

tx_buf_size size_t
Size of transmission buffer in bytes
rx_buf_size size_t
Size of reception buffer in bytes
mac_address Mac_address const &

label Label const &


Default is Label()


8.6.11. Record and play session interfaces

The record and play session (Section 4.5.13) interfaces allow for the streaming of au-
dio data between components. Each session corresponds to a channel. Hence, stereo
output requires two play sessions. Both services are typically provided by a mixer that
routes data streams from play clients to record clients. Sample values are represented
as floating-point numbers and communicated over shared memory.

Play Session
Audio-play session interface class

Session

Play::Session RPC interface

Header
repos/os/include/play_session/play_session.h

Play Connection
Connection to audio-play service class

Connection <Session> Rpc_client <Session>

Play::Connection

Connection(. . .)
schedule_and_enqueue(. . .) : Time_window
enqueue(. . .)
stop()

Header
repos/os/include/play_session/connection.h


Play Connection Connection


constructor
Arguments

env Env &

label Label const &


Default is Label()

Play Connection schedule_and_enqueue

Schedule playback of data after the given previous time window method

Arguments

previous Time_window
Time window returned by previous call, or a default-constructed
Time_window when starting

duration Duration
Duration of the sample data in microseconds
fn auto const &
Functor to be repeatedly called with float sample values

Return value
Time_window

The sample rate depends on the given duration and the number of fn calls in the scope
of the schedule call. Note that the duration is evaluated only as a hint when starting a
new playback. During continuous playback, the duration is inferred by the rate of the
periodic schedule_and_enqueue calls.
Play Connection enqueue

Passively enqueue data for playback at the given time window method
Arguments

time_window Time_window

fn auto const &

In contrast to schedule_and_enqueue, this method does not allocate a new time
window but schedules sample data for an already known time window. It is designed
for the synchronized playback of multiple audio channels where each channel is a
separate play session.
One channel (e. g., the left) drives the allocation of time windows using
schedule_and_enqueue whereas the other channels (e. g., the right) merely submit
audio data for the already allocated time windows using enqueue. This way, only one
RPC per period is needed to drive the synchronized output of an arbitrary number of
channels.
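As an illustration, a stereo source may drive both sessions as sketched below. The
member names, the duration value (in microseconds), and the shape of the
sample-submitting functor are assumptions drawn from the descriptions above, not
verbatim API definitions.

  Play::Connection _left  { _env, "left"  };
  Play::Connection _right { _env, "right" };

  Play::Time_window _tw { };  /* default-constructed when starting playback */

  void _play_period(float const *l, float const *r, unsigned n)
  {
      /* the left channel allocates the time window for this period */
      _tw = _left.schedule_and_enqueue(_tw, { 10*1000 },
          [&] (auto &submit) {
              for (unsigned i = 0; i < n; i++) submit(l[i]); });

      /* the right channel enqueues data for the very same time window */
      _right.enqueue(_tw, [&] (auto &submit) {
          for (unsigned i = 0; i < n; i++) submit(r[i]); });
  }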
Play Connection stop

Inform the server that no further data is expected method

By calling stop, the client allows the server to distinguish the (temporary) end of play-
back from jitter.


Record Session
Audio-record session interface class

Session

Record::Session RPC interface

Header
repos/os/include/record_session/record_session.h

Record Connection
Connection to an audio-record service class

Connection <Session> Rpc_client <Session>

Record::Connection

Connection(. . .)
wakeup_sigh(. . .)
record(. . .)
record_at(. . .)

Header
repos/os/include/record_session/connection.h


Record Connection Connection


constructor
Arguments

env Env &

label Session_label const &


Default is Session_label()

Record Connection wakeup_sigh

Register signal handler to be notified when new data becomes available after depletion method
Argument

sigh Signal_context_capability

Record Connection record

Record the specified number of audio samples method

Arguments

n Num_samples

fn auto const &


Called with the Time_window and Samples_ptr const & of the recording
depleted_fn auto const &
Called when no sample data is available

Subsequent record calls result in consecutive time windows.


Record Connection record_at

Record the specified number of audio samples at the given time window method
Arguments

tw Time_window

n Num_samples

fn auto const &

By using the time window returned by record as argument for record_at, a user of
multiple sessions (e. g., for left and right) can obtain sample data synchronized between
the sessions.
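For illustration, a stereo sink may use two sessions as sketched below, with the left
session driving the time windows. The functor signatures follow the argument
descriptions above; the member names are assumptions of the sketch.

  Record::Connection _left  { _env, "left"  };
  Record::Connection _right { _env, "right" };

  void _capture(Record::Num_samples n)
  {
      _left.record(n,
          [&] (Record::Time_window tw, Record::Samples_ptr const &samples) {

              /* consume left-channel samples, then fetch the right channel
                 for the very same time window */
              _right.record_at(tw, n,
                  [&] (Record::Samples_ptr const &samples) {
                      /* consume right-channel samples */ });
          },
          [&] () { /* depleted, try again after the wakeup signal */ });
  }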


8.6.12. File-system session interface

The file-system session (Section 4.5.14) interface provides the means to store and
retrieve data in the form of files organized in a hierarchic directory structure. Directory
operations are performed via a synchronous RPC interface whereas the actual read and
write operations are performed asynchronously using a packet stream.

File_system
namespace
File-system session interface
Types

Node_handle is defined as Node::Id


File_handle is defined as File::Id
Dir_handle is defined as Directory::Id
Symlink_handle is defined as Symlink::Id
Watch_handle is defined as Watch::Id
Genode::size_t is an enumeration type
seek_off_t is defined as Genode::uint64_t
file_size_t is defined as Genode::uint64_t
Out_of_ram is defined as Genode::Out_of_ram
Out_of_caps is defined as Genode::Out_of_caps
Mode is an enumeration type
Flags as supplied to file, dir, and symlink calls

Name is defined as Genode::Rpc_in_buffer<MAX_NAME_LEN>


Path is defined as Genode::Rpc_in_buffer<MAX_PATH_LEN>
Exception is a subtype of Genode::Exception

Header
repos/os/include/file_system_session/file_system_session.h

A file-system client issues read or write requests by submitting packet descriptors to
the file-system session’s packet stream. Each packet descriptor contains all parameters
of the transaction including the type of operation, the seek offset, and the length.


File_system Packet_descriptor
class
Packet_descriptor

File_system::Packet_descriptor

Packet_descriptor(. . .)
Packet_descriptor(. . .)
Packet_descriptor(. . .)
Packet_descriptor(. . .)
handle() : Node_handle
operation() : Opcode
position() : seek_off_t
length() : size_t
succeeded() : bool
with_timestamp(. . .)
succeeded(. . .)
length(. . .)

Accessors
handle Node_handle
operation Opcode
position seek_off_t
length size_t
succeeded bool

Header
repos/os/include/file_system_session/file_system_session.h


File_system Packet_descriptor Packet_descriptor


constructor
Arguments

buf_offset off_t
Default is 0
buf_size size_t
Default is 0

File_system Packet_descriptor Packet_descriptor


constructor
Arguments

p Packet_descriptor

handle Node_handle

op Opcode

length size_t

position seek_off_t
Seek offset in bytes
Default is SEEK_TAIL

Note that if position is set to SEEK_TAIL, read operations will read length bytes from
the end of the file while write operations will append length bytes at the end of the file.
File_system Packet_descriptor Packet_descriptor
constructor
Arguments

handle Node_handle

op Opcode

This constructor is provided for sending server-side notification packets.


File_system Packet_descriptor Packet_descriptor


constructor
Arguments

p Packet_descriptor

handle Node_handle

op Opcode

mtime Timestamp const &

File_system Packet_descriptor with_timestamp


const method
Argument

fn auto const &

File_system Packet_descriptor succeeded


method
Argument

b bool

File_system Packet_descriptor length


method
Argument

length size_t


File_system Session
class
Session

File_system::Session

~Session()
tx() : Tx::Source *
file(. . .) : File_handle
symlink(. . .) : Symlink_handle
dir(. . .) : Dir_handle
node(. . .) : Node_handle
RPC interface
watch(. . .) : Watch_handle
close(. . .)
status(. . .) : Status
control(. . .)
unlink(. . .)
truncate(. . .)
move(. . .)
num_entries(. . .) : unsigned

Header
repos/os/include/file_system_session/file_system_session.h


File_system Session tx

Request client-side packet-stream interface of tx channel virtual method

Return value
Tx::Source *

File_system Session file

Open or create file pure virtual method

Arguments

- Dir_handle

name Name const &

- Mode

create bool

Exceptions

Invalid_handle Directory handle is invalid


Invalid_name File name contains invalid characters
Lookup_failed The name refers to a node other than a file
Node_already_exists File cannot be created because a node with the same name al-
ready exists
No_space Storage exhausted
Out_of_ram Server cannot allocate metadata
Out_of_caps
Permission_denied
Unavailable Directory vanished

Return value
File_handle


File_system Session symlink

Open or create symlink pure virtual method

Arguments

- Dir_handle

name Name const &

create bool

Exceptions

Invalid_handle Directory handle is invalid


Invalid_name Symlink name contains invalid characters
Lookup_failed The name refers to a node other than a symlink
Node_already_exists Symlink cannot be created because a node with the same name
already exists
No_space Storage exhausted
Out_of_ram Server cannot allocate metadata
Out_of_caps
Permission_denied
Unavailable Directory vanished

Return value
Symlink_handle


File_system Session dir

Open or create directory pure virtual method

Arguments

path Path const &

create bool

Exceptions

Lookup_failed Path lookup failed because one element of path does not exist
Name_too_long path is too long
Node_already_exists Directory cannot be created because a node with the same
name already exists
No_space Storage exhausted
Out_of_ram Server cannot allocate metadata
Out_of_caps
Permission_denied

Return value
Dir_handle

File_system Session node

Open existing node pure virtual method

Argument

path Path const &

Exceptions

Lookup_failed Path lookup failed because one element of path does not exist
Out_of_ram Server cannot allocate metadata
Out_of_caps

Return value
Node_handle

The returned node handle can be used merely as argument for status.


File_system Session watch

Watch a node for changes pure virtual method

Argument

path Path const &

Exceptions

Lookup_failed Path lookup failed because one element of path does not exist
Out_of_ram Server cannot allocate metadata
Out_of_caps
Unavailable File-system is static or does not support notifications

Return value
Watch_handle

When changes are made to the node at this path, a CONTENT_CHANGED packet will
be sent from the server to the client.
The returned node handle is used to identify notification packets.
File_system Session close

Close file pure virtual method


Argument

- Node_handle

Exception

Invalid_handle Node handle is invalid

File_system Session status

Request information about an open file or directory pure virtual method

Argument

- Node_handle

Exceptions

Invalid_handle Node handle is invalid


Unavailable Node vanished

Return value
Status


File_system Session control

Set information about an open file or directory pure virtual method

Arguments

- Node_handle

- Control

Exceptions

Invalid_handle Node handle is invalid


Unavailable Node vanished

File_system Session unlink

Delete file or directory pure virtual method

Arguments

dir Dir_handle

name Name const &

Exceptions

Invalid_handle Directory handle is invalid


Invalid_name name contains invalid characters
Lookup_failed Lookup of name in dir failed
Not_empty Argument is a non-empty directory and the backend does not
support recursion
Permission_denied
Unavailable Directory vanished

File_system Session truncate

Truncate or grow file to specified size pure virtual method

Arguments

- File_handle

size file_size_t

Exceptions

Invalid_handle Node handle is invalid


No_space New size exceeds free space
Permission_denied Node modification not allowed
Unavailable Node vanished


File_system Session move

Move and rename directory entry pure virtual method

Arguments

- Dir_handle

from Name const &

- Dir_handle

to Name const &

Exceptions

Invalid_handle A directory handle is invalid


Invalid_name to contains invalid characters
Lookup_failed from not found
Permission_denied Node modification not allowed
Unavailable A directory vanished

File_system Session num_entries


pure virtual method
Argument

- Dir_handle

Exception

Invalid_handle The directory handle is invalid

Return value
unsigned Number of directory entries


File_system Connection
The base implementation of a File_system connection class

Connection <Session> Session_client

File_system::Connection

decltype(. . .)
Connection(. . .)
dir(. . .) : Dir_handle
file(. . .) : File_handle
symlink(. . .) : Symlink_handle
node(. . .) : Node_handle
watch(. . .) : Watch_handle

Header
repos/os/include/file_system_session/connection.h


File_system Connection decltype


constructor
Argument

- fn()

File_system Connection Connection


constructor
Arguments

env Env &

tx_block_alloc Range_allocator &

label Label const &


Session label
Default is Label()
root char const *
Root directory of session
Default is "/"
writeable bool
Session is writeable
Default is true
tx_buf_size size_t
Size of transmission buffer in bytes
Default is DEFAULT_TX_BUF_SIZE
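For illustration, a client may open a file for reading as follows. The heap-backed
packet-stream allocator, the session label, and the file name are assumptions of the
sketch; handling of the exceptions listed above is omitted.

  Genode::Heap          heap     { env.ram(), env.rm() };
  Genode::Allocator_avl tx_alloc { &heap };

  File_system::Connection fs { env, tx_alloc, "report" };

  File_system::Dir_handle  dir  = fs.dir("/", false);
  File_system::File_handle file = fs.file(dir, "data.txt",
                                          File_system::READ_ONLY, false);

  /* ... submit read packets to the packet stream obtained via fs.tx() ... */

  fs.close(file);
  fs.close(dir);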


The file-system session’s status and control operations use the compound struc-
tures Status and Control as arguments. The format of the data retrieved by reading a
directory node is defined by the Directory_entry.

File_system Status
class
File_system::Status

directory() : bool
symlink() : bool

Header
repos/os/include/file_system_session/file_system_session.h

File_system Status directory


const method
Return value
bool True if node is a directory

File_system Status symlink


const method
Return value
bool True if node is a symbolic link

File_system Control
class
File_system::Control

Header
repos/os/include/file_system_session/file_system_session.h


File_system Directory_entry
Data structure returned when reading from a directory node class

File_system::Directory_entry

sanitize()

Header
repos/os/include/file_system_session/file_system_session.h

File_system Directory_entry sanitize

Sanitize object received from a file-system server as plain bytes method


8.6.13. Pin state and control session interfaces

A pin-state or pin-control session (Section 4.5.8) allows its client to interact with an
individual GPIO pin. The pin is referred to by the client’s session label.

Pin_state Session
Session interface to obtain a GPIO pin state class

Session

Pin_state::Session
RPC interface
state() : bool

Accessor
state bool

Header
repos/os/include/pin_state_session/pin_state_session.h

Pin_state Connection
Connection to pin-state service class

Pin_state::Connection

Connection(. . .)
state() : bool

Accessor
state bool

Header
repos/os/include/pin_state_session/connection.h


Pin_state Connection Connection


constructor
Arguments

env Env &

label Label const &


Default is Label()

Pin control The state RPC function sets the digital output to high or low, which is
the most common use.
To accommodate the time-multiplexed operation of a pin as output and input, the
yield RPC function switches the pin to high-impedance. It can thereby be driven as
input pin and its state can be monitored via a separate pin-state session referring to the
same pin as the pin-control session. This is useful for implementing two-wire protocols
like I2C in software.
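The interplay of both session types may look as follows. The session labels are
assumptions of the sketch.

  Pin_control::Connection _control { env, "gpio2" };
  Pin_state::Connection   _sense   { env, "gpio2" };

  void _bit_bang()
  {
      _control.state(true);    /* drive the pin high       */
      _control.state(false);   /* drive the pin low        */

      _control.yield();        /* switch to high impedance */

      bool const level = _sense.state();  /* observe externally driven state */
      (void)level;
  }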
Pin_control Session
Session interface to control a GPIO pin class

Session

Pin_control::Session

RPC interface
state(. . .)
yield()

Header
repos/os/include/pin_control_session/pin_control_session.h

Pin_control Session state


pure virtual method
Argument

- bool

Pin_control Session yield


pure virtual method


Pin_control Connection
Connection to pin-control service class

Pin_control::Connection

Connection(. . .)
state(. . .)
yield()

Header
repos/os/include/pin_control_session/connection.h

Pin_control Connection Connection


constructor
Arguments

env Env &

label Label const &


Default is Label()


8.7. Fundamental types

8.7.1. Integer types

Genode provides common integer types in its namespace. Integer types that can be
derived from built-in compiler types are defined in base/stdint.h and base/fixed_stdint.h.
Whereas the former is independent of the machine type, the latter differs between
32-bit and 64-bit architectures.
Genode
namespace
Integer types
Types

size_t is defined as unsigned long


Integer type for non-negative size values

addr_t is defined as unsigned long


Integer type for memory addresses

off_t is defined as long


Integer type for memory offset values

umword_t is defined as unsigned long


Integer type corresponding to a machine register

Header
repos/base/include/base/stdint.h


The fixed-width integer types for 32-bit architectures are defined as follows.

Fixed-width integer types for 32-bit architectures root namespace


Types

genode_int8_t is defined as signed char


genode_uint8_t is defined as unsigned char
genode_int16_t is defined as signed short
genode_uint16_t is defined as unsigned short
genode_int32_t is defined as signed
genode_uint32_t is defined as unsigned
genode_int64_t is defined as signed long long
genode_uint64_t is defined as unsigned long long

Header
repos/base/include/spec/32bit/base/fixed_stdint.h

Genode
namespace
Fixed-width integer types for 32-bit architectures
Types

int8_t is defined as genode_int8_t


uint8_t is defined as genode_uint8_t
int16_t is defined as genode_int16_t
uint16_t is defined as genode_uint16_t
int32_t is defined as genode_int32_t
uint32_t is defined as genode_uint32_t
int64_t is defined as genode_int64_t
uint64_t is defined as genode_uint64_t

Header
repos/base/include/spec/32bit/base/fixed_stdint.h


The fixed-width integer types for 64-bit architectures are defined as follows.

Fixed-width integer types for 64-bit architectures root namespace


Types

genode_int8_t is defined as signed char


genode_uint8_t is defined as unsigned char
genode_int16_t is defined as signed short
genode_uint16_t is defined as unsigned short
genode_int32_t is defined as signed
genode_uint32_t is defined as unsigned
genode_int64_t is defined as signed long long
genode_uint64_t is defined as unsigned long long

Header
repos/base/include/spec/64bit/base/fixed_stdint.h

Genode
namespace
Fixed-width integer types for 64-bit architectures
Types

int8_t is defined as genode_int8_t


uint8_t is defined as genode_uint8_t
int16_t is defined as genode_int16_t
uint16_t is defined as genode_uint16_t
int32_t is defined as genode_int32_t
uint32_t is defined as genode_uint32_t
int64_t is defined as genode_int64_t
uint64_t is defined as genode_uint64_t

Header
repos/base/include/spec/64bit/base/fixed_stdint.h


8.7.2. Exception types

Genode facilitates the use of exceptions to signal errors but it uses exception types
only as a textual expression of the error code and for grouping errors. By convention,
exceptions carry no payload. For code consistency, exception types should inherit from
the Exception base class.

Genode Exception
Exception base class class

Exception

Header
repos/base/include/base/exception.h

8.7.3. Exception-less error handling

Genode generally employs C++ exceptions for propagating errors, which is true to the
language. The rationale behind the use of exceptions is given in Section 7.2.1.
However, as the mechanics of C++ exceptions are built upon a baseline of functionality
like the presence of a memory allocator for allocating exception headers, no low-level
code path involved in memory allocation can depend on the exception mechanism.
With exceptions not being available as a means to reflect error conditions for those
parts of the framework, an alternative error-handling mechanism is required.
Traditional approaches like overloading return values with error codes, or the use of
out parameters to carry return values, are error prone. Genode’s Attempt utility
provides a safe alternative in the spirit of option types.


Genode Attempt
Option type for return values class template

RESULT,
ERROR
Attempt

Attempt(. . .)
Attempt(. . .)

Template arguments

RESULT typename

ERROR typename

Header
repos/base/include/util/attempt.h

The Attempt type addresses the C++ limitation of only a single return value, which
makes it difficult to propagate error information in addition to an actual return value
from a called function back to the caller. Hence, errors have to be propagated as
exceptions, results have to be returned via out parameters, or error codes have to be
encoded in the form of magic values. Each of these approaches creates its own set of
robustness problems.
An Attempt represents the result of a function call that is either a meaningful value or
an error code, but never both. The result value and error are distinct types. To consume
the return value of a call, the caller needs to specify two functors, one for handling the
value if the value exists (the call was successful), and one for handling the error value
if the call failed. Thereby the use of an Attempt return type reinforces the explicit
handling of all possible error conditions at the caller site.
Genode Attempt Attempt
constructor
Argument

result RESULT

Genode Attempt Attempt


constructor
Argument

error ERROR


Its name reflects its designated use as a carrier for return values. To illustrate its use,
here is a slightly abbreviated snippet of Genode’s Ram_allocator interface:

  struct Ram_allocator : Interface
  {
      enum class Alloc_error { OUT_OF_RAM, OUT_OF_CAPS, DENIED };

      using Alloc_result = Attempt&lt;Ram_dataspace_capability, Alloc_error&gt;;

      virtual Alloc_result try_alloc(size_t size) = 0;

      ...
  };

The Alloc_error type models the possible error conditions, which would normally
be represented as exception types. The Alloc_result type describes the return value
of the try_alloc method by using the Attempt utility. Whenever try_alloc succeeds,
the value will hold a capability (referring to a valid RAM dataspace). Otherwise, it will
hold an error value of type Alloc_error.
At the caller side, the Attempt utility is extremely rigid. The caller can access the
value only when providing both a handler for the value and a handler for the error
code. For example, with ram being a reference to a Ram_allocator, a call to try_alloc
may look like this:


  ram.try_alloc(size).with_result(

      [&amp;] (Ram_dataspace_capability ds) {
          ...
      },
      [&amp;] (Alloc_error error) {

          switch (error) {

          case Alloc_error::OUT_OF_RAM:
              _request_ram_from_parent(size);
              break;

          case Alloc_error::OUT_OF_CAPS:
              _request_caps_from_parent(4);
              break;

          case Alloc_error::DENIED:
              ...
              break;
          }
      });

Which of the two lambda functions gets called depends on the success of the try_alloc
call. The value returned by try_alloc is only reachable by the code in the scope of
the first lambda function. The code within this scope can rely on the validity of the
argument.
By expressing error codes as an enum class, we let the compiler assist us to cover all
possible error cases (using switch). This is a nice benefit over the use of exceptions,
which are unfortunately not covered by function/method signatures. By using the
Attempt utility, we implicitly tie functions together with their error conditions using
C++ types. As another benefit over catch handlers, the use of switch allows us to share
error-handling code for different conditions by grouping case statements.
Note that in the example above, the valid ds object cannot leave the scope of its
lambda function. Sometimes, however, we need to pass a return value along a chain of
callers. This situation is covered by the Attempt::convert method. Analogously
to with_result, it takes two lambda functions as arguments. But in contrast to
with_result, both lambda functions return a value of the same type. This naturally
confronts the programmer with the question of how to convert all possible errors to
this specific type. If this question cannot be answered for all error cases, the design
of the code is most likely flawed.
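Continuing the example above, a use of convert may look like the following sketch,
which maps all error cases to an invalid capability:

  Ram_dataspace_capability alloc_or_invalid(Ram_allocator &amp;ram, size_t size)
  {
      return ram.try_alloc(size).convert&lt;Ram_dataspace_capability&gt;(

          [&amp;] (Ram_dataspace_capability ds) { return ds; },

          [&amp;] (Ram_allocator::Alloc_error) {
              /* all error conditions collapse into an invalid capability */
              return Ram_dataspace_capability(); });
  }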


8.7.4. C++ supplements

Genode Noncopyable
Classes of objects not allowed to be copied should inherit from this one class

Noncopyable

Header
repos/base/include/util/noncopyable.h

This class declares a private copy constructor and assignment operator. It is sufficient
to inherit privately from this class and let the compiler detect any copy violations.


8.8. Data structures

The framework API features a small library of data structures that exist solely to
support the framework implementation. They have been designed under the following
considerations:

• They should be as simple as possible to make them easy to evaluate for
correctness. Low complexity takes precedence over performance.

• Each data structure provides a rigid interface that is targeted at a specific use
case and does not attempt to be a power tool. For example, the Fifo deliberately
provides no way to randomly access elements, and the List merely provides
the functionality to remember elements but lacks list-manipulation operations.

• Data structures perform no hidden anonymous memory allocations by storing
meta data intrusively. This is a precondition for allowing resource multiplexers
and runtime environments to properly account for their local memory allocations.
Section 3.3.3 provides the rationale behind the need for full control over memory
allocations.


8.8.1. List and registry

Most book-keeping tasks in Genode rely on single-connected lists, which use the List
template.

Genode List
Single-connected list class template

LT
List

List()
first() : LT *
first() : LT const *
insert(. . .)
remove(. . .)

Template argument

LT typename
List element type

Accessor
first LT const *

Header
repos/base/include/util/list.h

Genode List first


method
Return value
LT * First list element

Genode List insert

Insert element after specified element into list method

Arguments

le LT const *
List element to insert
at LT const *
Target position (preceding list element)
Default is 0


Genode List remove

Remove element from list method


Argument

le LT const *
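A minimal sketch of an intrusive list may look as follows. The Item type is an
assumption of the example.

  struct Item : Genode::List&lt;Item&gt;::Element
  {
      unsigned const value;
      Item(unsigned value) : value(value) { }
  };

  void example()
  {
      Genode::List&lt;Item&gt; list { };
      Item a { 1 }, b { 2 };

      list.insert(&amp;a);        /* insert at list head      */
      list.insert(&amp;b, &amp;a);    /* insert after element 'a' */

      for (Item *i = list.first(); i; i = i-&gt;next())
          Genode::log(i-&gt;value);

      list.remove(&amp;b);
  }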

Genode List_element
Helper for using member variables as list elements class template

List <List_element<T> >

T
List_element

List_element(. . .)
object() : T *

Template argument

T typename
Type of compound object to be organized in a list

Accessor
object T *

Header
repos/base/include/util/list.h

This helper allows the creation of lists that use member variables to connect their ele-
ments. This way, the organized type does not need to publicly inherit List&lt;LT&gt;::Element.
Furthermore, objects can easily be organized in multiple lists by embedding multiple
List_element member variables.
Genode List_element List_element
constructor
Argument

object T *

Registry Most commonly, lists are used as containers that solely remember dynam-
ically created objects. In this use case, the lifetime of the remembered object is tightly
coupled with its presence in the list. The Registry class template represents a safe
wrapper around the raw list that ensures this presumption and thereby eliminates
classes of list-related bugs by design, e. g., double insertion, missing removal.
An object type to be remembered in a Registry inherits the Registry::Element
base class, which takes its registry and a reference to the object itself as arguments.
Thereby, the relationship of the object with its corresponding registry is fixed at the
object’s construction time. Once constructed, it is implicitly part of the registry. The
registry provides a for_each method to iterate over its elements. Unlike the traversal
of a raw list, this for_each operation is thread safe. It also supports the safe destruction
of the currently visited object.
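A sketch of this pattern, with hypothetical type names, may look as follows:

  struct Session_state
  {
      Genode::Registry&lt;Session_state&gt;::Element _element;

      /* enrolls the object in the registry for its whole lifetime */
      Session_state(Genode::Registry&lt;Session_state&gt; &amp;registry)
      : _element(registry, *this) { }
  };

  void destroy_all(Genode::Registry&lt;Session_state&gt; &amp;registry,
                   Genode::Allocator &amp;alloc)
  {
      /* safe although each visited element is destructed on the spot */
      registry.for_each([&amp;] (Session_state &amp;s) {
          Genode::destroy(alloc, &amp;s); });
  }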

Genode Registry
Thread-safe object registry class template

T
Registry

for_each(. . .)
for_each(. . .)

Template argument

T typename

Header
repos/base/include/base/registry.h

Genode Registry for_each


method
Argument

fn auto const &

Genode Registry for_each


const method
Argument

fn auto const &

As an alternative to the use of the Registry::Element base class, the Registered
and Registered_no_delete helper templates supplement arbitrary object types with
the ability to become registry elements. They wrap a given object type in a new type
whereby the original type remains untainted by the fact that its objects are kept in a
registry.

Genode Registered
Convenience helper to equip a type T with a Registry::Element class template

T
Registered

Registered(. . .)

Template argument

T typename

Header
repos/base/include/base/registry.h

Using this helper, an arbitrary type can be turned into a registry element type. E.g.,
in order to keep Child_service objects in a registry, a new registry-compatible type
can be created via Registered&lt;Child_service&gt;. Objects of this type can be kept in
a Registry&lt;Registered&lt;Child_service&gt; &gt;. The constructor of such “registered”
objects expects the registry as the first argument. The other arguments are forwarded
to the constructor of the enclosed type.
Genode Registered Registered
constructor
Arguments

registry Registry<Registered<T> > &

args auto &&...


Genode Registered_no_delete
Variant of Registered that does not require a vtable in the base class class template

T
Registered_no_delete

Registered_no_delete(. . .)

Template argument

T typename

Header
repos/base/include/base/registry.h

The generic Registered convenience class requires the base class to provide a vtable,
that is, a virtual destructor, for the safe deletion of a base-class pointer. By using
Registered_no_delete, this requirement can be lifted.

Genode Registered_no_delete Registered_no_delete


constructor
Arguments

registry Registry<Registered_no_delete<T> > &

args auto &&...


8.8.2. Fifo queue

Because the List inserts new list elements at the list head, it cannot be used for im-
plementing wait queues requiring first-in-first-out semantics. For such use cases, there
exists a dedicated Fifo template.
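A minimal sketch, with a hypothetical Job type:

  struct Job : Genode::Fifo&lt;Job&gt;::Element
  {
      unsigned const id;
      Job(unsigned id) : id(id) { }
  };

  void example()
  {
      Genode::Fifo&lt;Job&gt; queue { };
      Job j1 { 1 }, j2 { 2 };

      queue.enqueue(j1);
      queue.enqueue(j2);

      /* j1 is dequeued first (FIFO order) */
      queue.dequeue([&amp;] (Job &amp;job) { Genode::log("job ", job.id); });
  }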

Genode Fifo
First-in first-out (FIFO) queue class template

QT
Fifo

empty() : bool
Fifo()
head(. . .)
remove(. . .)
enqueue(. . .)
for_each(. . .)
dequeue(. . .)
dequeue_all(. . .)

Template argument

QT typename
Queue element type

Header
repos/base/include/util/fifo.h

Genode Fifo empty


const method
Return value
bool True if queue is empty

Genode Fifo head

Call fn of type void (QT&) the head element const method

Argument

fn auto const &


Genode Fifo remove

Remove element explicitly from queue method

Argument

qe QT &

Genode Fifo enqueue

Attach element at the end of the queue method

Argument

e QT &

Genode Fifo for_each

Call fn of type void (QT&) for each element in order const method

Argument

fn auto const &

Genode Fifo dequeue

Remove head and call fn of type void (QT&) method

Argument

fn auto const &

Genode Fifo dequeue_all

Remove all fifo elements method


Argument

fn auto const &

This method removes all elements in order and calls the lambda fn of type void (QT&amp;)
for each element. It is intended to be used prior to the destruction of the FIFO.


Genode Fifo_element
Helper for using member variables as FIFO elements class template

Fifo <Fifo_element<T> >

T
Fifo_element

Fifo_element(. . .)
object() : T &
object() : T const &

Template argument

T typename
Type of compound object to be organized in a FIFO

Accessor
object T const &

Header
repos/base/include/util/fifo.h

This helper allows the creation of FIFOs that use member variables to connect their el-
ements. This way, the organized type does not need to publicly inherit Fifo<QT>::Element.
Furthermore, objects can easily be organized in multiple FIFOs by embedding multiple
Fifo_element member variables.
Genode Fifo_element Fifo_element
constructor
Argument

object T &

Genode Fifo_element object


method
Return value
T &


8.8.3. AVL tree

For use cases where associative arrays are needed, such as allocators, there is a class
template for creating AVL trees. The tree-balancing mechanism is implemented in the
Avl_node_base class. The actual Avl_node and Avl_tree classes are tagged with the
element type, which ensures that each AVL tree hosts only one type of element.

Genode Avl_node_base
AVL tree class

Avl_node_base

Avl_node_base()
insert(. . .)
remove(. . .)

Header
repos/base/include/util/avl_tree.h

Genode Avl_node_base insert

Insert new node into subtree method


Arguments

node Avl_node_base *

policy Policy &

Genode Avl_node_base remove

Remove node from tree method


Argument

policy Policy &


Genode Avl_node
AVL node class template

Avl_node_base

NT
Avl_node

child(. . .) : NT *
for_each(. . .)

Template argument

NT typename
Type of the class derived from Avl_node

Header
repos/base/include/util/avl_tree.h

Each object to be stored in the AVL tree must be derived from Avl_node. The type
of the derived class is to be specified as template argument to enable Avl_node to call
virtual methods specific to the derived class.
The NT class must implement a method called higher that takes a pointer to another
NT object as argument and returns a bool value. The bool value is true if the specified
node is higher or equal in the tree order.
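A sketch of a node type keyed by an integer may look as follows (the Node type is
hypothetical):

  struct Node : Genode::Avl_node&lt;Node&gt;
  {
      unsigned const key;
      Node(unsigned key) : key(key) { }

      /* tree order as expected by Avl_node */
      bool higher(Node *n) const { return n-&gt;key &gt;= key; }
  };

  void example()
  {
      Genode::Avl_tree&lt;Node&gt; tree { };
      Node a { 10 }, b { 20 };

      tree.insert(&amp;a);
      tree.insert(&amp;b);

      /* visits the nodes in key order */
      tree.for_each([] (Node const &amp;n) { Genode::log(n.key); });
  }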
Genode Avl_node child
const method
Argument

i Side

Return value
NT * Child of specified side, or nullptr if there is no child

This method can be called by the NT objects to traverse the tree.


Genode Avl_node for_each

Apply a functor (read-only) to every node within this subtree const method

Argument

fn auto const &


Function that takes a const NT reference


Genode Avl_tree
Root node of the AVL tree class template

NT
Avl_tree

insert(. . .)
remove(. . .)
first() : NT *
for_each(. . .)

Template argument

NT typename

Header
repos/base/include/util/avl_tree.h

The real nodes are always attached at the left branch of this root node.
Genode Avl_tree insert

Insert node into AVL tree method


Argument

node Avl_node<NT> *

Genode Avl_tree remove

Remove node from AVL tree method


Argument

node Avl_node<NT> *

Genode Avl_tree first

Request first node of the tree const method

Return value
NT * First node, or nullptr if the tree is empty


Genode Avl_tree for_each

Apply a functor (read-only) to every node within the tree const method

Argument

fn auto const &


Function that takes a const NT reference

The iteration order corresponds to the order of the keys.


8.8.4. Dictionary

The Dictionary provides an easy and safe way to organize objects by using strings
as keys. It alleviates the need to manually use the bare-bones AVL tree for this
prominent use case. The Dictionary attains its safety from following the design
principle of the Registry. That is, elements are automatically added to the dictionary
at construction time and removed at destruction time. The method with_element
calls a functor with the element specified by its name as key, and the with_any_element
interface supports the orderly destruction of all dictionary items. These patterns limit
the exposure of individual dictionary elements to a local scope at the caller.
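A sketch of the pattern, with hypothetical types, may look as follows:

  using Name  = Genode::String&lt;64&gt;;

  struct User;
  using Users = Genode::Dictionary&lt;User, Name&gt;;

  struct User : Users::Element
  {
      /* the element enters the dictionary under 'name' at construction */
      User(Users &amp;users, Name const &amp;name) : Users::Element(users, name) { }
  };

  void lookup(Users &amp;users)
  {
      users.with_element(Name("alice"),
          [&amp;] (User &amp;)  { /* found under the given key */ },
          [&amp;] ()        { /* no such element */ });
  }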

Genode Dictionary
class template
T, NAME
Dictionary

Template arguments

T typename

NAME typename

Header
repos/base/include/util/dictionary.h

8.8.5. ID space

Similar to how the Registry provides a safe wrapper around the list’s most common
use case, the Id_space covers a prominent use case for AVL trees in a safeguarded
fashion, namely the association of objects with IDs. Internally, IDs are kept in an AVL
tree but that implementation detail remains hidden from the API. In contrast to a bit
allocator, the ID space can be sparsely populated and does not need to be dimensioned.
The lifetime of an ID is bound to an Element object, which relieves the programmer
of manually allocating/deallocating IDs for objects.
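A sketch of the pattern, with a hypothetical Item type:

  struct Item
  {
      Genode::Id_space&lt;Item&gt;::Element const _elem;

      /* an ID is allocated at construction and freed at destruction */
      Item(Genode::Id_space&lt;Item&gt; &amp;space) : _elem(*this, space) { }
  };

  void example(Genode::Id_space&lt;Item&gt; &amp;space)
  {
      Item a { space }, b { space };

      space.for_each&lt;Item&gt;([&amp;] (Item &amp;) { /* visit each item */ });
  }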


Genode Id_space
class template
Noncopyable

T
Id_space

for_each(. . .)

Template argument

T typename

Header
repos/base/include/base/id_space.h

Genode Id_space for_each

Apply functor fn to each ID present in the ID space const method template

Template argument

ARG typename
Argument type passed to fn, must be convertible from T via a static_cast

Argument

fn auto const &

This function is called with the ID space locked. Hence, it is not possible to modify the
ID space from within fn.

8.8.6. Bit array

Bit arrays are typically used for the allocation of IDs. The Bit_array_base class
operates on a memory buffer specified to its constructor. The Bit_array class template
addresses the common case where the ID space is dimensioned at compile time. It
takes the number of bits as template argument and hosts the memory buffer for storing
the bits within the object.
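For illustration:

  Genode::Bit_array&lt;256&gt; _ids { };

  void example()
  {
      _ids.set(3, 2);                    /* mark IDs 3 and 4 as used */
      bool const used = _ids.get(3, 1);  /* true                     */
      _ids.clear(3, 2);                  /* release both IDs again   */
      (void)used;
  }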


Genode Bit_array_base
Allocator using bitmaps class

Bit_array_base

Bit_array_base(. . .)
get(. . .) : bool
set(. . .)
clear(. . .)

Header
repos/base/include/util/bit_array.h

Genode Bit_array_base Bit_array_base


constructor
Arguments

bits unsigned

ptr addr_t *
Pointer to array used as backing store for the bits. The array must be initialized
with zeros.

Exception

Invalid_bit_count

Genode Bit_array_base get


const method
Arguments

index addr_t

width addr_t

Return value
bool True if at least one bit is set between index until index + width - 1

Genode Bit_array_base set


method
Arguments

index addr_t const

width addr_t const


Genode Bit_array_base clear


method
Arguments

index addr_t const

width addr_t const

Genode Bit_array
Allocator using bitmaps class template

Bit_array_base

BITS
Bit_array

Bit_array()
Bit_array(. . .)

Template argument

BITS unsigned

Header
repos/base/include/util/bit_array.h

Genode Bit_array Bit_array


constructor
Argument

other Bit_array const &


8.9. Object lifetime management

8.9.1. Thread-safe weak pointers

Dangling pointers represent one of the most common causes of instability in software
written in C or C++. Such a situation happens when an object disappears while pointers
to the object are still in use. One way to solve this problem is to explicitly notify the
holders of those pointers about the disappearance of the object. But this would require
the object to keep references to those pointer holders, which, in turn, might disappear
as well. Consequently, this approach tends to become a complex solution, which is
prone to deadlocks or race conditions when multiple threads are involved.
The utilities provided by base/weak_ptr.h implement a more elegant pattern called
“weak pointers” to deal with such situations. An object that might disappear at any
time is represented by the Weak_object class template. It keeps track of a list of so-
called weak pointers pointing to the object. A weak pointer, in turn, holds privately
the pointer to the object alongside a validity flag. It cannot be used to dereference
the object. For accessing the actual object, a locked pointer must be created from a
weak pointer. If this creation succeeds, the object is guaranteed to be locked (not de-
structed) until the locked pointer gets destroyed. If the object no longer exists, the
locked pointer will become invalid. This condition can (and should) be detected via
the Locked_ptr::valid() function prior to dereferencing the pointer.
In the event a weak object gets destructed, all weak pointers that point to the object
are automatically invalidated. So a subsequent conversion into a locked pointer will
yield an invalid pointer, which can be detected (in contrast to a dangling pointer).
To use this mechanism, the destruction of a weak object must be deferred until
no locked pointer points to the object anymore. This is done by calling the function
Weak_object::lock_for_destruction() at the beginning of the destructor of the
to-be-destructed object. When this function returns, all weak pointers to the object will
have been invalidated. So it is safe to destruct and free the object.
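The overall pattern may be sketched as follows (the Object type is hypothetical):

  struct Object : Genode::Weak_object&lt;Object&gt;
  {
      void method() { /* ... */ }

      /* defer destruction until no locked pointer refers to the object */
      ~Object() { lock_for_destruction(); }
  };

  void access(Genode::Weak_ptr&lt;Object&gt; &amp;weak)
  {
      Genode::Locked_ptr&lt;Object&gt; locked { weak };

      if (locked.valid())
          locked-&gt;method();   /* the object stays locked within this scope */
  }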

Genode Weak_object_base
Type-agnostic base class of a weak object class

Weak_object_base

~Weak_object_base()
disassociate(. . .)
lock_for_destruction()

Header
repos/base/include/base/weak_ptr.h


Genode Weak_object_base disassociate


method
Argument

ptr Weak_ptr_base *

Genode Weak_object_base lock_for_destruction

Mark object as safe to be destructed method

This method must be called by the destructor of a weak object to defer the destruction
until no Locked_ptr is held to the object.

Genode Weak_object
Weak object class template

Weak_object_base

T
Weak_object

weak_ptr() : Weak_ptr<T>
weak_ptr_const() : Weak_ptr<T const> const

Template argument

T typename
Type of the derived class

Header
repos/base/include/base/weak_ptr.h

This class template must be inherited in order to equip an object with the weak-
pointer mechanism.


Genode Weak_object weak_ptr

Obtain a weak pointer referring to the weak object method

Return value
Weak_ptr<T>

Genode Weak_object weak_ptr_const

Const version of weak_ptr const method

Return value
Weak_ptr<T const> const

This function is useful in cases where the returned weak pointer is merely used for
comparison operations.

Genode Weak_ptr_base
Type-agnostic base class of a weak pointer class

List <Weak_ptr_base>

Weak_ptr_base

Weak_ptr_base()
~Weak_ptr_base()
operator =(. . .) : Weak_ptr_base &
operator ==(. . .) : bool

Header
repos/base/include/base/weak_ptr.h

This class implements the mechanics of the Weak_ptr class template. It is not used
directly.


Genode Weak_ptr_base Weak_ptr_base

Default constructor, produces invalid pointer constructor

Genode Weak_ptr_base operator =

Assignment operator method

Argument

other Weak_ptr_base const &

Return value
Weak_ptr_base &

Genode Weak_ptr_base operator ==

Test for equality const method

Argument

other Weak_ptr_base const &

Return value
bool

Genode Weak_ptr
Weak pointer to a given type class template

Weak_ptr_base

T
Weak_ptr

Weak_ptr()
Weak_ptr(. . .)
operator =(. . .) : Weak_ptr &

Template argument

T typename

Header
repos/base/include/base/weak_ptr.h


A weak pointer can be obtained from a weak object (an object that inherits the
Weak_object class template) and safely survives the lifetime of the associated weak
object. If the weak object disappears, all weak pointers referring to the object are au-
tomatically invalidated. To avoid race conditions between the destruction and use of
a weak object, a weak pointer cannot be dereferenced directly. To access the object, a
weak pointer must be turned into a locked pointer (Locked_ptr).
Genode Weak_ptr Weak_ptr

Default constructor creates invalid pointer constructor

Genode Weak_ptr Weak_ptr

Copy constructor constructor

Argument

other Weak_ptr<T> const &

Genode Weak_ptr operator =

Assignment operator method

Argument

other Weak_ptr<T> const &

Return value
Weak_ptr &

Genode Locked_ptr_base
class
Locked_ptr_base

Header
repos/base/include/base/weak_ptr.h


Genode Locked_ptr
Locked pointer class template

Locked_ptr_base

T
Locked_ptr

Locked_ptr(. . .)
operator -&gt;() : T *
operator *() : T &
valid() : bool

Template argument

T typename

Header
repos/base/include/base/weak_ptr.h

A locked pointer is constructed from a weak pointer. After construction, its validity
can (and should) be checked by calling the valid method. If the locked pointer is
valid, the pointed-to object is known to be locked until the locked pointer is destroyed.
During this time, the locked pointer can safely be dereferenced.
The typical pattern of using a locked pointer is to declare it as a local variable. Once
the execution leaves the scope of the variable, the locked pointer is destructed, which
unlocks the pointed-to weak object. It effectively serves as a lock guard.
Genode Locked_ptr Locked_ptr
constructor
Argument

weak_ptr Weak_ptr<T> &

Genode Locked_ptr operator -&gt;


method
Return value
T *

Genode Locked_ptr operator *


method
Return value
T &


Genode Locked_ptr valid


const method
Return value
bool True if the locked pointer is valid

Only if valid can the locked pointer be dereferenced. Otherwise, the attempt will
result in a null-pointer access.


8.9.2. Late and repeated object construction

The construct_at utility allows for the manual placement of objects without the need for a global placement new operation or type-specific new operators.

Genode
namespace
Utility for the manual placement of objects
Function
• Genode::construct_at(. . .) : T *
Header
repos/base/include/util/construct_at.h

Genode Genode::construct_at

Construct object of given type at a specific location function template

Template arguments

T typename
Object type
ARGS typename...

Arguments

at void *
Desired object location
args ARGS &&...
List of arguments for the object constructor

Return value
T * Typed object pointer

We use move semantics (ARGS &&) because otherwise the compiler would create a
temporary copy of all arguments that have a reference type and use a reference to this
copy instead of the original within this function.
There is a slight difference between the object that is constructed by this function and a
common object of the given type. If the destructor of the given type or of any base of the
given type is virtual, the vtable of the returned object references an empty delete(void
*) operator for that destructor. However, this shouldn’t be a problem as an object con-
structed by this function should never get destructed implicitly or through a delete
expression.
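As a minimal sketch, assuming a simple Point type, construct_at can place an object into an existing buffer like this:

  #include <util/construct_at.h>

  struct Point
  {
    int x, y;
    Point(int x, int y) : x(x), y(y) { }
  };

  /* statically allocated backing store, suitably aligned */
  alignas(Point) static char buffer[sizeof(Point)];

  Point *point = Genode::construct_at<Point>(buffer, 3, 4);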
The Genode framework promotes a programming style that largely avoids dynamic
memory allocations. For the most part, higher-level objects aggregate lower-level ob-
jects as class members. This functional programming style leads to robust programs but it poses a problem for programs that are expected to adapt their behaviour at run-
time. A way to selectively replace an aggregated object by a new version with up-
dated constructor arguments is desired. The Reconstructible utility solves this prob-
lem by wrapping an object of the type specified as template argument. In contrast to a regular object, a Reconstructible object can be re-constructed any number of
times by calling construct with the constructor arguments. It is accompanied with a
so-called Constructible utility, which leaves the wrapped object unconstructed until
construct is called the first time.
Genode Reconstructible
Place holder for an object to be repeatedly constructed and destructed class template

MT
Reconstructible

Reconstructible(. . .)
Reconstructible(. . .)
~Reconstructible()
construct(. . .)
destruct()
constructed() : bool
conditional(. . .)
operator →() : MT *
operator →() : MT const *
operator *() : MT &
operator *() : MT const &
print(. . .)

Template argument

MT typename
Type

Accessors
operator → MT const *
operator * MT const &

Header
repos/base/include/util/reconstructible.h

This class template acts as a smart pointer that refers to an object contained within the
smart pointer itself. The contained object may be repeatedly constructed and destruc-
ted while staying in the same place. This is useful for replacing aggregated members
during the lifetime of a compound object.
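A sketch of the typical use, assuming a hypothetical Connection type that is re-created whenever a new configuration arrives:

  #include <util/reconstructible.h>

  struct Connection
  {
    unsigned const id;
    Connection(unsigned id) : id(id) { }
  };

  Genode::Constructible<Connection> _connection { };

  void _handle_config(unsigned new_id)
  {
    /* destructs the old object (if constructed) and builds a new one in place */
    _connection.construct(new_id);

    if (_connection.constructed())
      Genode::log("connection id: ", _connection->id);
  }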


Genode Reconstructible Reconstructible

Constructor that omits the initial construction of the object constructor

Argument

- Lazy *

Genode Reconstructible Reconstructible


constructor
Argument

args auto &&...

The arguments are forwarded to the constructor of the embedded object.


Genode Reconstructible construct

Construct new object in place method

Argument

args auto &&...

If the Reconstructible already hosts a constructed object, the old object will be de-
structed first.
Genode Reconstructible destruct

Destruct object method

Genode Reconstructible constructed


const method
Return value
bool True if the volatile object contains a constructed object

Genode Reconstructible conditional

Construct or destruct volatile object according to condition method

Arguments

condition bool

args auto &&...


Genode Reconstructible operator →

Access contained object method

Return value
MT *

Genode Reconstructible operator *


method
Return value
MT &

Genode Reconstructible print


const method
Argument

out Output &

Genode Constructible
Reconstructible object that holds no initially constructed object class template

Reconstructible <MT>

MT
Constructible

Constructible()

Template argument

MT typename

Header
repos/base/include/util/reconstructible.h


8.10. Physical memory allocation

Throughout Genode, physical memory is allocated in the form of RAM dataspaces by using the Ram_allocator interface. This interface is implemented by the PD session
and thereby allows a component to use its RAM budget. The RAM dataspaces allo-
cated from the Ram_allocator interface may serve as backing store for fine-grained
component-local allocators such as the Heap (Section 8.11).
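For illustration, a sketch of allocating and releasing a RAM dataspace, assuming env refers to the component's Genode::Env:

  /* allocate a 4 KiB RAM dataspace from the component's own budget */
  Genode::Ram_dataspace_capability ds = env.ram().alloc(4096);

  /* ... attach 'ds' to the local address space and use it ... */

  env.ram().free(ds);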

Genode Ram_allocator
class
Interface

Ram_allocator

try_alloc(. . .) : Alloc_result
alloc(. . .) : Ram_dataspace_capability
free(. . .)
dataspace_size(. . .) : size_t

Header
repos/base/include/base/ram_allocator.h


Genode Ram_allocator try_alloc

Allocate RAM dataspace pure virtual method

Arguments

size size_t
Size of RAM dataspace
cache Cache
Selects the cacheability attributes of the memory; uncached memory is used, e. g., for DMA buffers
Default is CACHED

Return value
Alloc_result Capability to RAM dataspace, or error code of type Alloc_error

Genode Ram_allocator alloc

Allocate RAM dataspace method

Arguments

size size_t
Size of RAM dataspace
cache Cache
Selects the cacheability attributes of the memory; uncached memory is used, e. g., for DMA buffers
Default is CACHED

Exceptions

Out_of_ram
Out_of_caps
Denied

Return value
Ram_dataspace_capability Capability to new RAM dataspace

Genode Ram_allocator free

Free RAM dataspace pure virtual method

Argument

ds Ram_dataspace_capability
Dataspace capability as returned by alloc


Genode Ram_allocator dataspace_size


pure virtual const method
Argument

- Ram_dataspace_capability

Return value
size_t Size of dataspace in bytes

Constraining RAM allocations The operations of the Ram_allocator interface are the basis for Genode’s RAM accounting. In cases where a server needs to allocate RAM
on behalf of its clients, the interface provides a natural hook to track and constrain the
client-specific RAM usage. The Constrained_ram_allocator implements the inter-
face by forwarding all operations to another Ram_allocator instance while restricting
allocations to a quota limit. Exceeding the limit results in an Out_of_ram exception.
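A sketch of the typical server-side use; the concrete quota values are assumptions taken from a client's session request:

  Genode::Ram_quota_guard ram_guard { Genode::Ram_quota { 1024*1024 } };
  Genode::Cap_quota_guard cap_guard { Genode::Cap_quota { 10 } };

  /* allocations via 'constrained' are bounded by the guards above */
  Genode::Constrained_ram_allocator
    constrained { env.ram(), ram_guard, cap_guard };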

Genode Constrained_ram_allocator
Quota-bounds-checking wrapper of the Ram_allocator interface class

Ram_allocator

Constrained_ram_allocator

Constrained_ram_allocator(. . .)
try_alloc(. . .) : Alloc_result
free(. . .)
dataspace_size(. . .) : size_t

Header
repos/base/include/base/ram_allocator.h

Genode Constrained_ram_allocator Constrained_ram_allocator


constructor
Arguments

ram_alloc Ram_allocator &

ram_guard Ram_quota_guard &

cap_guard Cap_quota_guard &


8.11. Component-local allocators

Component-local allocators implement the generic Deallocator and Allocator interfaces. Allocators that operate on address ranges supplement the plain Allocator by
implementing the more specific Range_allocator interface.

Genode Deallocator
Deallocator interface class

Interface

Deallocator

free(. . .)
need_size_for_free() : bool

Header
repos/base/include/base/allocator.h

Genode Deallocator free

Free a previously allocated block pure virtual method

Arguments

addr void *

size size_t

Genode Deallocator need_size_for_free


pure virtual const method
Return value
bool True if the size argument of free is required

The generic Allocator interface requires the caller of free to supply a valid size ar-
gument but not all implementations make use of this argument. If this method returns
false, it is safe to call free with an invalid size.
Allocators that rely on the size argument must not be used for constructing ob-
jects whose constructors may throw exceptions. See the documentation of operator
delete(void *, Allocator *) below for more details.


Genode Allocator
class
Deallocator

Allocator

~Allocator()
try_alloc(. . .) : Alloc_result
consumed() : size_t
overhead(. . .) : size_t
throw_alloc_error(. . .)
alloc(. . .) : void *

Header
repos/base/include/base/allocator.h

Genode Allocator try_alloc

Allocate block pure virtual method


Argument

size size_t
Block size to allocate

Return value
Alloc_result

Genode Allocator consumed


virtual const method
Return value
size_t Total amount of backing store consumed by the allocator

Genode Allocator overhead


pure virtual const method
Argument

size size_t

Return value
size_t Meta-data overhead per block


Genode Allocator throw_alloc_error

Raise exception according to the error value class function

Argument

error Alloc_error

Genode Allocator alloc

Allocate block and signal error as an exception method

Argument

size size_t
Block size to allocate

Exceptions

Out_of_ram
Out_of_caps
Denied

Return value
void * Pointer to the new block

Genode Range_allocator
class
Allocator

Range_allocator

~Range_allocator()
add_range(. . .) : Range_result
remove_range(. . .) : Range_result
alloc_aligned(. . .) : Alloc_result
alloc_aligned(. . .) : Alloc_result
alloc_addr(. . .) : Alloc_result
free(. . .)
free(. . .)
avail() : size_t
valid_addr(. . .) : bool

Header
repos/base/include/base/allocator.h


Genode Range_allocator add_range

Add free address range to allocator pure virtual method

Arguments

base addr_t

size size_t

Return value
Range_result

Genode Range_allocator remove_range

Remove address range from allocator pure virtual method

Arguments

base addr_t

size size_t

Return value
Range_result

Genode Range_allocator alloc_aligned

Allocate block pure virtual method


Arguments

size size_t
Size of new block
align unsigned
Alignment of new block specified as the power of two
range Range
Address-range constraint for the allocation

Return value
Alloc_result


Genode Range_allocator alloc_aligned

Allocate block without constraining the address range method

Arguments

size size_t

align unsigned

Return value
Alloc_result

Genode Range_allocator alloc_addr

Allocate block at address pure virtual method


Arguments

size size_t
Size of new block
addr addr_t
Desired address of block

Return value
Alloc_result

Genode Range_allocator free

Free a previously allocated block pure virtual method

Argument

addr void *

NOTE: We have to declare the Allocator::free(void *) method here as well to


make the compiler happy. Otherwise the C++ overload resolution would not find
Allocator::free(void *).
Genode Range_allocator avail
pure virtual const method
Return value
size_t The sum of available memory

Note that the returned value is not necessarily allocatable because the memory may
be fragmented.


Genode Range_allocator valid_addr

Check if address is inside an allocated block pure virtual const method


Argument

addr addr_t
Address to check

Return value
bool True if address is inside an allocated block, false otherwise


8.11.1. Slab allocator

The Slab allocator is tailored for allocating small fixed-size memory blocks from a big
chunk of memory. For the common use case of using a slab allocator for a certain type
rather than for a known byte size, there exists a typed slab allocator as a front end of
Slab.
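A sketch of the typed front end, assuming a hypothetical My_item type and an existing heap serving as backing store for the slab blocks:

  struct My_item
  {
    unsigned value;
    My_item(unsigned v) : value(v) { }
  };

  /* slab blocks of 4 KiB are allocated from 'heap' on demand */
  Genode::Tslab<My_item, 4096> slab { heap };

  My_item *item = new (slab) My_item(42);
  Genode::destroy(slab, item);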
Genode Slab
Slab allocator class

Allocator

Slab

Slab(. . .)
~Slab()
overhead_per_block() : size_t
overhead_per_entry() : size_t
avail_entries() : size_t
insert_sb(. . .)
any_used_elem() : void *
free_empty_blocks()
try_alloc(. . .) : Alloc_result
free(. . .)
consumed() : size_t
overhead(. . .) : size_t
need_size_for_free() : bool

Accessors
consumed size_t
need_size_for_free bool

Header
repos/base/include/base/slab.h


Genode Slab Slab


constructor
Arguments

slab_size size_t

block_size size_t

initial_sb void *

backing_store Allocator *
Default is 0

Exceptions

Out_of_ram
Out_of_caps
Allocator::Denied Failed to obtain initial slab block

At construction time, there exists one initial slab block that is used for the first couple
of allocations, especially for the allocation of the second slab block.
Genode Slab overhead_per_block
class function
Return value
size_t

Genode Slab overhead_per_entry


class function
Return value
size_t

Genode Slab avail_entries


const method
Return value
size_t Number of unused slab entries

Genode Slab insert_sb

Add new slab block as backing store method

Argument

ptr void *

The specified ptr has to point to a buffer with the size of one slab block.


Genode Slab any_used_elem


method
Return value
void * A used slab element, or nullptr if empty

Genode Slab free_empty_blocks

Free memory of empty slab blocks method

Genode Slab try_alloc

Allocate slab entry method

Argument

size size_t

Return value
Alloc_result

The size parameter is ignored as only slab entries with preconfigured slab-entry size
are allocated.
Genode Tslab
class template
Slab T,
BLOCK_SIZE,
MIN_SLABS_PER_BLOCK
Tslab

static_assert(. . .)
Tslab(. . .)
Tslab(. . .)
first_object() : T *

Template arguments

T typename

BLOCK_SIZE size_t

MIN_SLABS_PER_BLOCK unsigned
Default is 8

Header
repos/base/include/base/tslab.h


Genode Tslab static_assert


constructor
Argument

- BLOCK_SIZE

Genode Tslab Tslab


constructor
Arguments

backing_store Allocator *

initial_sb void *
Default is 0

Genode Tslab Tslab


constructor
Arguments

backing_store Allocator &

initial_sb void *
Default is 0

Genode Tslab first_object


method
Return value
T *


8.11.2. AVL-tree-based best-fit allocator

In contrast to the rather limited slab allocators, Allocator_avl allows for arbitrary
allocations from a list of address regions. It implements a best-fit allocation strategy,
supports arbitrary alignments, and allocations at specified addresses.
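A sketch of the basic use, assuming an existing heap serves as the meta-data allocator; the address window is an arbitrary example:

  Genode::Allocator_avl avl { &heap };

  /* hand the managed address window to the allocator (result ignored here) */
  (void)avl.add_range(0x2000, 0x10000);

  /* request a 4-KiB block aligned at a 4-KiB boundary (align as power of two) */
  Genode::Allocator::Alloc_result const result = avl.alloc_aligned(4096, 12);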

Genode Allocator_avl_base
Interface of AVL-tree-based allocator class

Range_allocator

Allocator_avl_base

_block_tree() : Avl_tree<Block> const &


_revert_allocations_and_ranges()
_revert_unused_ranges() : bool
_find_by_address(. . .) : Block *
Allocator_avl_base(. . .)
~Allocator_avl_base()
any_block_addr(. . .) : bool
print(. . .)
add_range(. . .) : Range_result
remove_range(. . .) : Range_result
alloc_aligned(. . .) : Alloc_result
alloc_addr(. . .) : Alloc_result
free(. . .)
avail() : size_t
valid_addr(. . .) : bool

Accessors
_block_tree Avl_tree<Block> const &
avail size_t

Header
repos/base/include/base/allocator_avl.h

Genode Allocator_avl_base _revert_allocations_and_ranges

Clean up the allocator and detect dangling allocations method

This method is called at the destruction time of the allocator. It makes sure that the
allocator instance releases all memory obtained from the meta-data allocator.


Genode Allocator_avl_base _revert_unused_ranges


method
Return value
bool

Genode Allocator_avl_base _find_by_address

Find block by specified address const method

Arguments

addr addr_t

size size_t
Default is 0
check_overlap bool
Default is 0

Return value
Block *

Genode Allocator_avl_base Allocator_avl_base


constructor
Arguments

md_alloc Allocator *

md_entry_size size_t

This constructor can only be called from a derived class that provides an allocator for
block meta-data entries. This way, we can attach custom information to block meta
data.
Genode Allocator_avl_base any_block_addr
method
Argument

out_addr addr_t *
Result that contains address of block

Return value
bool True if block was found or false if there is no block available

If no block was found, out_addr is set to zero.


Genode Allocator_avl_base print


const method
Argument

out Output &

Genode Allocator_avl_tpl
AVL-based allocator with custom meta data attached to each block. class template

Allocator_avl_base
BMDT,
SLAB_BLOCK_SIZE
Allocator_avl_tpl

Allocator_avl_tpl(. . .)
~Allocator_avl_tpl()
slab_block_size() : size_t
metadata(. . .)
construct_metadata(. . .)
metadata(. . .) : BMDT*
add_range(. . .) : Range_result
apply_any(. . .) : bool

Template arguments

BMDT typename
Block meta-data type
SLAB_BLOCK_SIZE unsigned

Header
repos/base/include/base/allocator_avl.h


Genode Allocator_avl_tpl Allocator_avl_tpl


constructor
Argument

metadata_chunk_alloc Allocator *
Pointer to allocator used to allocate meta-data blocks. If set to
0, use ourself for allocating our meta-data blocks. This works
only if the managed memory is completely accessible by the
allocator.

Genode Allocator_avl_tpl slab_block_size


class function
Return value
size_t Size of slab blocks used for meta data

Genode Allocator_avl_tpl metadata

Assign custom meta data to block at specified address const method

Arguments

addr void *

bmd BMDT

Exception

Assign_metadata_failed

Genode Allocator_avl_tpl construct_metadata

Construct meta-data object in place method template

Template argument

ARGS typename...
Arguments passed to the meta-data constructor

Arguments

addr void *

args ARGS &&...


Genode Allocator_avl_tpl metadata


const method
Argument

addr void *

Return value
BMDT* Meta data that was attached to block at specified address

Genode Allocator_avl_tpl apply_any

Apply functor fn to the metadata of an arbitrary member of the allocator method

This method is provided for destructing each member of the allocator. Calling the method repeatedly without removing or inserting members will produce the same member.
Argument

fn auto const &

Return value
bool


8.11.3. Heap and sliced heap

Genode Heap
Heap that uses dataspaces as backing store class

Allocator

Heap

Heap(. . .)
Heap(. . .)
~Heap()
quota_limit(. . .) : int
reassign_resources(. . .)
for_each_region(. . .)
try_alloc(. . .) : Alloc_result
free(. . .)
consumed() : size_t
overhead(. . .) : size_t
need_size_for_free() : bool

Accessors
consumed size_t
need_size_for_free bool

Header
repos/base/include/base/heap.h

The heap class provides an allocator that uses a list of dataspaces of a RAM allocator
as backing store. One dataspace may be used for holding multiple blocks.
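A minimal sketch of constructing and using a heap, assuming env is the component's Genode::Env:

  Genode::Heap heap { env.ram(), env.rm() };

  void *ptr = heap.alloc(256);   /* may throw Out_of_ram */

  /* ... */

  heap.free(ptr, 256);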


Genode Heap Heap


constructor
Arguments

ram_allocator Ram_allocator *

region_map Region_map *

quota_limit size_t
Default is UNLIMITED
static_addr void *
Default is 0
static_size size_t
Default is 0

Genode Heap Heap


constructor
Arguments

ram Ram_allocator &

rm Region_map &

Genode Heap quota_limit

Reconfigure quota limit method

Argument

new_quota_limit size_t

Return value
int Negative error code if new quota limit is higher than currently used quota.

Genode Heap reassign_resources

Re-assign RAM allocator and region map method

Arguments

ram Ram_allocator *

rm Region_map *


Genode Heap for_each_region

Call fn with the start and size of each backing-store region const method

Argument

fn auto const &

Genode Sliced_heap
Heap that allocates each block at a separate dataspace class

Allocator

Sliced_heap

meta_data_size() : size_t
Sliced_heap(. . .)
~Sliced_heap()
try_alloc(. . .) : Alloc_result
free(. . .)
consumed() : size_t
overhead(. . .) : size_t
need_size_for_free() : bool

Accessors
consumed size_t
need_size_for_free bool

Header
repos/base/include/base/heap.h


Genode Sliced_heap meta_data_size


class function
Return value
size_t Size of header prepended to each allocated block in bytes

Genode Sliced_heap Sliced_heap


constructor
Arguments

ram_alloc Ram_allocator &

region_map Region_map &


8.11.4. Bit allocator

Genode Bit_allocator
class template
BITS
Bit_allocator

Bit_allocator()
Bit_allocator(. . .)
alloc(. . .) : addr_t
alloc_addr(. . .)
free(. . .)

Template argument

BITS unsigned

Header
repos/base/include/util/bit_allocator.h

Genode Bit_allocator Bit_allocator


constructor
Argument

other Bit_allocator const &

Genode Bit_allocator alloc

Allocate block of bits method


Argument

num_log2 size_t const


2-based logarithm of size of block
Default is 0

Exception

Array::Out_of_indices

Return value
addr_t

The requested block is allocated at the lowest available index in the bit array.
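A small sketch of managing, e. g., a space of 256 identifiers:

  Genode::Bit_allocator<256> ids { };

  Genode::addr_t const id    = ids.alloc();    /* single bit at lowest free index */
  Genode::addr_t const block = ids.alloc(4);   /* contiguous block of 2^4 bits */

  ids.free(id);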


Genode Bit_allocator alloc_addr

Allocate specific block of bits method

Arguments

bit_start addr_t const

num_log2 size_t const


2-based logarithm of size of block
Default is 0

Exceptions

Range_conflict
Array::Invalid_index_access

Genode Bit_allocator free


method
Arguments

bit_start addr_t const

num_log2 size_t const


Default is 0


8.12. String processing

8.12.1. Basic string operations

There exists a small set of string-manipulation operations as global functions in the Genode namespace.

Genode
namespace
String utilities
Type

Byte_range_ptr is a subtype of Noncopyable
Data structure for describing a mutable byte buffer. The type is intended to be used as Byte_range_ptr const & argument.

Functions
• strlen(. . .) : size_t
• strcmp(. . .) : int
• memmove(. . .) : void *
• memcpy(. . .) : void *
• copy_cstring(. . .)
• memcmp(. . .) : int
• memset(. . .) : void *
• digit(. . .) : int
• is_letter(. . .) : bool
• is_digit(. . .) : bool
• is_whitespace(. . .) : bool
• ascii_to_unsigned(. . .) : size_t
• ascii_to(. . .) : size_t
• ascii_to(. . .) : size_t
• ascii_to(. . .) : size_t
• ascii_to(. . .) : size_t
• ascii_to(. . .) : size_t
• ascii_to(. . .) : size_t
• ascii_to_signed(. . .) : size_t
• ascii_to(. . .) : size_t
• ascii_to(. . .) : size_t
• ascii_to(. . .) : size_t
• ascii_to(. . .) : size_t
• unpack_string(. . .) : size_t

Header
repos/base/include/util/string.h

Genode strlen

global function


Argument

s const char *

Return value
size_t

Genode strcmp

Compare two strings global function

Arguments

s1 const char *

s2 const char *

len size_t
Maximum number of characters to compare, default is unlimited
Default is ~0UL

Return value
int 0 if both strings are equal, or a positive number if s1 is higher than s2, or a negative
number if s1 is lower than s2

Genode memmove

Copy memory buffer to a potentially overlapping destination buffer global function
Arguments

dst void *
Destination memory block
src const void *
Source memory block
size size_t
Number of bytes to move

Return value
void * Pointer to destination memory block


Genode memcpy

Copy memory buffer to a non-overlapping destination buffer global function

Arguments

dst void *
Destination memory block
src const void *
Source memory block
size size_t
Number of bytes to copy

Return value
void * Pointer to destination memory block

Genode copy_cstring

Copy string global function

Arguments

dst char *
Destination buffer
src const char *
Buffer holding the null-terminated source string
size size_t
Maximum number of characters to copy

In contrast to the POSIX strncpy function, copy_cstring always produces a null-terminated string in the dst buffer if the size argument is greater than 0.


Genode memcmp

Compare memory blocks global function

Arguments

p0 const void *

p1 const void *

size size_t

Return value
int 0 if both memory blocks are equal, or a negative number if p0 is less than p1, or a
positive number if p0 is greater than p1

Genode memset
global function
Arguments

dst void *

i uint8_t

size size_t

Return value
void *

Genode digit

Convert ASCII character to digit global function

Arguments

c char

hex bool
Consider hexadecimals
Default is false

Return value
int Digit or -1 on error


Genode is_letter
global function
Argument

c char

Return value
bool True if character is a letter

Genode is_digit
global function
Arguments

c char

hex bool
Default is false

Return value
bool True if character is a digit

Genode is_whitespace
global function
Argument

c char

Return value
bool True if character is whitespace


Genode ascii_to_unsigned

Read unsigned long value from string function template

Template argument

T typename

Arguments

s const char *
Source string
result T &
Destination variable
base uint8_t
Integer base

Return value
size_t Number of consumed characters

If the base argument is 0, the integer base is detected based on the characters in front of
the number. If the number is prefixed with “0x”, a base of 16 is used, otherwise a base
of 10.
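A small sketch, assuming the base-detection behavior described above applies to the unsigned-long overload:

  unsigned long value = 0;
  Genode::size_t const consumed = Genode::ascii_to("0x10", value);

  /* value is 16, consumed is 4 */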
Genode ascii_to

Read boolean value from string global function

Arguments

s char const *

result bool &

Return value
size_t Number of consumed characters

Genode ascii_to

Read unsigned char value from string global function

Arguments

s const char *

result unsigned char &

Return value
size_t Number of consumed characters


Genode ascii_to

Read unsigned short value from string global function

Arguments

s const char *

result unsigned short &

Return value
size_t Number of consumed characters

Genode ascii_to

Read unsigned long value from string global function

Arguments

s const char *

result unsigned long &

Return value
size_t Number of consumed characters

Genode ascii_to

Read unsigned long long value from string global function

Arguments

s const char *

result unsigned long long &

Return value
size_t Number of consumed characters

Genode ascii_to

Read unsigned int value from string global function

Arguments

s const char *

result unsigned int &

Return value
size_t Number of consumed characters


Genode ascii_to_signed

Read signed value from string function template

Template argument

T typename

Arguments

s const char *

result T &

Return value
size_t Number of consumed characters

Genode ascii_to

Read signed long value from string global function

Arguments

s const char *

result long &

Return value
size_t Number of consumed characters

Genode ascii_to

Read signed integer value from string global function

Arguments

s const char *

result int &

Return value
size_t Number of consumed characters


Genode ascii_to

Read Number_of_bytes value from string and handle the size suffixes global function
Arguments

s const char *

result Number_of_bytes &

Return value
size_t Number of consumed characters

This function scales the resulting size value according to the suffixes for G (2^30), M (2^20), and K (2^10) if present.
Genode ascii_to

Read double float value from string global function

Arguments

s const char *

result double &

Return value
size_t Number of consumed characters

Genode unpack_string

Unpack quoted string global function

Arguments

src const char *


Source string including the quotation marks ("...")
dst char *
Destination buffer
dst_len size_t const

Return value
size_t Number of characters or ~0UL on error


To cover the common case of embedding a string buffer as a member variable in a class, there exists the String class template, which alleviates the need for C-style arrays
in such situations.
The String constructor takes any number of arguments, which will appear concate-
nated in the constructed String. Each argument must be printable as explained in
Section 8.12.3.
Genode String
Buffer that contains a null-terminated string class template

CAPACITY
String

size() : size_t
String()
String(. . .)
String(. . .)
String(. . .)
length() : size_t
capacity() : size_t
valid() : bool
string() : char const *
operator ==(. . .) : bool
operator !=(. . .) : bool
operator ==(. . .) : bool
operator !=(. . .) : bool
operator >(. . .) : bool
print(. . .)

Template argument

CAPACITY size_t
Buffer size including the terminating zero, must be higher than zero

Accessors
valid bool
string char const *

Header
repos/base/include/util/string.h


Genode String size


class function
Return value
size_t

Genode String String


constructor template
Template argument

T typename

Arguments

head T const &

tail auto &&...

The constructor accepts a non-zero number of arguments, which are concatenated in the resulting String object. In order to generate the textual representation of the arguments, the argument types must support the Output interface, e. g., by providing a print method.
If the textual representation of the supplied arguments exceeds CAPACITY, the resulting
string gets truncated. The caller may check for this condition by evaluating the length
of the constructed String. If length equals CAPACITY, the string may fit perfectly into
the buffer or may have been truncated. In general, it would be safe to assume the latter.
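A small sketch of this idiom, including the truncation check just described:

  using Name = Genode::String<64>;

  Name const name("client-", 42);   /* holds "client-42" */

  if (name.length() == Name::capacity())
    Genode::warning("name possibly truncated: ", name);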
Genode String String
constructor
Argument

cstr char const *

Overload for the common case of constructing a String from a string literal.


Genode String String

Copy constructor constructor template

Template argument

N unsigned

Argument

other String<N> const &

Genode String length


const method
Return value
size_t Length of string, including the terminating null character

Genode String capacity


class function
Return value
size_t

Genode String operator ==


const method
Argument

other char const *

Return value
bool

Genode String operator !=


const method
Argument

other char const *

Return value
bool


Genode String operator ==


const method template
Template argument

OTHER_CAPACITY size_t

Argument

other String<OTHER_CAPACITY> const &

Return value
bool

Genode String operator !=


const method template
Template argument

OTHER_CAPACITY size_t

Argument

other String<OTHER_CAPACITY> const &

Return value
bool

Genode String operator >


const method template
Template argument

N size_t

Argument

other String<N> const &

Return value
bool

Genode String print


const method
Argument

out Output &

There exist a number of printable helper classes that cover typical use cases for pro-
ducing formatted text output.


Number_of_bytes wraps an integer value and produces an output suffixed with K, M, or G whenever the value is a multiple of a kilobyte, megabyte, or gigabyte.

Cstring wraps a plain C character array to make it printable. There exist two con-
structors. The constructor with one argument expects a null-terminated character
array. The other constructor takes the number of to-be-printed characters as a second argument.

Hex wraps an integer value and produces hexadecimal output.

Char produces the character corresponding to the ASCII value of the wrapped integer argument.
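For example, the helpers can be combined with the log function as follows:

  char buf[8] = { 'a', 'b', 'c', 'd' };

  Genode::log("size: ", Genode::Number_of_bytes(64*1024));   /* prints 64K */
  Genode::log("hex:  ", Genode::Hex(255));                   /* prints 0xff */
  Genode::log("str:  ", Genode::Cstring(buf, 3));            /* prints abc */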

To improve safety in situations that require raw byte-wise access of memory, the two
utilities Byte_range_ptr and Const_byte_range_ptr hold a pointer together with a
size limit in bytes. They should be used instead of traditional C-style pairs of pointer
and size arguments to equip each pointer with its legitimate range of access. Note that
the utilities are meant for transient arguments only. They are deliberately not copyable
to prevent the accidental storing of the embedded pointer values.


8.12.2. Tokenizing

For parsing structured text such as argument strings or XML, simple tokenizing sup-
port is provided via the Token class.

Genode Token
Token class template

SCANNER_POLICY
Token

Token(. . .)
start() : char *
len() : size_t
type() : Type
string(. . .)
valid() : bool

Template argument

SCANNER_POLICY typename
Policy that defines the way of token scanning

Accessors
len size_t
type Type

Header
repos/base/include/util/token.h

This class is used to group characters of a string which belong to one syntactical token of the types number, identifier, string, whitespace, or another single character.
See Scanner_policy_identifier_with_underline for an example scanner policy.
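A small sketch, assuming the IDENT token type provided by this scanner policy:

  typedef Genode::Token<Genode::Scanner_policy_identifier_with_underline> Token;

  Token const token("max_clients 16");

  if (token.valid() && token.type() == Token::IDENT)
    Genode::log("identifier of length ", token.len());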
Genode Token Token
constructor
Arguments

s const char *
Start of string to construct a token from
Default is 0
max_len size_t
Maximum token length
Default is ~0UL

The max_len argument is useful for processing character arrays that are not null-terminated.
Genode Token start

Accessors const method


Return value
char *

Genode Token string


const method
Arguments

dst char *

max_len size_t

Genode Token valid


const method
Return value
bool True if token is valid


8.12.3. Diagnostic output

To enable components to produce diagnostic output like errors, warnings, and log mes-
sages, Genode offers a simple Output interface for sequentially writing single charac-
ters or character sequences.

Genode Output
Interface for textual output class

Interface

Output

out_char(. . .)
out_string(. . .)
out_args(. . .)
out_args(. . .)

Header
repos/base/include/base/output.h

Genode Output out_char

Output single character pure virtual method

Argument

- char

Genode Output out_string

Output string virtual method

Arguments

str char const *

n size_t
Maximum number of characters to output
Default is ~0UL

The output stops on the first occurrence of a null character in the string or after n char-
acters.
The default implementation uses out_char. This method may be overridden by the
backend for improving efficiency.


Genode Output out_args

Helper for the sequential output of a variable list of arguments class function

Arguments

output Output &

head auto &&

tail auto &&...

Genode Output out_args


class function
Arguments

output Output &

last auto &&

Functions for generating output for different types are named print and take an
Output & as first argument. The second argument is a const & to the value to print.
Overloads of the print function for commonly used basic types are provided. Fur-
thermore, there is a function template that is used if none of the type-specific over-
loads match. This function template expects the argument to be an object with a print
method. In contrast to a plain print function overload, such a method is able to incor-
porate private state into the output.


Genode
namespace
Interface for textual output
Functions
• print(. . .)
• print(. . .)
• print(. . .)
• print(. . .)
• print(. . .)
• print(. . .)
• print(. . .)
• print(. . .)
• print(. . .)
• print(. . .)
• print(. . .)
• print(. . .)
• print(. . .)
• print(. . .)
• print(. . .)
• print(. . .)
• print(. . .)
• print(. . .)
• print(. . .)

Header
repos/base/include/base/output.h

Genode print

Print null-terminated string global function

Arguments

output Output &

- char const *

Genode print

Disallow printing non-const character buffers global function

Arguments

- Output &

- char *


For char * types, it is unclear whether the argument should be printed as a pointer or
a string. The caller must resolve this ambiguity by either casting the argument to void *
or wrapping it in a Cstring object.
Genode print

Print pointer value global function

Arguments

output Output &

- void const *

Genode print

Print arbitrary pointer types global function

Arguments

output Output &

ptr auto *

This function template takes precedence over the one that takes a constant object refer-
ence as argument.
Genode print

Print unsigned long value global function

Arguments

output Output &

long unsigned

Genode print

Print unsigned long long value global function

Arguments

output Output &

long unsigned long


Genode print
global function
Arguments

o Output &

v unsigned char

Genode print
global function
Arguments

o Output &

v unsigned short

Genode print
global function
Arguments

o Output &

v unsigned int

Genode print

Print signed long value global function

Arguments

output Output &

- long

Genode print

Print signed long long value global function

Arguments

output Output &

long long

Genode print
global function
Arguments

o Output &

v char


Genode print
global function
Arguments

o Output &

v short

Genode print
global function
Arguments

o Output &

v int

Genode print

Print bool value global function


Arguments

output Output &

value bool

Genode print

Print single-precision float global function

Arguments

output Output &

- float

Genode print

Print double-precision float global function

Arguments

output Output &

- double


Genode print

Print information about object obj global function

Arguments

output Output &

obj auto const &

The object type must provide a const print(Output &) method that produces the tex-
tual representation of the object.
In contrast to overloads of the Genode::print function, the T::print method is able to
access object-internal state, which can thereby be incorporated into the textual output.
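The following sketch shows a type made printable this way:

  struct Point
  {
    int x, y;

    void print(Genode::Output &out) const
    {
      Genode::print(out, "(", x, ",", y, ")");
    }
  };

  Genode::log("point: ", Point { 3, 4 });   /* logs "point: (3,4)" */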
Genode print

Print a variable number of arguments global function

Arguments

output Output &

head auto const &

tail auto &&...

The component’s execution environment provides an implementation of the Output interface that targets a LOG session. This output back end is offered to the component
in the form of the log, warning, error, and trace functions that accept an arbitrary
number of arguments that are printed in a concatenated fashion. Each message is im-
plicitly finalized with a newline character.


Genode
namespace
LOG output functions
Functions
• log(. . .)
• warning(. . .)
• error(. . .)
• raw(. . .)
• trace(. . .)

Header
repos/base/include/base/log.h

Genode log

Write args as a regular message to the log global function

Argument

args auto &&...

Genode warning

Write args as a warning message to the log global function

Argument

args auto &&...

The message is automatically prefixed with "Warning: ". Please refer to the description
of the error function regarding the convention of formatting error/warning messages.
Genode error

Write args as an error message to the log global function

Argument

args auto &&...

The message is automatically prefixed with "Error: ". Hence, the message argument
does not need to additionally state that it is an error message. By convention, the actual
message should be brief, starting with a lower-case character.
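For example:

  Genode::log("connection established");
  Genode::warning("deprecated attribute ignored");
  Genode::error("failed to obtain session");   /* appears as "Error: failed ..." */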


Genode raw

Write args directly via the kernel (i. e., kernel debugger) global function

Argument

args auto &&...

This function is intended for temporary debugging purposes only.


Genode trace

Write args to the trace buffer if tracing is enabled global function

Argument

args auto && ...

The message is prefixed with a timestamp value.

8.12.4. Obtaining backtraces

As a debugging aid, it is sometimes insightful to obtain call graphs of executed code. Such backtraces can be generated via the utilities provided by os/backtrace.h. As a pre-
condition for getting useful output, make sure to have compiled your executable binary
with frame pointers. By adding the following line to the etc/build.conf file of the build
directory, one can instruct the build system to produce binaries in the needed form.

CC_OPT += -fno-omit-frame-pointer

The general mechanism for generating a backtrace has the form of the printable
Backtrace class. An object of this type can be passed to any of the log, warning, error,
or trace functions to output the backtrace of the point of call. As a convenient shortcut
for the common case of printing a backtrace to the log, one can call the backtrace()
function instead. The output looks like in the following example.

[init -> test-log] backtrace "ep"
[init -> test-log] 401ff89c 10014f4
[init -> test-log] 401ff90c 1001637
[init -> test-log] 401ff94c 10006e2
[init -> test-log] 401ffaec 5008aa9f
[init -> test-log] 401ffc6c 50048dbb
[init -> test-log] 401ffc8c 5004be41
[init -> test-log] 401ffcdc 5003a04d
[init -> test-log] 401ffe6c 50065218
[init -> test-log] 401fff7c 50079d54


The first line contains the thread name of the caller, which is followed by one line
per stack frame. The first number is the stack address whereas the second number is the
return address of the stack frame, both given in hexadecimal format. The latter can
be correlated to source code by inspecting the disassembled binary using the objdump
utility. In practice, however, one may prefer the convenience of the tool/backtrace utility
to streamline this procedure.

1. Execute the backtrace tool with the debug version of your executable as argument.
For example, after having observed a backtrace printed by the test-log program,
one may issue:

build/x86_64$ ../../tool/backtrace debug/test-log

2. Once started, the tool waits for you pasting the logged backtrace into the terminal.
For each stack frame, it then prints the corresponding function name and source-
code location.


8.12.5. Unicode handling

The string-handling utilities described in Section 8.12.1 operate on ASCII-encoded char-


acter strings where each character is encoded as one byte. It goes without saying that
ASCII is unsuitable for user-facing components that are ultimately expected to sup-
port the display of international characters. The Utf8_ptr utility accommodates such
components with an easy way to extract a sequence of Unicode codepoints from a UTF-8-encoded string.

Genode Utf8_ptr
Wrapper around a char const pointer that is able to iterate over UTF-8 characters class

Utf8_ptr

Utf8_ptr(. . .)
Utf8_ptr(. . .)
operator =(. . .) : Utf8_ptr &
next() : Utf8_ptr const
complete() : bool
codepoint() : Codepoint
length() : unsigned

Header
repos/os/include/util/utf8.h

Note that this class is not a smart pointer. It is suffixed with _ptr to highlight the
fact that it stores a pointer while being copyable. Hence, objects of this type must be
handled with the same caution as pointers.
Genode Utf8_ptr Utf8_ptr
constructor
Argument

utf8 char const *


Null-terminated buffer containing UTF-8-encoded text

Genode Utf8_ptr Utf8_ptr


constructor
Argument

other Utf8_ptr const &


Genode Utf8_ptr operator =


method
Argument

other Utf8_ptr const &

Return value
Utf8_ptr &

Genode Utf8_ptr next


const method
Return value
Utf8_ptr const Next UTF-8 character

Genode Utf8_ptr complete


const method
Return value
bool True if string contains a complete UTF-8 sequence

This method solely checks for a premature truncation of the string. It does not check
the validity of the UTF-8 sequence. The success of the complete method is a precondition
for the correct operation of the next or codepoint methods. A complete sequence may
still yield an invalid Codepoint.
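A small sketch of iterating over a UTF-8 string; 'input' is assumed to be a null-terminated UTF-8 buffer:

  /* count the Unicode codepoints of 'input' */
  unsigned count = 0;
  for (Genode::Utf8_ptr utf8(input); utf8.complete(); utf8 = utf8.next())
    count++;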
Genode Utf8_ptr codepoint
const method
Return value
Codepoint Character as Unicode codepoint

Genode Utf8_ptr length


const method
Return value
unsigned Length of UTF-8 sequence in bytes


8.13. Multi-threading and synchronization

8.13.1. Threads

A thread is created by constructing an object of a class inherited from Thread. The new
thread starts its execution at the entry method. Thereby, each thread runs in the con-
text of its object and can access context-specific information by accessing its member
variables. This largely alleviates the need for a thread-local storage (TLS) mechanism.
Threads use a statically allocated stack, which is dimensioned according to the corre-
sponding constructor argument.

Genode Thread
Concurrent flow of control class

Thread

Thread(. . .)
Thread(. . .)
~Thread()
entry()
start()
name() : Name
alloc_secondary_stack(. . .) : void*
free_secondary_stack(. . .)
cap() : Thread_capability
native_thread() : Native_thread &
stack_top() : void *
stack_base() : void *
stack_virtual_size() : size_t
stack_area_virtual_base() : addr_t
stack_area_virtual_size() : size_t
myself() : Thread *
mystack() : Stack_info
stack_size(. . .)
utcb() : Native_utcb *
join()
trace(. . .)
trace_captured(. . .) : bool
trace(. . .)
trace(. . .)
affinity() : Affinity::Location

Header
repos/base/include/base/thread.h


A Thread object corresponds to a physical thread. The execution starts at the entry()
method as soon as start() is called.
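The following sketch shows the pattern, assuming env is the component's Genode::Env and the stack size is an arbitrary example:

  struct Worker : Genode::Thread
  {
    Worker(Genode::Env &env)
    : Genode::Thread(env, "worker", 16*1024) { }

    void entry() override
    {
      Genode::log("worker thread running");
    }
  };

  Worker worker(env);
  worker.start();
  worker.join();   /* block until 'entry' has returned */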
Genode Thread Thread
constructor
Arguments

env Env &


Component environment
name Name const &
Thread name, used for debugging
stack_size size_t
Stack size
location Location
CPU affinity relative to the CPU-session’s affinity space
weight Weight
Scheduling weight relative to the other threads sharing the same CPU ses-
sion
cpu Cpu_session &

Exceptions

Stack_too_large
Stack_alloc_failed
Out_of_stack_space

The env argument is needed because the thread creation procedure needs to interact
with the environment for attaching the thread’s stack, the trace-control dataspace, and
the thread’s trace buffer and policy.
Genode Thread Thread
constructor
Arguments

env Env &

name Name const &

stack_size size_t

This is a shortcut for the common case of creating a thread via the environment’s CPU
session, at the default affinity location, and with the default weight.


Genode Thread entry

Entry method of the thread pure virtual method

Genode Thread start

Start execution of the thread virtual method

This method is virtual to enable the customization of threads used as server activation.
Genode Thread name

Request name of thread const method

Return value
Name

Genode Thread alloc_secondary_stack

Add an additional stack to the thread method


Arguments

name char const *

stack_size size_t

Exceptions

Stack_too_large
Stack_alloc_failed
Out_of_stack_space

Return value
void* Pointer to the new stack’s top

The stack for the new thread will be allocated from the RAM session of the component
environment. A small portion of the stack size is internally used by the framework for
storing thread-specific information such as the thread’s name.


Genode Thread free_secondary_stack

Remove a secondary stack from the thread method

Argument

stack_addr void*

Genode Thread cap

Request capability of thread const method

Return value
Thread_capability

Genode Thread native_thread


method
Return value
Native_thread & Kernel-specific thread meta data

Genode Thread stack_top


const method
Return value
void * Pointer just after first stack element

Genode Thread stack_base


const method
Return value
void * Pointer to last stack element

Genode Thread stack_virtual_size


class function
Return value
size_t Virtual size reserved for each stack within the stack area

Genode Thread stack_area_virtual_base


class function
Return value
addr_t The local base address of the stack area


Genode Thread stack_area_virtual_size


class function
Return value
size_t Total size of the stack area

Genode Thread myself


class function
Return value
Thread * Pointer to caller’s Thread object

Genode Thread mystack


class function
Return value
Stack_info Information about the current stack

Genode Thread stack_size

Ensure that the stack has a given size at the minimum method

Argument

size size_t const


Minimum stack size

Exceptions

Stack_too_large
Stack_alloc_failed

Genode Thread utcb


method
Return value
Native_utcb * User-level thread control block

Note that it is safe to call this method on the result of the myself class function. It han-
dles the special case of myself being 0 when called by the main thread during the
component initialization phase.
Genode Thread join

Block until the thread leaves the entry method method

Join must not be called more than once. Subsequent calls have undefined behaviour.


Genode Thread trace

Log null-terminated string as trace event class function

Argument

cstring char const *

Genode Thread trace_captured

Log null-terminated string as trace event using log_output policy class function

Argument

cstring char const *

Return value
bool True if trace is really put to buffer

Genode Thread trace

Log binary data as trace event class function

Arguments

data char const *

len size_t

Genode Thread trace

Log trace event as defined in base/trace/events.h class function

Argument

event auto const *

Genode Thread affinity

Thread affinity const method

Return value
Affinity::Location


8.13.2. Inter-thread synchronization

Genode provides three inter-thread synchronization primitives, namely Mutex, Blockade, and Semaphore, which model different thread-synchronization situations. Under the
hood, they are based on the same low-level Lock primitive, which is visible at the API
but not recommended for direct use as it will potentially be removed from the API in a
later version.
Genode Mutex
Mutex primitive class

Mutex

Mutex()
acquire()
release()

Header
repos/base/include/base/mutex.h

Genode Mutex acquire


method

Genode Mutex release


method

Example of using a Mutex

Mutex mutex;
mutex.acquire();
mutex.release();

{
  Mutex::Guard guard(mutex); /* acquire() during construction */
} /* release() on guard object destruction */

Mutex::Guard guard(mutex);
mutex.acquire(); /* <-- causes a warning about the deadlock */


Genode Blockade
Blockade primitive class

Blockade

Blockade()
block()
wakeup()

Header
repos/base/include/base/blockade.h

Genode Blockade block


method

Genode Blockade wakeup


method
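A sketch of using a Blockade for a one-shot startup synchronization between two threads:

  Genode::Blockade blockade { };

  /* thread A: wait for the event */
  blockade.block();

  /* thread B: signal the event, unblocking thread A */
  blockade.wakeup();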


Alongside the mutual exclusion of entering critical sections and the startup synchro-
nization of threads, producer-consumer relationships between threads are most com-
mon. The Semaphore enables the implementation of this synchronization scheme.

Genode Semaphore
Semaphore class

Semaphore

Semaphore(. . .)
~Semaphore()
up()
down()
cnt() : int

Header
repos/base/include/base/semaphore.h

Genode Semaphore Semaphore


constructor
Argument

n int
Initial counter value of the semaphore
Default is 0

Genode Semaphore up

Increment semaphore counter method

This method may wake up another thread that currently blocks on a down call at the
same semaphore.
Genode Semaphore down

Decrement semaphore counter, block if the counter reaches zero method

Genode Semaphore cnt


method
Return value
int Current semaphore counter
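A sketch of the producer-consumer scheme; the item queue is hypothetical and only hinted at in the comments:

  Genode::Semaphore avail { 0 };

  /* producer: publish an item (e.g., enqueue it), then signal its availability */
  avail.up();

  /* consumer: block until at least one item was produced, then dequeue it */
  avail.down();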


To synchronize method calls of an object, the Synced_interface can be used to equip the class of the called object with thread safety.

Genode Synced_interface
class template
IF, LOCK
Synced_interface

Synced_interface(. . .)
operator ()() : Guard
operator ()() : Guard

Template arguments

IF typename

LOCK typename

Accessor
operator () Guard

Header
repos/base/include/base/synced_interface.h

Genode Synced_interface Synced_interface


constructor
Arguments

lock LOCK &

interface IF *

Genode Synced_interface operator ()


method
Return value
Guard
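A sketch of the use, assuming the returned Guard forwards calls to the wrapped interface via operator->:

  struct Counter
  {
    unsigned n = 0;
    void inc() { n++; }
  };

  Genode::Mutex mutex { };
  Counter       counter { };

  Genode::Synced_interface<Counter, Genode::Mutex> synced { mutex, &counter };

  synced()->inc();   /* the lock is held for the duration of the call */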


8.14. Signalling

Section 3.6.2 provides the high-level description of the mechanism for the delivery of
asynchronous notifications (signals). The API defines interfaces for signal transmit-
ters and for the association of signal handlers with entrypoints. An entrypoint can be
associated with many signal handlers where each handler usually corresponds to a dif-
ferent signal source. Each signal handler is addressable via a distinct capability. Those
so-called signal-context capabilities can be delegated across component boundaries in
the same way as RPC-object capabilities. If a component is in possession of a signal-
context capability, it can trigger the corresponding signal handler by using a so-called
signal transmitter. The signal transmitter provides fire-and-forget semantics for signal
submission. Signals serve as mere notifications and cannot carry any payload.

Genode Signal_transmitter
Signal transmitter class

Signal_transmitter

Signal_transmitter(. . .)
context(. . .)
context() : Signal_context_capability
submit(. . .)

Header
repos/base/include/base/signal.h

Each signal-transmitter instance acts on behalf of the context specified as constructor argument. Therefore, the resources needed for the transmitter such as the consumed
memory sizeof(Signal_transmitter) should be accounted to the owner of the con-
text.


Genode Signal_transmitter Signal_transmitter


constructor
Argument

context Signal_context_capability
Capability to signal context that is going to receive signals produced by the
transmitter
Default is Signal_context_capability()

Genode Signal_transmitter context

Set signal context method

Argument

context Signal_context_capability

Genode Signal_transmitter context


method
Return value
Signal_context_capability Signal context

Genode Signal_transmitter submit

Trigger signal submission to context method

Argument

cnt unsigned
Number of signals to submit at once
Default is 1
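A minimal sketch; 'sigh' is assumed to be a signal-context capability previously delegated by the handler's owner:

  Genode::Signal_transmitter transmitter { sigh };

  transmitter.submit();   /* fire and forget, no payload */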


Genode Signal_handler
Signal dispatcher for handling signals by an object method class template

Signal_dispatcher_base

T, -
Signal_handler

Signal_handler(. . .)
~Signal_handler()
dispatch(. . .)

Template arguments

T typename
Type of signal-handling class
- Entrypoint

Header
repos/base/include/base/signal.h

This utility associates an object method with signals. It is intended to be used as a member variable of the class that handles incoming signals of a certain type. The
constructor takes a pointer-to-member to the signal-handling method as argument.
Genode Signal_handler Signal_handler
constructor
Arguments

ep EP &
Entrypoint managing this signal RPC
obj T &

Genode Signal_handler dispatch

Interface of Signal_dispatcher_base method

Argument

- unsigned

A Signal_handler object is meant to be hosted as a member of the class that also contains a member function to be executed upon the arrival of a signal. Its constructor takes the entrypoint, the signal-handling object, and a pointer to the handling function as arguments. The following example illustrates the common pattern of using a
Signal_handler.

class Main
{
...
Entrypoint &_ep;
...
void _handle_config();

Signal_handler<Main> _config_handler =
{ _ep, *this, &Main::_handle_config };
...
};

In the example above, the _config_handler creates a signal-context capability for the _handle_config method. In fact, the _config_handler is a capability since
the Signal_handler is derived from Signal_context_capability. Therefore, the
_config_handler can be directly passed as argument to an RPC call for registering a
signal handler at a server.


8.15. Remote procedure calls

Section 3.6.1 provides the high-level description of synchronous remote procedure calls
(RPC).

8.15.1. RPC mechanism

The RPC mechanism consists of the following header files:

base/rpc.h Contains the basic type definitions and utility macros to declare RPC in-
terfaces.

base/rpc_args.h Contains definitions of non-trivial argument types used for transferring strings and binary buffers. Its use by RPC interfaces is optional.

base/rpc_server.h Contains the interfaces of the server-side RPC API. In particular, this part of the API consists of the Rpc_object class template.

base/rpc_client.h Contains the API support for invoking RPC functions. It is comple-
mented by the definitions in base/capability.h. The most important elements of the
client-side RPC API are the Capability class template and Rpc_client, which is
a convenience wrapper around Capability.

Each RPC interface is an abstract C++ interface, supplemented by a few annotations. For example:

#include <session/session.h>
#include <base/rpc.h>

namespace Hello { struct Session; }

struct Hello::Session : Genode::Session
{
  static const char *service_name() { return "Hello"; }

  virtual void say_hello() = 0;
  virtual int add(int a, int b) = 0;

  GENODE_RPC(Rpc_say_hello, void, say_hello);
  GENODE_RPC(Rpc_add, int, add, int, int);
  GENODE_RPC_INTERFACE(Rpc_say_hello, Rpc_add);
};

The macros GENODE_RPC and GENODE_RPC_INTERFACE are defined in base/rpc.h and enrich the interface with type information. They are only used at compile time and


have no effect on the run time or the size of the class. Each RPC function is represented
as a type. In the example, the type meta data of the say_hello function is attached to
the Rpc_say_hello type within the scope of Session. The macro arguments are:

GENODE_RPC(func_type, ret_type, func_name, arg_type ...)

The func_type argument is an arbitrary type name (except for the type name
Rpc_functions) used to refer to the RPC function, ret_type is the return type or
void, func_name is the name of the server-side function that implements the RPC func-
tion, and the list of arg_type arguments comprises the RPC function argument types.
The GENODE_RPC_INTERFACE macro defines a type called Rpc_functions that contains
the list of the RPC functions provided by the RPC interface.

Server side The implementation of the RPC interface inherits the Rpc_object class
template with the interface type as argument and implements the abstract RPC inter-
face class.
The server-side RPC dispatching is performed by the compile-time-generated dispatch
method of the Rpc_object class template, according to the type information found in
the annotations of the Session interface.
Genode Rpc_object_base
class
Object_pool <Rpc_object_base>

Rpc_object_base

~Rpc_object_base()
dispatch(. . .) : Rpc_exception_code

Header
repos/base/include/base/rpc_server.h


Genode Rpc_object_base dispatch

Interface to be implemented by a derived class pure virtual method

Arguments

op Rpc_opcode
Opcode of invoked method
in Ipc_unmarshaller &
Incoming message with method arguments
out Msgbuf_base &
Outgoing message for storing method results

Return value
Rpc_exception_code

To make an instance of an RPC object invokable via RPC, it must be associated with
an RPC entrypoint. By passing an RPC object to the Entrypoint::manage method, a
new capability for the RPC object is created. The capability is tagged with the type of
the RPC object.
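For illustration, a server-side implementation of the Hello session interface introduced
above may look as follows. The name Session_component is a convention of this
sketch, not mandated by the framework:

  struct Hello::Session_component : Genode::Rpc_object<Session>
  {
      void say_hello() override {
          Genode::log("I have been asked to say hello"); }

      int add(int a, int b) override {
          return a + b; }
  };

Passing an instance of this class to Entrypoint::manage yields a
Capability<Hello::Session> that can be handed out to clients.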
Most server-side RPC interfaces are session interfaces. In contrast to plain RPC ob-
jects (like a Region_map), all sessions carry a client-provided budget of RAM and ca-
pabilities, and are associated with a client-specific label. The Session_object class
template captures commonalities of this kind of RPC objects.


Genode Session_object
class template
Rpc_object <RPC_INTERFACE, SERVER>
RPC_INTERFACE,
SERVER
Session_object

_ram_quota_guard() : Ram_quota_guard &


_cap_quota_guard() : Cap_quota_guard &
Session_object(. . .)
~Session_object()
session_quota_upgraded()
label() : Label
diag(. . .)
error(. . .)
warning(. . .)

Template arguments

RPC_INTERFACE typename

SERVER typename
Default is RPC_INTERFACE

Header
repos/base/include/base/session_object.h

Genode Session_object _ram_quota_guard


method
Return value
Ram_quota_guard &

Genode Session_object _cap_quota_guard


method
Return value
Cap_quota_guard &


Genode Session_object Session_object


constructor
Arguments

ep Entrypoint &

resources Resources const &

label Label const &

diag Diag

Genode Session_object session_quota_upgraded

Hook called whenever the session quota was upgraded by the client virtual method

Genode Session_object label


const method
Return value
Label Client-specific session label

Genode Session_object diag

Output label-prefixed diagnostic message conditionally const method

Argument

args auto &&...

The method produces output only if the session is in diagnostic mode (defined via the
diag session argument).

Genode Session_object error

Output label-prefixed error message const method

Argument

args auto &&...

Genode Session_object warning

Output label-prefixed warning message const method

Argument

args auto &&...


Client side At the client side, a capability can be invoked by the capability’s call
method template with the RPC function type as template argument. The method
arguments correspond to the RPC function arguments.
By convention, the Capability::call method is rarely used directly. Instead, each
RPC interface is accompanied by a client-side implementation of the abstract
interface where each RPC function is implemented by calling the Capability::call
method. This convention is facilitated by the Rpc_client class template.

Genode Rpc_client
RPC client class template

RPC_INTERFACE

RPC_INTERFACE
Rpc_client

Rpc_client(. . .)

Template argument

RPC_INTERFACE typename

Header
repos/base/include/base/rpc_client.h

This class template is the base class of the client-side implementation of the specified
RPC_INTERFACE. Usually, it inherits the pure virtual functions declared in RPC_INTERFACE
and has the built-in facility to perform RPC calls to this particular interface. Hence,
the client-side implementation of each pure virtual interface function comes down to a
simple wrapper along the lines of return call<Rpc_function>(arguments...).
Genode Rpc_client Rpc_client
constructor
Argument

cap Capability<RPC_INTERFACE> const &

Using this template, the client-side implementation of the example RPC interface
looks as follows.


#include <hello_session/hello_session.h>
#include <base/rpc_client.h>

namespace Hello { struct Session_client; }

struct Hello::Session_client : Genode::Rpc_client<Session>
{
Session_client(Genode::Capability<Session> cap)
: Genode::Rpc_client<Session>(cap) { }

void say_hello()
{
call<Rpc_say_hello>();
}

int add(int a, int b)
{
return call<Rpc_add>(a, b);
}
};

For passing RPC arguments and results, the regular C++ type-conversion rules are
in effect. If there is no valid type conversion, or if the number of arguments is wrong,
or if the RPC type annotations are not consistent with the abstract interface, the error is
detected at compile time.

8.15.2. Transferable argument types

The arguments specified to GENODE_RPC behave mostly as expected for a normal func-
tion call. But there are some notable differences:

Value types Value types are supported for basic types and plain-old-data types (self-
sufficient structs or classes). The object data is transferred as such. If the type is
not self-sufficient (it contains pointers or references), the pointers and references
are transferred as plain data, most likely pointing to the wrong thing in the callee’s
address space.

Const references Const references behave like value types. The referenced object is
transferred to the server and a reference to the server-local copy is passed to the
server-side function. Note that in contrast to a normal function call that takes a
reference argument, the size of the referenced object is accounted for when allocating
the message buffer on the client side.


Non-const references Non-const references are handled similarly to const references.
In addition, the server-local copy gets transferred back to the caller so that server-
side modifications of the object become visible to the client.

Capabilities Capabilities can be transferred as values, const references, or non-const
references.

Variable-length buffers There exists special support for passing binary buffers to RPC
functions using the Rpc_in_buffer class template provided by base/rpc_args.h.
The maximum size of the buffer must be specified as template argument. An
Rpc_in_buffer object does not contain a copy of the data passed to the construc-
tor, only a pointer to the data. In contrast to a fixed-sized object containing a copy
of the payload, the RPC framework does not transfer the whole object but only
the actually used payload.

Pointers Pointers and const pointers are handled similarly to references. The pointed-to
argument gets transferred and the server-side function is called with a pointer to
the local copy.

By default, all RPC arguments are input arguments, which are transferred to the server.
The return type of the RPC function, if present, is an output-only value. To prevent a
reference argument from acting as both input and output argument, a const reference
should be used.
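
As a sketch of the buffer case, a write function of a hypothetical interface could be
declared as follows:

  typedef Genode::Rpc_in_buffer<4096> Buffer;

  virtual void write(Buffer const &buffer) = 0;

  GENODE_RPC(Rpc_write, void, write, Buffer const &);

Only the payload actually referenced by the buffer object - not the maximum buffer
size - is copied into the RPC message.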

8.15.3. Throwing C++ exceptions across RPC boundaries

The propagation of C++ exceptions from the server to the client is supported by a spe-
cial variant of the GENODE_RPC macro:

GENODE_RPC_THROW(func_type, ret_type, func_name,
                 exc_type_list, arg_type ...)

This macro accepts an additional exc_type_list argument, which is a type list of
exception types. Exception objects are not transferred as payload. The RPC mechanism
propagates solely the information that the specific exception was raised. Hence, infor-
mation provided with the thrown object will be lost when crossing an RPC boundary.
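
For example, a hypothetical open function that reports failures by throwing
Permission_denied or Not_found could be declared as follows, using the
GENODE_TYPE_LIST utility to express the list of exception types:

  GENODE_RPC_THROW(Rpc_open, void, open,
                   GENODE_TYPE_LIST(Permission_denied, Not_found),
                   Path const &);

At the client side, the exceptions can be caught around the corresponding
call<Rpc_open>(...) invocation like for a regular function call.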

8.15.4. RPC interface inheritance

It is possible to extend existing RPC interfaces with additional RPC functions. Inter-
nally, such an RPC interface inheritance is realized by concatenation of the Rpc_functions
type lists of both the base interface and the derived interface. This use case is supported
by a special version of the GENODE_RPC_INTERFACE macro:


GENODE_RPC_INTERFACE_INHERIT(base_interface,
rpc_func ...)
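
Continuing the Hello example of Section 8.15.1, a derived interface that adds one RPC
function could be declared as follows (the Session_2 name is made up for this sketch):

  struct Hello::Session_2 : Hello::Session
  {
      virtual void say_goodbye() = 0;

      GENODE_RPC(Rpc_say_goodbye, void, say_goodbye);
      GENODE_RPC_INTERFACE_INHERIT(Session, Rpc_say_goodbye);
  };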

8.15.5. Casting capability types

For typed capabilities, the same type conversion rules apply as for pointers. In fact, a
typed capability pretty much resembles a typed pointer, pointing to a remote object.
Hence, assigning a specialized capability (e. g., Capability<Event::Session>) to a
base-typed capability (e. g., Capability<Session>) is always valid. For the opposite
case, a static cast is needed. For capabilities, this cast is supported by

static_cap_cast<INTERFACE>(cap)

In rare circumstances, mostly in platform-specific base code, a reinterpret cast for
capabilities is required. It allows for converting any capability to another type:

reinterpret_cap_cast<INTERFACE>(cap)
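
For example, a generic session capability that is known to refer to a Hello session can
be narrowed down as follows:

  Genode::Capability<Genode::Session> generic_cap = ...;

  Genode::Capability<Hello::Session> hello_cap =
      Genode::static_cap_cast<Hello::Session>(generic_cap);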

8.15.6. Non-virtual RPC interface functions

It is possible to declare RPC functions using GENODE_RPC, which do not exist as virtual
functions in the abstract interface class. In this case, the function name specified as third
argument to GENODE_RPC is of course not valid for the interface class but an alternative
class can be specified as second argument to the server-side Rpc_object. This way, a
server-side implementation may specify its own class to direct the RPC function to a
local (possibly non-virtual) implementation.

8.15.7. Limitations of the RPC mechanism

The maximum number of RPC function arguments is limited to 7. If a function requires
more arguments, it may be worthwhile to consider grouping some of them in a
compound struct.


8.15.8. Root interface

Each service type is represented as an RPC object implementing the root interface. The
server announces its service type by providing the service name and the capability
of the service’s root interface (announce function of the parent interface). Given the
capability to the root interface, the parent is then able to create and destroy sessions.

Genode Root
Root interface class

Root

~Root()
RPC interface
session(. . .) : Session_capability
upgrade(. . .)
close(. . .)

Header
repos/base/include/root/root.h

Genode Root session

Create session pure virtual method


Arguments

args Session_args const &

affinity Affinity const &

Exceptions

Insufficient_ram_quota
Insufficient_cap_quota
Service_denied

Return value
Session_capability Capability to new session

Genode Root upgrade

Extend resource donation to an existing session pure virtual method

Arguments

session Session_capability

args Upgrade_args const &


Genode Root close

Close session pure virtual method


Argument

session Session_capability

Genode Typed_root
class template
Root interface supplemented with information about the managed session type

Root

SESSION_TYPE
Typed_root

Template argument

SESSION_TYPE typename

Header
repos/base/include/root/root.h

This class template is used to automatically propagate the correct session type to
Parent::announce() when announcing a service.
Because defining root interfaces for services follows a recurring pattern, there exist
default template classes that implement the standard behaviour of the root interface
for services with multiple clients (Root_component) and services with a single client
(Static_root).


Genode Root_component
Template for implementing the root interface class template

Rpc_object <Typed_root<SESSION_TYPE> > Local_service <SESSION_TYPE>


SESSION_TYPE,
POLICY
Root_component

_create_session(. . .) : SESSION_TYPE *
_create_session(. . .) : SESSION_TYPE *
_upgrade_session(. . .)
_destroy_session(. . .)
md_alloc() : Allocator *
ep() : Rpc_entrypoint *
Root_component(. . .)
create(. . .) : SESSION_TYPE &
upgrade(. . .)
destroy(. . .)
session(. . .) : Session_capability
upgrade(. . .)
close(. . .)

Template arguments

SESSION_TYPE typename
Session-component type to manage, derived from Rpc_object
POLICY typename
Session-creation policy

Header
repos/base/include/root/component.h

The POLICY template parameter allows for constraining the session creation to only
one instance at a time (using the Single_client policy) or multiple instances (using
the Multiple_clients policy).
The POLICY class must provide the following two methods:
aquire(const char *args) is called with the session arguments at creation time of each
new session. It can therefore implement a session-creation policy taking session argu-
ments into account. If the policy denies the creation of a new session, it throws one of
the exceptions defined in the Root interface.
release() is called at the destruction time of a session. It enables the policy to keep track
of and impose restrictions on the number of existing sessions.
The default policy Multiple_clients imposes no restrictions on the creation of new
sessions.
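
Putting the pieces together, a root component for the Hello service sketched in
Section 8.15.1 could look like this, relying on the default Multiple_clients policy:

  class Hello::Root : public Genode::Root_component<Session_component>
  {
      protected:

          Session_component *_create_session(char const *) override
          {
              return new (md_alloc()) Session_component();
          }

      public:

          Root(Genode::Entrypoint &ep, Genode::Allocator &md_alloc)
          :
              Genode::Root_component<Session_component>(ep, md_alloc)
          { }
  };

An instance of this class is typically announced to the parent via
env.parent().announce(env.ep().manage(root)).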


Genode Root_component _create_session

Create new session (to be implemented by a derived class) virtual method

Arguments

args const char *

- Affinity const &

Exceptions

Out_of_ram
Out_of_caps
Service_denied
Insufficient_cap_quota
Insufficient_ram_quota

Return value
SESSION_TYPE *

Only a derived class knows the constructor arguments of a specific session. Therefore,
we cannot unify the call of its new operator and must implement the session creation at
a place where the required knowledge exists.
In the implementation of this method, the heap provided by Root_component must be
used for allocating the session object.
If the server implementation does not evaluate the session affinity, it suffices to override
the overload without the affinity argument.
Genode Root_component _create_session
virtual method
Argument

- const char *

Return value
SESSION_TYPE *

Genode Root_component _upgrade_session

Inform session about a quota upgrade virtual method

Arguments

- SESSION_TYPE *

- const char *

Once a session is created, its client can successively extend its quota donation via the
Parent::transfer_quota operation. This will result in the invocation of Root::upgrade
at the root interface the session was created with. The root interface, in turn, informs
the session about the new resources via the _upgrade_session method. The default
implementation is suited for sessions that use a static amount of resources accounted
for at session-creation time. For such sessions, an upgrade is not useful. However,
sessions that dynamically allocate resources on behalf of their clients should respond to
quota upgrades by implementing this method.
Genode Root_component _destroy_session
virtual method
Argument

session SESSION_TYPE *

Genode Root_component md_alloc


method
Return value
Allocator * Allocator to allocate server object in _create_session()

Genode Root_component ep
method
Return value
Rpc_entrypoint * Entrypoint that serves the root component

Genode Root_component Root_component


constructor
Arguments

ep Entrypoint &
Entry point that manages the sessions of this root interface
md_alloc Allocator &
Meta-data allocator providing the backing store for session objects


8.15.9. Server-side policy handling

The Session_label and Session_policy utilities aid the implementation of the
server-side policy-selection mechanism described in Section 4.6.2.

Genode Session_label
Session label utility class class

String <160>

Session_label

Session_label(. . .)
last_element() : Session_label
prefix() : Session_label

Accessor
last_element Session_label

Header
repos/base/include/base/session_label.h

Genode Session_label Session_label

Copy constructor constructor template

Template argument

N size_t

Argument

other String<N> const &

This constructor is needed because GCC 8 disregards derived copy constructors as
candidates.
Genode Session_label prefix
const method
Return value
Session_label Part of the label without the last element


Genode Session_policy
Query server-side policy for a session request class

Xml_node

Session_policy

Session_policy(. . .)

Header
repos/os/include/os/session_policy.h

Genode Session_policy Session_policy


constructor template
Template argument

N size_t

Arguments

label String<N> const &


Label used as the selector of a policy
config Xml_node
XML node that contains the policies as sub nodes

Exception

No_policy_defined The server configuration has no policy defined for the specified
label

On construction, the Session_policy looks up the policy XML node that matches the
label provided as argument. The server-side policies are defined in one or more policy
subnodes of the server’s config node. Each policy node has a label attribute. If the
policy label matches the first part of the label delivered as session argument, the
policy matches. If multiple policies match, the one with the longest label is selected.
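
As an illustration, a server with the following configuration could grant write access
only to sessions whose label starts with "editor". The writeable attribute is an example,
each server defines its own policy attributes:

  <config>
    <policy label="editor" writeable="yes"/>
  </config>

The server-side lookup then typically looks like this, assuming that _config holds the
server's configuration:

  Genode::Session_label const label = Genode::label_from_args(args);
  Genode::Session_policy const policy(label, _config.xml());

  bool const writeable = policy.attribute_value("writeable", false);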


8.15.10. Packet stream

The Packet_stream is the building block of the asynchronous transfer of bulk data
(Section 3.6.6). The public interface consists of the two class templates Packet_stream_source
and Packet_stream_sink. Both communication parties agree on a policy with regard
to the organization of the communication buffer by specifying the same Packet_stream_policy
as template argument.
The communication buffer consists of three parts, a submit queue, an acknowledge-
ment queue, and a bulk buffer. The submit queue contains packets generated by the
source to be processed by the sink. The acknowledgement queue contains packets that
are processed and acknowledged by the sink. The bulk buffer contains the actual pay-
load. The assignment of packets to bulk-buffer regions is performed by the source.
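
From the perspective of a data source, the interplay of the three parts typically looks
as follows. This is a sketch that leaves out error handling and the use of the
ready-to-submit and ack-avail signals:

  /* reserve space in the bulk buffer and fill it with payload */
  Packet_descriptor packet = source.alloc_packet(size);
  Genode::memcpy(source.packet_content(packet), data, size);

  /* enqueue the packet in the submit queue */
  source.submit_packet(packet);
  ...
  /* pick up an acknowledged packet and release its bulk-buffer space */
  Packet_descriptor ack = source.get_acked_packet();
  source.release_packet(ack);

The sink side correspondingly obtains packets via get_packet, accesses the payload
via packet_content, and returns processed packets via acknowledge_packet.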

Genode Packet_descriptor
Default packet descriptor class

Packet_descriptor

Packet_descriptor(. . .)
Packet_descriptor()
offset() : Genode::off_t
size() : Genode::size_t

Accessors
offset Genode::off_t
size Genode::size_t

Header
repos/os/include/os/packet_stream.h

A class used as PACKET_DESCRIPTOR argument to the Packet_stream_policy
template must implement the interface of this class.


Genode Packet_descriptor Packet_descriptor


constructor
Arguments

offset off_t

size size_t

Genode Packet_descriptor Packet_descriptor

Default constructor used for instantiating arrays of packet descriptors constructor
used as submit and ack queues.

Genode Packet_stream_base
Common base of Packet_stream_source and Packet_stream_sink class

Packet_stream_base

Header
repos/os/include/os/packet_stream.h

Genode Packet_stream_source
Originator of a packet stream class template

POLICY
Packet_stream_source

Template argument

POLICY typename

Header
repos/os/include/os/packet_stream.h


Genode Packet_stream_sink
Receiver of a packet stream class template

POLICY
Packet_stream_sink

Template argument

POLICY typename

Header
repos/os/include/os/packet_stream.h

Genode Packet_stream_policy
Policy used by both sides, source and sink class template

PACKET_DESCRIPTOR,
SUBMIT_QUEUE_SIZE,
ACK_QUEUE_SIZE,
CONTENT_TYPE
Packet_stream_policy

Template arguments

PACKET_DESCRIPTOR typename

SUBMIT_QUEUE_SIZE unsigned

ACK_QUEUE_SIZE unsigned

CONTENT_TYPE typename

Header
repos/os/include/os/packet_stream.h


In client-server scenarios, each client and server can play the role of either a source
or a sink of packets. To ease the use of packet streams in such scenarios, the classes
within the Packet_stream_rx and Packet_stream_tx namespaces provide ready-to-
use building blocks to be aggregated in session interfaces.
Data transfer from server to client
Packet_stream_rx Channel
class template
Interface

PACKET_STREAM_POLICY
Packet_stream_rx::Channel

sink() : Sink *
sigh_ready_to_ack(. . .)
sigh_packet_avail(. . .)

Template argument

PACKET_STREAM_POLICY typename

Header
repos/os/include/packet_stream_rx/packet_stream_rx.h

Packet_stream_rx Channel sink

Request reception interface virtual method

Return value
Sink *

See documentation of Packet_stream_tx::Channel::source.


Packet_stream_rx Channel sigh_ready_to_ack

Register signal handler for receiving ready_to_ack signals pure virtual method

Argument

sigh Signal_context_capability

Packet_stream_rx Channel sigh_packet_avail

Register signal handler for receiving packet_avail signals pure virtual method

Argument

sigh Signal_context_capability


Packet_stream_rx Client
class template
Rpc_client <CHANNEL>

CHANNEL
Packet_stream_rx::Client

Client(. . .)
sigh_ready_to_ack(. . .)
sigh_packet_avail(. . .)

Template argument

CHANNEL typename

Header
repos/os/include/packet_stream_rx/client.h

Packet_stream_rx Client Client


constructor
Arguments

channel_cap Capability<CHANNEL>

rm Region_map &


Data transfer from client to server


Packet_stream_tx Channel
class template
Interface

PACKET_STREAM_POLICY
Packet_stream_tx::Channel

source() : Source *
sigh_ready_to_submit(. . .)
sigh_ack_avail(. . .)

Template argument

PACKET_STREAM_POLICY typename

Header
repos/os/include/packet_stream_tx/packet_stream_tx.h

Packet_stream_tx Channel source

Request transmission interface virtual method

Return value
Source *

This method enables the client-side use of the Channel using the abstract Channel in-
terface only. This is useful in cases where both source and sink of the Channel are
co-located in one program. At the server side of the Channel, this method has no
meaning.
Packet_stream_tx Channel sigh_ready_to_submit

Register signal handler for receiving ready_to_submit signals pure virtual method

Argument

sigh Signal_context_capability

Packet_stream_tx Channel sigh_ack_avail

Register signal handler for receiving ack_avail signals pure virtual method

Argument

sigh Signal_context_capability


Packet_stream_tx Client
class template
Rpc_client <CHANNEL>

CHANNEL
Packet_stream_tx::Client

Client(. . .)
sigh_ready_to_submit(. . .)
sigh_ack_avail(. . .)

Template argument

CHANNEL typename

Header
repos/os/include/packet_stream_tx/client.h

Packet_stream_tx Client Client


constructor
Arguments

channel_cap Capability<CHANNEL>

rm Region_map &

buffer_alloc Range_allocator &


Allocator used for managing the transmission buffer


8.16. XML processing

The configuration concept (Chapter 6 and Section 4.6) of the framework relies on XML
syntax. Hence, there is the need to process XML-formatted data. For parsing XML data,
the Xml_node and the accompanying Xml_attribute utilities are provided. Those util-
ities operate directly on a text buffer that contains XML data. There is no conversion
step into an internal representation. This approach alleviates the need for allocating any
meta data while extracting information from XML data. The XML parser is stateless.
Vice versa, the Xml_generator serves as a utility for generating XML-formatted data.
The scope of an XML node is represented by a lambda function. Hence, nested XML
nodes can be created by nested lambda functions, which makes the structure of the
XML data immediately apparent in the C++ source code. As for the Xml_node, the
Xml_generator does not use any internal intermediate representation of the XML data.
No dynamic memory allocations are needed while generating XML-formatted output.
A typical component imports parts of its internal state from XML input, most promi-
nently its configuration. This import is not a one-off operation but may occur multiple
times during the lifetime of the component. Hence, the component is faced with the
challenge of updating its internal data model from potentially changing XML input.
The List_model provides a convenient and robust formalism to implement such par-
tial model updates.

8.16.1. XML parsing

Genode’s XML parser consists of the two classes Xml_node and Xml_attribute. Its
primary use case is the provisioning of configuration information to low-level compo-
nents. Consequently, it takes the following considerations into account:

Low complexity Because the parser is implicitly used in most components, it must
not be complex to keep its footprint on the trusted computing base as small as
possible.

Free-standing The parser must be able to operate without external dependencies such
as a C runtime. Otherwise, each Genode-based system would inherit such depen-
dencies.

No dynamic memory allocations The parser should not dynamically allocate mem-
ory to be usable in resource multiplexers and runtime environments where no
anonymous memory allocations are allowed (Section 3.3.3).

Robustness The parser must be robust in the sense that it must not contain buffer
overflows, infinite loops, memory corruptions, or any other defect that may make
the program crash.

Other possible goals like expressive error messages, the support for more general use
cases, and even the adherence to standards are deliberately subordinated. Given its low
complexity, the XML parser cannot satisfy components that need advanced XML pro-
cessing such as validating XML data against a DTD or schema, mutating XML nodes,
or using different character encodings. In such cases, component developers may con-
sider the use of a ported 3rd-party XML parser.
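
To illustrate the typical use of the parser, the following sketch extracts values from
configuration input like <config><item quantum="5"/></config>, where the tag and
attribute names are made up for the example:

  Genode::Xml_node const config = _config.xml();

  config.for_each_sub_node("item", [&] (Genode::Xml_node const &item) {
      unsigned const quantum = item.attribute_value("quantum", 1U);
      ...
  });

Thanks to the default value passed to attribute_value, a missing attribute does not
raise an error but yields a well-defined fallback value.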

Genode Xml_attribute
Representation of an XML-node attribute class

Xml_attribute

name() : Name
has_type(. . .) : bool
value_size() : size_t
has_value(. . .) : bool
with_raw_value(. . .)
value(. . .) : bool
value(. . .)
next() : Xml_attribute

Accessor
name Name

Header
repos/base/include/util/xml_node.h

An attribute has the form name=“value”.


Genode Xml_attribute has_type
method
Argument

type char const *

Return value
bool True if attribute has specified type

Genode Xml_attribute value_size


const method
Return value
size_t Size of the value in bytes


Genode Xml_attribute has_value


const method
Argument

value char const *

Return value
bool True if attribute has the specified value

Genode Xml_attribute with_raw_value

Call functor fn with the data of the attribute value as argument const method

Argument

fn auto const &

The functor is called with the start pointer (char const *) and size (size_t) of the at-
tribute value as arguments.
Note that the content of the buffer is not null-terminated but delimited by the size
argument.
Genode Xml_attribute value
const method
Argument

out auto &

Return value
bool True on success, or false if attribute is invalid or value conversion failed

Genode Xml_attribute value


const method template
Template argument

N size_t

Argument

out String<N> &

Genode Xml_attribute next


const method
Return value
Xml_attribute Next attribute in attribute list


Genode Xml_node
Representation of an XML node class

Xml_node

Xml_node(. . .)
size() : size_t
content_size() : size_t
type() : Type
has_type(. . .) : bool
with_raw_node(. . .)
with_raw_content(. . .)
decoded_content(. . .) : size_t
decoded_content() : STRING
num_sub_nodes() : size_t
next() : Xml_node
next(. . .) : Xml_node
last(. . .) : bool
sub_node(. . .) : Xml_node
sub_node(. . .) : Xml_node
with_optional_sub_node(. . .)
with_sub_node(. . .)
for_each_sub_node(. . .)
for_each_sub_node(. . .)
attribute(. . .) : Xml_attribute
attribute(. . .) : Xml_attribute
attribute_value(. . .) : T
has_attribute(. . .) : bool
for_each_attribute(. . .)
has_sub_node(. . .) : bool
print(. . .)
differs_from(. . .) : bool

Accessor
type Type

Header
repos/base/include/util/xml_node.h


Genode Xml_node Xml_node


constructor
Arguments

addr char const *

max_len size_t
Default is ~0UL

Exception

Invalid_syntax

The constructor validates that the start tag has a matching end tag at the same depth and
counts the number of immediate sub nodes.
Genode Xml_node size
const method
Return value
size_t Size of node including start and end tags in bytes

Genode Xml_node content_size


const method
Return value
size_t Size of node content

Genode Xml_node has_type


const method
Argument

type char const *

Return value
bool True if tag is of specified type

Genode Xml_node with_raw_node

Call functor fn with the node data (char const *, size_t) const method
Argument

fn auto const &


Genode Xml_node with_raw_content

Call functor fn with content (char const *, size_t) as argument const method

Argument

fn auto const &

Note that the content is not null-terminated. It points directly into a sub range of the
unmodified Xml_node data.
If the node has no content, the functor fn is not called.
Genode Xml_node decoded_content

Export decoded node content from XML node const method

Arguments

dst char *
Destination buffer
dst_len size_t
Size of destination buffer in bytes

Return value
size_t Number of bytes written to the destination buffer

This function transforms XML character entities into their respective characters.
Genode Xml_node decoded_content

Read decoded node content as Genode::String const method template

Template argument

STRING typename

Return value
STRING

Genode Xml_node num_sub_nodes


const method
Return value
size_t The number of the XML node’s immediate sub nodes


Genode Xml_node next


const method
Exception

Nonexistent_sub_node Subsequent node does not exist

Return value
Xml_node XML node following the current one

Genode Xml_node next


const method
Argument

type char const *


Type of XML node, or nullptr for matching any type

Exception

Nonexistent_sub_node Subsequent node does not exist

Return value
Xml_node Next XML node of specified type

Genode Xml_node last


const method
Argument

type char const *


Default is nullptr

Return value
bool True if node is the last of a node sequence

Genode Xml_node sub_node


const method
Argument

idx unsigned
Index of sub node, default is the first node
Default is 0U

Exception

Nonexistent_sub_node No such sub node exists

Return value
Xml_node Sub node with specified index


Genode Xml_node sub_node


const method
Argument

type char const *

Exception

Nonexistent_sub_node No such sub node exists

Return value
Xml_node First sub node that matches the specified type

Genode Xml_node with_optional_sub_node

Apply functor fn to first sub node of specified type const method

Arguments

type char const *

fn auto const &

The functor is called with the sub node as argument. If no matching sub node exists,
the functor is not called.
Genode Xml_node with_sub_node

Apply functor fn to first sub node of specified type const method

Arguments

type char const *

fn auto const &

missing_fn auto const &

The functor is called with the sub node as argument. If no matching sub node exists,
the functor missing_fn is called.


Genode Xml_node for_each_sub_node

Execute functor fn for each sub node of specified type const method

Arguments

type char const *

fn auto const &

Genode Xml_node for_each_sub_node

Execute functor fn for each sub node const method


Argument

fn auto const &

Genode Xml_node attribute


const method
Argument

idx unsigned
Attribute index, first attribute has index 0

Exception

Nonexistent_attribute No such attribute exists

Return value
Xml_attribute XML attribute

Genode Xml_node attribute


const method
Argument

type char const *


Name of attribute type

Exception

Nonexistent_attribute No such attribute exists

Return value
Xml_attribute XML attribute


Genode Xml_node attribute_value

Read attribute value from XML node const method template


Template argument

T typename

Arguments

type char const *


Attribute name
default_value T const
Value returned if no attribute with the name type is present.

Return value
T Attribute value or specified default value

The type of the return value corresponds to the type of the default value.
Genode Xml_node has_attribute
const method
Argument

type char const *

Return value
bool True if attribute of specified type exists

Genode Xml_node for_each_attribute

Execute functor fn for each attribute const method


Argument

fn auto const &

Genode Xml_node has_sub_node


const method
Argument

type char const *

Return value
bool True if sub node of specified type exists


Genode Xml_node print


const method
Argument

output Output &

Genode Xml_node differs_from


const method
Argument

other Xml_node const &

Return value
bool True if this node differs from another

Genode Xml_unquoted
Utility for unquoting XML attributes class

Xml_unquoted

Xml_unquoted(. . .)
print(. . .)

Header
repos/base/include/util/xml_node.h

The Xml_unquoted utility can be used to revert quoted XML attribute values. Such
quoting is needed whenever an attribute value can contain " characters.


Genode Xml_unquoted Xml_unquoted


constructor template
Template argument

N size_t

Argument

string String<N> const &

Genode Xml_unquoted print


const method
Argument

out Output &


8.16.2. XML generation

Genode Xml_generator
Utility for generating XML class

Xml_generator

Xml_generator(. . .)
node(. . .)
node(. . .)
attribute(. . .)
attribute(. . .)
attribute(. . .)
attribute(. . .)
attribute(. . .)
attribute(. . .)
attribute(. . .)
attribute(. . .)
attribute(. . .)
attribute(. . .)
append(. . .)
append_sanitized(. . .)
append_content(. . .)
used() : size_t

Accessor
used size_t

Header
repos/base/include/util/xml_generator.h


Genode Xml_generator Xml_generator


constructor
Arguments

dst char *

dst_len size_t

name char const *

fn auto const &

Genode Xml_generator node


method
Argument

name char const *

Genode Xml_generator node


method
Argument

name char const *

Genode Xml_generator attribute


method
Arguments

name char const *

str char const *

Genode Xml_generator attribute


method template
Template argument

N size_t

Arguments

name char const *

str String<N> const &


Genode Xml_generator attribute


method
Arguments

name char const *

value bool

Genode Xml_generator attribute


method
Arguments

name char const *

value long long

Genode Xml_generator attribute


method
Arguments

name char const *

value long

Genode Xml_generator attribute


method
Arguments

name char const *

value int

Genode Xml_generator attribute


method
Arguments

name char const *

value unsigned long long

Genode Xml_generator attribute


method
Arguments

name char const *

value unsigned long


Genode Xml_generator attribute


method
Arguments

name char const *

value unsigned

Genode Xml_generator attribute


method
Arguments

name char const *

value double

Genode Xml_generator append

Append content to XML node method

Arguments

str char const *

str_len size_t
Default is ~0UL

This method must not be followed by calls of attribute.


Genode Xml_generator append_sanitized

Append sanitized content to XML node method

Arguments

str char const *

str_len size_t
Default is ~0UL

This method must not be followed by calls of attribute.


Genode Xml_generator append_content

Append printable objects to XML node as sanitized content method

Argument

args auto &&...

This method must not be followed by calls of attribute.
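
The following sketch shows the typical use of the Xml_generator. It produces
<config version="2"><report delay_ms="500"/></config> in the provided buffer:

  char buffer[256] { };

  Genode::Xml_generator xml(buffer, sizeof(buffer), "config", [&] () {
      xml.attribute("version", 2U);
      xml.node("report", [&] () {
          xml.attribute("delay_ms", 500U); }); });

The number of bytes effectively used within the buffer is available via xml.used().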


8.16.3. XML-based data models

List-based data model created and updated from XML The List_model defined
at base/include/util/list_model.h stores a component-internal representation of XML-node
content. It keeps XML sub-nodes of a matching type in an internal list of C++ objects of
type ELEM. An ELEM type carries two methods matches and type_matches that define
the relation of the elements to XML nodes. E.g.,

struct Item : List_model<Item>::Element
{
    static bool type_matches(Xml_node const &);

    bool matches(Xml_node const &) const;

    ...
};

The class function type_matches returns true if the specified XML node matches the
Item type. It can thereby be used to control the creation of ELEM nodes by responding
to specific XML tags while ignoring unrelated XML tags.
The matches method returns true if the concrete element instance matches the given
XML node. It is used to correlate existing ELEM objects with new versions of XML nodes
to update the ELEM objects.
The functor arguments create_fn, destroy_fn, and update_fn for the update_from_xml
method define how objects are created, destructed, and updated. E.g.,

_list_model.update_from_xml(node,

    [&] (Xml_node const &node) -> Item & {
        return *new (alloc) Item(node); },

    [&] (Item &item) { destroy(alloc, &item); },

    [&] (Item &item, Xml_node const &node) { item.update(node); }
);

The elements are ordered according to the order of XML nodes.
The list model is a container owning the elements. Before destructing a list model, its
elements must be removed by calling update_from_xml with an Xml_node("<empty/>")
as argument, which results in the call of destroy_fn for each element.


Genode List_model
class template
ELEM
List_model

~List_model()
update_from_xml(. . .)
for_each(. . .)
for_each(. . .)
with_first(. . .)

Template argument

ELEM typename

Header
repos/base/include/util/list_model.h

Genode List_model update_from_xml

Update data model according to the given XML node method

Arguments

- Xml_node const &

create_fn auto const &

destroy_fn auto const &

update_fn auto const &

Genode List_model for_each

Call functor fn for each const element const method


Argument

fn auto const &

Genode List_model for_each

Call functor fn for each non-const element method


Argument

fn auto const &


Genode List_model with_first

Call functor fn with the first element of the list model const method
Argument

fn auto const &

Using this method combined with Element::next, the list model can be traversed
manually. This is handy in situations where the list-model elements are visited via
recursive function calls instead of a for_each loop.

XML buffering In some situations, it is convenient to keep a verbatim copy of XML
input as part of the internal data model. As the Xml_node is merely a light-weight
pointer into XML data, it cannot be stored when the underlying XML data is updated
dynamically. Instead, the XML input must be copied into the internal data model. The
Buffered_xml utility takes care of the backing-store allocation and copying of an exist-
ing Xml_node.

Genode Buffered_xml
Utility for buffering XML nodes class

Buffered_xml

Buffered_xml(. . .)
Buffered_xml(. . .)
Buffered_xml(. . .)
~Buffered_xml()
xml() : Xml_node
with_xml_node(. . .)

Accessor
xml Xml_node

Header
repos/os/include/os/buffered_xml.h


Genode Buffered_xml Buffered_xml

Constructor for buffering a copy of the specified XML node constructor

Arguments

alloc Allocator &

node Xml_node

Exception

Allocator::Out_of_memory

Genode Buffered_xml Buffered_xml

Constructor for buffering generated XML constructor

Arguments

alloc Allocator &

name char const *


Name of top-level XML node
fn auto const &
Functor that takes an Xml_generator & as argument
size Min_size
Initial allocation size

Exception

Allocator::Out_of_memory

Genode Buffered_xml Buffered_xml


constructor
Arguments

alloc Allocator &

name char const *

fn auto const &

Genode Buffered_xml with_xml_node

Call functor fn with Xml_node const & as argument const method

Argument

fn auto const &


8.17. Component management

8.17.1. Shared objects

Genode Shared_object
class
Shared_object

Shared_object(. . .)
~Shared_object()
lookup(. . .) : T

Header
repos/base/include/base/shared_object.h

Genode Shared_object Shared_object

Load shared object and dependencies constructor

Arguments

- Env &

md_alloc Allocator &


Backing store for the linker’s dynamically allocated meta data
name char const *
ROM module name of shared object to load
bind Bind
Bind functions immediately (BIND_NOW) or on demand (BIND_LAZY)
keep Keep
Unload ELF object if no longer needed (DONT_KEEP), or keep ELF object
loaded at all times (KEEP)

Exception

Invalid_rom_module

Genode Shared_object ~Shared_object

Close and unload shared object destructor


Genode Shared_object lookup

Look up a symbol in the shared object and its dependencies const method template

Template argument

T typename
Default is void *

Argument

symbol const char *


Symbol name

Exception

Invalid_symbol

Return value
T Symbol address
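
For example, a dynamically loaded plugin could be used as follows. The module name
and symbol are hypothetical:

  Genode::Shared_object plugin(env, heap, "plugin.lib.so",
                               Genode::Shared_object::BIND_LAZY,
                               Genode::Shared_object::DONT_KEEP);

  auto const plugin_init = plugin.lookup<void (*)()>("plugin_init");
  plugin_init();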

Genode Address_info
class
Address_info

Address_info(. . .)

Header
repos/base/include/base/shared_object.h

Genode Address_info Address_info


constructor
Argument

addr addr_t


8.17.2. Child management

For components that manage a number of child components, each child is represented
by an instance of the Child class. This instance contains the policy to be applied to
the child (for example, how session requests are routed to services) and contains the
child’s execution environment including the PD session holding the child’s RAM and
capability quota.

Genode Child_policy
Child policy interface class

Child_policy

~Child_policy()
name() : Name
binary_name() : Binary_name
linker_name() : Linker_name
resolve_session_request(. . .) : Route
filter_session_args(. . .)
announce_service(. . .)
filter_session_affinity(. . .) : Affinity
exit(. . .)
ref_pd() : Pd_session &
ref_pd_cap() : Pd_session_capability
session_md_ram() : Ram_allocator &
yield_response()
resource_request(. . .)
init(. . .)
init(. . .)
server_id_space() : Id_space<Parent::Server> &
session_state_changed()
session_alloc_batch_size() : size_t
initiate_env_sessions() : bool
_with_address_space(. . .)
with_address_space(. . .)
start_initial_thread(. . .)
forked() : bool

Accessor
ref_pd_cap Pd_session_capability

Header
repos/base/include/base/child.h

A child-policy object is an argument to a Child. It is responsible for taking policy
decisions regarding the parent interface. Most importantly, it defines how session re-
quests are resolved and how session arguments are passed to servers when creating
sessions.
Genode Child_policy name

Name of the child used as the child’s label prefix pure virtual const method

Return value
Name

Genode Child_policy binary_name

ROM module name of the binary to start virtual const method

Return value
Binary_name

Genode Child_policy linker_name

ROM module name of the dynamic linker virtual const method

Return value
Linker_name

Genode Child_policy resolve_session_request

Determine service and server-side label for a given session re- pure virtual method
quest
Arguments

- Name const &

- Session_label const &

- Diag

Exception

Service_denied

Return value
Route Routing and policy-selection information for the session


Genode Child_policy filter_session_args

Apply transformations to session arguments virtual method

Argument

- Name const &

Genode Child_policy announce_service

Register a service provided by the child virtual method

Argument

- Name const &

Genode Child_policy filter_session_affinity

Apply session affinity policy virtual method

Argument

affinity Affinity const &


Affinity passed along with a session request

Return value
Affinity Affinity subordinated to the child policy

Genode Child_policy exit

Exit child virtual method


Argument

exit_value int

Genode Child_policy ref_pd

Reference PD session pure virtual method


Return value
Pd_session &

The PD session returned by this method is used for session cap-quota and RAM-quota
transfers.


Genode Child_policy session_md_ram

RAM allocator used as backing store for _session_md_alloc virtual method

Return value
Ram_allocator &

Genode Child_policy yield_response

Respond to the release of resources by the child virtual method

This method is called when the child confirms the release of resources in response to a
yield request.
Genode Child_policy resource_request

Take action on additional resource needs by the child virtual method

Argument

- Resource_args const &

Genode Child_policy init

Initialize the child’s CPU session virtual method


Arguments

- Cpu_session &

- Capability<Cpu_session>

The function may install an exception signal handler or assign CPU quota to the child.
Genode Child_policy init

Initialize the child’s PD session virtual method


Arguments

- Pd_session &

- Capability<Pd_session>

The function must define the child’s reference account and transfer the child’s initial
RAM and capability quotas. It may also install a region-map fault handler for the
child’s address space (Pd_session::address_space).


Genode Child_policy server_id_space

ID space for sessions provided by the child virtual method

Exception

Nonexistent_id_space

Return value
Id_space<Parent::Server> &

Genode Child_policy session_state_changed

Notification hook invoked each time a session state is modified virtual method

Genode Child_policy session_alloc_batch_size

Granularity of allocating the backing store for session meta data virtual const method

Return value
size_t

Session meta data is allocated from ref_pd. The first batch of session-state objects is
allocated at child-construction time.
Genode Child_policy initiate_env_sessions
virtual const method
Return value
bool True to create the environment sessions at child construction

By returning false, it is possible to create Child objects without routing of their envi-
ronment sessions at construction time. Once the routing information is available, the
child’s environment sessions must be manually initiated by calling Child::initiate_env_sessions().


Genode Child_policy _with_address_space


virtual method
Arguments

pd Pd_session &

fn With_address_space_fn const &

Genode Child_policy with_address_space

Call functor fn with the child’s address-space region map as ar- method
gument
Arguments

pd Pd_session &

fn auto const &

In the common case where the child’s PD is provided by core, the address space is ac-
cessed via the Region_map RPC interface. However, in cases where the child’s PD
session interface is locally implemented - as is the case for a debug monitor - the
address space must be accessed by component-local method calls instead.
Genode Child_policy start_initial_thread

Start initial thread of the child at instruction pointer ip virtual method

Arguments

thread Capability<Cpu_thread>

ip addr_t

Genode Child_policy forked


virtual const method
Return value
bool True if ELF loading should be inhibited


Genode Child
Implementation of the parent interface that supports resource trading class

Child

Child(. . .)
~Child()
active() : bool
initiate_env_pd_session()
initiate_env_sessions()
env_sessions_closed() : bool
env_ram_quota() : Ram_quota
env_cap_quota() : Cap_quota
close_all_sessions()
for_each_session(. . .)
effective_quota(. . .) : Ram_quota
effective_quota(. . .) : Cap_quota
pd_session_cap() : Pd_session_capability
parent_cap() : Parent_capability
ram() : Ram_allocator &
ram() : Ram_allocator const &
cpu() : Cpu_session &
pd() : Pd_session &
pd() : Pd_session const &
session_factory() : Session_state::Factory &
yield(. . .)
notify_resource_avail()
heartbeat()
skipped_heartbeats() : unsigned
announce(. . .)
session_sigh(. . .)
session(. . .) : Session_capability
session_cap(. . .) : Session_capability
upgrade(. . .) : Upgrade_result
close(. . .) : Close_result
exit(. . .)
session_response(. . .)
deliver_session_cap(. . .)
main_thread_cap() : Thread_capability
resource_avail_sigh(. . .)
resource_request(. . .)
yield_sigh(. . .)
yield_request() : Resource_args
yield_response()
heartbeat_sigh(. . .)
heartbeat_response()

Accessors
pd_session_cap Pd_session_capability
parent_cap Parent_capability
ram Ram_allocator const &
pd Pd_session const &
main_thread_cap Thread_capability

Header
repos/base/include/base/child.h

There are three possible cases of how a session can be provided to a child: The service
is implemented locally, the session was obtained by asking our parent, or the session is
provided by one of our children.
These types must be differentiated for the quota management when a child issues the
closing of a session or transfers quota via our parent interface.
If we close a session to a local service, we transfer the session quota from our own ac-
count to the client.
If we close a parent session, we receive the session quota on our own account and must
transfer this amount to the session-closing child.
If we close a session provided by a server child, we close the session at the server, trans-
fer the session quota from the server’s RAM session to our account, and subsequently
transfer the same amount from our account to the client.
Genode Child Child
constructor
Arguments

rm Region_map &
Local address space, usually env.rm()
entrypoint Rpc_entrypoint &
Entrypoint used to serve the parent interface of the child
policy Child_policy &
Policy for the child

Exception

Service_denied The initial sessions for the child’s environment could not be established

Genode Child ~Child


virtual destructor

On destruction of a child, we close all sessions of the child to other services.


Genode Child active
const method
Return value
bool True if the child has been started

After the child’s construction, the child is not always able to run immediately. In par-
ticular, a session of the child’s environment may still be pending. This method returns
true only if the child’s environment is completely initialized at the time of calling.
If all environment sessions are immediately available (as is the case for local services
or parent services), the return value is expected to be true. If this is not the case, one
of the child’s environment sessions could not be established, e. g., the ROM session of
the binary could not be obtained.


Genode Child initiate_env_pd_session

Initialize the child’s PD session method

Genode Child initiate_env_sessions

Trigger the routing and creation of the child’s environment ses- method
sion

See the description of Child_policy::initiate_env_sessions.


Genode Child env_sessions_closed
const method
Return value
bool True if the child is safe to be destroyed

The child must not be destroyed until all environment sessions are closed at the respec-
tive servers. Otherwise, the session state, which is kept as part of the child object, may
be gone before the close request reaches the server.
Genode Child env_ram_quota

Quota unconditionally consumed by the child’s environment class function

Return value
Ram_quota

Genode Child env_cap_quota


class function
Return value
Cap_quota

Genode Child close_all_sessions


method

Genode Child for_each_session


const method
Argument

fn auto const &


Genode Child effective_quota

Deduce env session costs from usable RAM quota class function

Argument

quota Ram_quota

Return value
Ram_quota

Genode Child effective_quota

Deduce env session costs from usable cap quota class function

Argument

quota Cap_quota

Return value
Cap_quota

Genode Child ram


method
Return value
Ram_allocator &

Genode Child cpu


method
Return value
Cpu_session &

Genode Child pd
method
Return value
Pd_session &

Genode Child session_factory

Request factory for creating session-state objects method

Return value
Session_state::Factory &


Genode Child yield

Instruct the child to yield resources method

Argument

args Resource_args const &

By calling this method, the child will be notified about the need to release the speci-
fied amount of resources. For more details about the protocol between a child and its
parent, refer to the description given in parent/parent.h.
Genode Child notify_resource_avail

Notify the child about newly available resources const method

Genode Child heartbeat

Notify the child to give a lifesign method

Genode Child skipped_heartbeats


const method
Return value
unsigned Number of missing heartbeats since the last response from the child

8.17.3. Composition of subsystems

Whereas the Child class aids the sandboxed execution of one child component, the
interplay of such a child with its environment must be implemented manually,
which can quickly become complex. This is where the sandbox API enters the picture.
It makes the full feature set of the init component available to C++ programs, including
the hosting of an arbitrary number of children, the routing of sessions among them,
the balancing of resources, and dynamic reconfiguration. In addition to the hosting of
components, the sandbox API also allows the program to interact with the sandboxed
children by providing locally implemented services.
A sandbox instance consumes configurations and produces reports, both in the XML
format known from the init component. An example for using the API can be found
at repos/os/src/test/sandbox/.
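
In its simplest form, the use of the sandbox API follows the pattern below. This is a
sketch that obtains the sandbox configuration from the component’s config ROM and
assumes that the nested <start> nodes follow the init syntax of Chapter 6:

  struct Main : Genode::Sandbox::State_handler
  {
      Genode::Env &_env;

      Genode::Attached_rom_dataspace _config { _env, "config" };

      Genode::Sandbox _sandbox { _env, *this };

      void handle_sandbox_state() override { /* evaluate state report */ }

      Main(Genode::Env &env) : _env(env)
      {
          _sandbox.apply_config(_config.xml());
      }
  };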


Genode Sandbox
class
Sandbox

Sandbox(. . .)
Sandbox(. . .)
apply_config(. . .)
generate_state_report(. . .)

Header
repos/os/include/sandbox/sandbox.h

Genode Sandbox Sandbox


constructor
Arguments

- Env &

- State_handler &

Genode Sandbox Sandbox

Constructor designated for monitoring PD-session operations constructor

Arguments

- Env &

- State_handler &

- Pd_intrinsics &

The Pd_intrinsics argument allows for the customization of the reference PD session
used for quota transfers between the sandboxed children and the local runtime.


Genode Sandbox apply_config


method
Argument

- Xml_node const &

Genode Sandbox generate_state_report

Generate state report as configured by the <report> config node const method

Argument

- Xml_generator &

Exception

Xml_generator::Buffer_exceeded


8.18. Utilities for user-level device drivers

To avoid common implementation bugs in device drivers, the framework facilitates the
declaration of hardware register and bit layouts in the form of C++ types. By subject-
ing the device driver’s interaction with the hardware to the C++ type system, the
consistency of register accesses with the hardware specifications can be maintained au-
tomatically. Such hardware specifications are declarative and can be cleanly separated
from the program logic of the driver. The actual driver program is relieved from any
intrinsics in the form of bit-masking operations.
The MMIO access utilities come in the form of two header files located at util/register.h
and util/mmio.h.

8.18.1. Register declarations

The class templates found in util/register.h provide a means to express register layouts
using C++ types. In a way, these templates make up for C++’s missing facility to define
accurate bitfields. The following example uses the Register class template to define a
register as well as a bitfield within the register:

struct Vaporizer : Register<16>
{
    struct Enable : Bitfield<2,1> { };
    struct State  : Bitfield<3,3> {
        enum { SOLID = 1, LIQUID = 2, GASSY = 3 };
    };

    static void write (access_t value);
    static access_t read ();
};

In the example, Vaporizer is a 16-bit register, which is expressed via the Register
template argument. The Register class template allows for accessing register content
at a finer granularity than the whole register width. To give a specific part of the register
a name, the Register::Bitfield class template is used. It describes a bit region within
the range of the compound register. Bit 2 corresponds to true if the device is enabled
and bits 3 to 5 encode the State. To access the actual register, the methods read() and
write() must be provided as back end; they perform the access of the whole register.
Once defined, the Vaporizer offers a handy way to access the individual parts of the
register, for example:


/* read the whole register content */
Vaporizer::access_t r = Vaporizer::read();

/* clear a bit field */
Vaporizer::Enable::clear(r);

/* read a bit field value */
unsigned old_state = Vaporizer::State::get(r);

/* assign new bit field value */
Vaporizer::State::set(r, Vaporizer::State::LIQUID);

/* write whole register */
Vaporizer::write(r);
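
How the back-end methods read() and write() are implemented depends on the device.
A minimal sketch, under the assumption that the register is memory-mapped at a fixed,
hypothetical address that is already accessible within the component:

enum { REG_ADDR = 0xf0001000 };   /* hypothetical register address */

void Vaporizer::write(access_t value)
{
   *(volatile access_t *)REG_ADDR = value;
}

Vaporizer::access_t Vaporizer::read()
{
   return *(volatile access_t *)REG_ADDR;
}

In practice, drivers rarely hand-craft such a back end but rely on the memory-mapped
I/O utilities presented in Section 8.18.2.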

Bitfields that span multiple registers The register interface of hardware devices
may be designed such that bitfields are not always consecutive in a single register. For
example, values of the HDMI configuration of the Exynos-5 SoC are scattered over
multiple hardware registers. The problem is best illustrated by the following example of a
hypothetical timer device. The bits of the clock count value are scattered across two
hardware registers, the lower 6 bits of the 16-bit-wide register 0x2, and two portions of
the 32-bit-wide register 0x4. A declaration of those registers would look like this:

struct Clock_2 : Register<0x2, 16>
{
   struct Value : Bitfield<0, 6> { };
};

struct Clock_1 : Register<0x4, 32>
{
   struct Value_2 : Bitfield<2, 13> { };
   struct Value_1 : Bitfield<18, 7> { };
};

Writing a clock value requires consecutive write accesses to both registers, with
bit-shift operations applied to the value:

write<Clock_1::Value_1>(clk);
write<Clock_1::Value_2>(clk >> 7);
write<Clock_2::Value>(clk >> 20);

The Bitset_2 and Bitset_3 class templates contained in util/register.h allow
the user to compose a logical bit field from two or three physical bit fields. The order of
the template arguments expresses the order of physical bits in the logical bit set. Each
argument can be a register, a bit field, or another bit set. The declaration of such a
composed bit set for the example above looks as follows:

struct Clock : Bitset_3<Clock_1::Value_1,
                        Clock_1::Value_2,
                        Clock_2::Value> { };

With this declaration in place, the driver code becomes as simple as:

write<Clock>(clk);

Under the hood, the framework performs all needed consecutive write operations on
the registers 0x2 and 0x4.
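
Reading works symmetrically. Assuming that bitsets can be read like regular bitfields,
which matches the write support shown above, obtaining the clock value boils down to
a single typed access:

/* gather the scattered bits of the clock value in one logical read */
Clock::access_t const clk = read<Clock>();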

8.18.2. Memory-mapped I/O

The utilities provided by util/mmio.h use the Register template class as a building
block to provide easy-to-use access to memory-mapped I/O registers. The Mmio class
represents a memory-mapped I/O region taking its local base address as constructor
argument. The following example illustrates its use:

class Timer : Mmio
{
   struct Value   : Register<0x0, 32> { };
   struct Control : Register<0x4, 8> {
      struct Enable : Bitfield<0,1> { };
      struct Irq    : Bitfield<3,1> { };
      struct Method : Bitfield<1,2>
      {
         enum { ONCE = 1, RELOAD = 2, CYCLE = 3 };
      };
   };

   public:

      Timer(addr_t base) : Mmio(base) { }

      void enable();
      void set_timeout(Value::access_t duration);
      bool irq_raised();
};


The memory-mapped timer device consists of two registers: the 32-bit Value register
and the 8-bit Control register. They are located at the MMIO offsets 0x0 and 0x4,
respectively. Some parts of the Control register have specific meanings as expressed
by the Bitfield definitions within the Control struct.
Using these declarations, accessing the individual bits becomes almost a verbatim
description of how the device is used. For example:

void enable()
{
   /* access an individual bitfield */
   write<Control::Enable>(true);
}

void set_timeout(Value::access_t duration)
{
   /* write complete content of a register */
   write<Value>(duration);

   /* write all bitfields as one transaction */
   write<Control>(Control::Enable::bits(1) |
                  Control::Method::bits(Control::Method::ONCE) |
                  Control::Irq::bits(0));
}

bool irq_raised()
{
   return read<Control::Irq>();
}
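
With these methods in place, client code of the driver could look as follows. This is a
sketch: obtaining the local address of the device memory, for instance via an IO_MEM
session, is outside the scope of this example.

/* 'base' is assumed to be the local address of the mapped device registers */
Timer timer(base);

timer.enable();
timer.set_timeout(50000);

while (!timer.irq_raised())
   ; /* busy-wait, for illustration only */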
