Minimal CHERI support #1469

heshamelmatary · 2025-05-29T16:32:09Z

This is another attempt at addressing various discussions (with @lsf37, @Indanz, and @kent-mcleod) on the RFC and previous PRs. It's a stripped-down version with the following:

Only hybrid kernel
Enables running unmodified userspace binaries, C, Rust, and hybrid and/or purecap CHERI, all side-by-side simultaneously.
Cuts the LoC changes and commits to less than half (compared to [RFC-15] Add experimental CHERI support (hybrid kernel) #1344) in order to ease the review and upstreaming process.
Only targets standard CHERI-RISC-V [1]. Further CHERI platforms/archs will be added later if this PR is accepted.
No CHERI caps are passed in the IPC buffer and all of its types are kept as is.
No CHERI caps are passed during system calls at all.
Adds 3 new system calls for CHERI: two to read and write CHERI registers, and WriteCapMem to write a CHERI cap to a remote protection domain's memory. Only the kernel can construct valid CHERI capabilities, and only if passed valid TCB and VSpace seL4 caps.
This port has been tested with Microkit on Codasip's QEMU and hardware platforms (x730) [2].

Please note the current standard CHERI-RISC-V is undergoing ARC reviews before being ratified; some things may change. This is a draft PR to resurrect the discussions and to serve as a reference implementation for the RFC.

[1] http://github.com/riscv/riscv-cheri
[2] https://codasip.com/solutions/riscv-processor-safety-security/cheri/x730-risc-v-application-processor/

Indanz

Overall much better than the other PR. Main concern is the vptr_t change, that's probably better done explicitly like you did with rword_t, so we can see where it is actually needed.

Edit: Forgot to mention, but WriteCapMem should probably require a page cap and perhaps the caller's vspace cap too. The logic is that if you could map and write to the memory yourself, you're also allowed to write to it via this system call. Without something like this, WriteCapMem would grant access to memory a task is not supposed to have.
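
The rule suggested here ("if you could map and write to the memory yourself, you may write to it via this system call") can be sketched as a check made before the write. This is only a minimal sketch under stated assumptions, not the PR's code: `page_cap_t`, its fields, `CAP_SIZE_BYTES`, and `may_write_cap_mem` are hypothetical names standing in for seL4's frame capability and the kernel's decode path.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of a mapped page (frame) capability. */
typedef struct {
    uintptr_t mapped_vaddr; /* virtual address the page is mapped at */
    size_t    size;         /* page size in bytes, e.g. 4096 */
    bool      can_write;    /* caller holds write rights on the page */
} page_cap_t;

/* Assumed in-memory width of a CHERI capability (16 bytes on a 64-bit target). */
#define CAP_SIZE_BYTES 16u

/*
 * Allow the write only if the caller could have written the memory itself:
 * the page cap must be writable, and the capability-aligned destination
 * must lie entirely inside the page.
 */
static bool may_write_cap_mem(const page_cap_t *page, uintptr_t dest)
{
    if (!page->can_write || page->size < CAP_SIZE_BYTES) {
        return false;
    }
    if (dest % CAP_SIZE_BYTES != 0 || dest < page->mapped_vaddr) {
        return false;
    }
    /* Offset of the destination within the page; no underflow after the checks above. */
    uintptr_t offset = dest - page->mapped_vaddr;
    return offset <= page->size - CAP_SIZE_BYTES;
}
```

With a writable 4 KiB page mapped at 0x10000, a destination of 0x10ff0 is accepted, while 0x11000 (past the page) and 0x10008 (misaligned) are rejected.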

Indanz · 2025-05-29T20:23:53Z

config.cmake

+# Copyright 2024, Capabilities Limited
+# CHERI support contributed by Capabilities Limited was developed by Hesham Almatary


Please remove all of these except for the ones in newly added files, thanks.

If everyone who ever edits any file slightly added their copyright to it, things would explode and become very cumbersome to maintain, even for a small project like seL4. We have git history for these kinds of details.

Those copyright lines are only intended for new files and those with non-trivial changes, not for every edited file. Some were indeed mistakenly left in this PR from refactoring previous PRs (hybrid, purecap, Morello, +10 CHERI platforms, etc.), like this file, where there were more non-trivial changes. I've done another iteration for this PR, and now only 13 files have them.

Indanz · 2025-05-29T20:30:41Z

include/arch/riscv/arch/32/mode/hardware.h

+#define LOAD  lc
+#define STORE sc


No opinion yet on what's better to name differently, the full ones or the integer-only ones. But considering CHERI is the reason for this hassle, it might be clearer to swap them around. We'll see.

But while you're touching all these lines anyway, can't you get rid of, or shorten, the horrible LOAD_S and STORE_S?

LOAD/STORE are just intended to load GPRs+CHERI CSRs, depending on the underlying architecture. Integer loads/stores are just used in a couple of places when CHERI is enabled (e.g., to save sstatus), so I just followed the common case.

I'm fine to shorten the macros. Any suggestions? LD/ST, LOAD/STORE, LR/SR?
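
For readers unfamiliar with the pattern under discussion, the following is a minimal sketch of how a LOAD/STORE mnemonic macro ends up inside an assembly string via stringification. It rests on assumptions: `DEMO_HAVE_CHERI` is a hypothetical stand-in for CONFIG_HAVE_CHERI, the non-CHERI RV32 mnemonics are assumed to be lw/sw, and `save_t0_asm` and its operand text are purely illustrative.

```c
/* Two-level stringification so the macro argument is expanded before '#'. */
#define STRINGIFY_(x) #x
#define STRINGIFY(x)  STRINGIFY_(x)

/* DEMO_HAVE_CHERI is a hypothetical stand-in for CONFIG_HAVE_CHERI. */
#ifdef DEMO_HAVE_CHERI
#define LOAD  lc   /* capability-width load */
#define STORE sc   /* capability-width store */
#else
#define LOAD  lw   /* assumed RV32 integer load */
#define STORE sw   /* assumed RV32 integer store */
#endif

/* The string forms that get pasted into inline assembly. */
#define LOAD_S  STRINGIFY(LOAD)
#define STORE_S STRINGIFY(STORE)

/* Returns the text a register-save sequence would contain for one register. */
static const char *save_t0_asm(void)
{
    /* Adjacent string literals are concatenated by the compiler. */
    return STORE_S " t0, 0(sp)\n";
}
```

A shorter macro name only changes the spelling at each use site; the expansion mechanism stays the same.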

Indanz · 2025-05-29T20:34:29Z

include/arch/riscv/arch/32/mode/object/structures.bf

+#if defined(CONFIG_HAVE_CHERI)
+    field     FSR               12
+#else
    field     FSR               5
    padding                     7
+#endif


Any downsides to having FSR always 12 bits? Or are the new bits at the bottom?

Happy to make it always 12 in the future, but I've been trying to guard CHERI-specific changes when I can and hide it from verification in this PR when building the kernel with CHERI disabled.

Fair enough, and I agree with that, that's the right attitude. But I was curious if this can be consolidated without downsides, to know what our options are. Then it could be done to reduce the difference and to avoid future pain. Verification changes for details like this should be very small.
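
The question of where the new bits land can be made concrete with a small sketch. It assumes the bitfield generator lays fields out from the most significant bit downwards, so that "field FSR 5" followed by "padding 7" puts FSR in the top 5 bits of a 12-bit span; `SPAN_LSB` and the function names are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical bit position of the lowest bit of the 12-bit span. */
#define SPAN_LSB 20u

/* Layout A: "field FSR 5; padding 7": FSR occupies the top 5 bits of the span. */
static uint32_t pack_fsr5(uint32_t fsr)
{
    return (fsr & 0x1fu) << (SPAN_LSB + 7u);
}

static uint32_t unpack_fsr5(uint32_t word)
{
    return (word >> (SPAN_LSB + 7u)) & 0x1fu;
}

/* Layout B: "field FSR 12": FSR occupies the whole span. */
static uint32_t pack_fsr12(uint32_t fsr)
{
    return (fsr & 0xfffu) << SPAN_LSB;
}

static uint32_t unpack_fsr12(uint32_t word)
{
    return (word >> SPAN_LSB) & 0xfffu;
}
```

Under that assumption both layouts cover the same 12 bits, so neighbouring fields do not move, but the same FSR value is stored at a different shift in each layout. Code that only goes through generated accessors would not notice the difference.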

Indanz · 2025-05-29T20:57:15Z

include/arch/riscv/arch/types.h


+#if defined(CONFIG_HAVE_CHERI)
+typedef __uintcap_t rword_t;
+typedef __uintcap_t vptr_t;


Most vptr_t instances don't need to be CHERI pointers I think, as the kernel itself is still using normal pointers.

But I guess page tables still need to contain valid CHERI pointers even when disabled for kernel mode?

True for vptr_t but it didn't hurt. The cases where vptr_t is needed to be a capability are only for:

User's entry point for the root task

User's IPC buffer pointer for the root task

User's BootInfo pointer for the root task

We have two options for the above:

Retype vptr_t to be __uintcap_t as I did in this PR. __uintcap_t is a type that's suggested to be used when the value could be either a capability pointer or an integer pointer, or just a normal integer.

Change the existing types of the above 3 cases to rword_t (or even better, void *__capability, as they'll always need to hold capability pointers).

But I guess page tables still need to contain valid CHERI pointers even when disabled for kernel mode?
For the kernel, yes. But not sure I understand your question and how it's related to vptr_t?

I'd vote for rword_t (or whatever the final name will be).

Using void* for pointers to a different address space than the kernel's is absolutely wrong.

Using void* for pointers (or pointers types, in general, instead of integer types for pointers like pptr_t/vptr_t etc) is IMO the right approach and better coding practice. It doesn't matter if it's a different address space or not here, CHERI protection isn't expected in the kernel nor all of the user-level security implications we discussed before; the kernel is trusted to never get hacked nor mis-use them nor de-reference, exactly the same as its current usage of vptr_t. But that's another discussion and I understand this will break verification if we try to change all current pointer types that use integer types (word_t) to use actual C pointer types (type *), as I did in the purecap kernel.

I'd use void *__user (that you recommended before) for any capability pointer held in the kernel AND is going to be exported to the user, but NEVER will be de-referenced by the kernel. This currently includes the above 3 cases I mentioned, and for the new system calls that construct pointer capabilities on behalf of the user.

Anyway, I tried to change the above 3 use cases to use rword_t and/or void *__user, but it's a bit disruptive, as those are passed down to other functions as well, so I'd have to make more unnecessary changes touching more files and code, which I've been trying to avoid.

include/basic_types.h

src/object/tcb.c

Indanz · 2025-05-29T22:05:43Z

src/cheri/cheri.c

+exception_t handle_SysCheriWriteRegister(cap_t tcb_cap, word_t *ipc_buffer)
+{
+    cap_t vRootCap;
+    void *__capability constructed_cap;


Better to use rword_t instead of user space pointers in kernel space...

But I'll postpone detailed review for later, this is clearly a quick proof of concept.

The rule of thumb I follow (and which is suggested by the CHERI programming guide [1]) is to use rword_t/uintcap_t for things that may contain either capabilities or integers, and void *__capability for things that will always need to be capability pointers. Also, all of the builtin CHERI macros used expect void *__capability, so using rword_t would incur more unnecessary casting.

[1] https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-947.pdf

They are talking about pointers that are valid within the same address space, not cross address space pointers you are dealing with here!

And you made wrapper functions anyway, they can do any casting necessary. Now you have all these senseless casts to void* by the callers, that's stupid.
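
The alternative being argued for here, where the wrapper takes an integer-like register type and performs any cast internally, might look like the sketch below. It is an illustration under assumptions, not code from the PR: `DEMO_CHERI`, `DEMO_CAP`, and `demo_cap_get_address` are hypothetical names, and the non-CHERI branch exists only so the sketch builds with an ordinary toolchain.

```c
#include <stdint.h>

/* Detect a CHERI toolchain without breaking compilers that lack __has_feature. */
#if defined(__has_feature)
#  if __has_feature(capabilities)
#    define DEMO_CHERI 1
#  endif
#endif

#ifdef DEMO_CHERI
typedef __uintcap_t rword_t;      /* capability-width register word */
#define DEMO_CAP __capability
#else
typedef uintptr_t rword_t;        /* plain integer when CHERI is off */
#define DEMO_CAP
#endif

/*
 * Hypothetical wrapper: callers pass an rword_t and never see a pointer type.
 * The single cast to a capability pointer lives inside the wrapper.
 */
static inline uint64_t demo_cap_get_address(rword_t reg)
{
#ifdef DEMO_CHERI
    return __builtin_cheri_address_get((void *DEMO_CAP)reg);
#else
    return (uint64_t)reg;
#endif
}
```

Callers then write `demo_cap_get_address(regs[i])` with no cast at the call site.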

src/kernel/boot.c

include/api/debug.h

src/object/tcb.c

midnightveil · 2025-05-30T07:56:25Z

include/arch/riscv/arch/machine.h

+    asm volatile(
+#if defined(CONFIG_HAVE_CHERI)
+        "modesw.cap             \n"
+        ".option push           \n"
+        ".option capmode        \n"
+#endif
+        "csrr  %0, " SSCRATCH  "\n"
+#if defined(CONFIG_HAVE_CHERI)
+        ".option pop            \n"
+        "modesw.int             \n"
+#endif
+        : "="ASM_REG_CONSTR(temp));


You've abstracted the inline assembly with these SSCRATCH and ASM_REG_CONSTR, and with the reg() macros too.

But then all the assembly gets ifdef'd anyway for the mode switches and various things.

Why not just use the appropriate names directly? You're already duplicating most of the assembly, and it would remove the indirection through the macros.

(Maybe other people disagree here, but I don't really see the point. If you didn't have the ifdef cheri in all the places you use assembly then they'd serve a point but it's already different everywhere)

Less code duplication; you have > 32 LOADs/STOREs in assembly. Not all assembly is ifdef'd, most of it isn't actually.

Ease of future maintenance and less breaking. e.g., when in the future someone needs to change this save/restore assembly, they won't have to keep maintaining two separate ifdef blocks; one for CHERI, and one for non-CHERI, especially if they don't know much about CHERI.

But both come at the price of much less readable assembly, so I agree with @midnightveil. And hiding it behind macros makes people unaware they are modifying multiple versions at once, so chances are higher they accidentally break something for CHERI. In addition to normal RISC-V assembly, they also need to know CHERI-specific macros. So the burden is much greater with this mess than with a bit of very straightforward code duplication. And I say that as someone who in general hates code duplication.

heshamelmatary · 2025-05-30T11:31:01Z

@Indanz

Overall much better than the other PR. Main concern is the vptr_t change, that's probably better done explicitly like you did with rword_t, so we can see where it is actually needed.

Great to know! vptr_t can change, no issue, it's just an implementation discussion. But generally speaking, it'd be great if you can comment whether this addresses the higher-level design concerns/blockers you had before on the RFC.

Edit: Forgot to mention, but WriteCapMem should probably require a page cap and perhaps the caller's vspace cap too. The logic is that if you could map and write to the memory yourself, you're also allowed to write to it via this system call. Without something like this, WriteCapMem would grant access to memory a task is not supposed to have.

Agreed, I think I also suggested to do that; passing a page capability for this system call before as well. I'll experiment with it.

heshamelmatary · 2025-05-30T15:55:46Z

On Fri, 30 May 2025 at 11:15, Indan Zupancic wrote:

In include/arch/riscv/arch/types.h:

+#if defined(CONFIG_HAVE_CHERI)
+typedef __uintcap_t rword_t;
+typedef __uintcap_t vptr_t;

I'd vote for rword_t (or whatever the final name will be).

Using void* for pointers to a different address space than the kernel's is absolutely wrong.

This is no different from how the kernel treats vptr_t at the moment: it only holds them, and never de-references them. CHERI protection doesn’t even apply to the seL4 kernel in this hybrid setup, so there is nothing to worry about there. void *__capability (or let’s make it void *__user as you suggested in another comment) is merely a type for holding user capability pointers here, for the user to use, never for the kernel. I’d argue it’s even better practice and better for readability: if it’s a user capability pointer, use a proper C pointer type.


heshamelmatary · 2025-06-05T11:44:35Z

@Indanz

Edit: Forgot to mention, but WriteCapMem should probably require a page cap and perhaps the caller's vspace cap too. The logic is that if you could map and write to the memory yourself, you're also allowed to write to it via this system call. Without something like this, WriteCapMem would grant access to memory a task is not supposed to have.

This is now implemented: the system call takes an extra page cap for the page the capability is written to.

heshamelmatary · 2025-06-05T11:50:01Z

The preprocess failure looks trivial. It'd be interesting to run the verification tests and see if it fails at all.

This change adds a new rword_t type for variables that may hold CHERI capabilities at any point in the kernel.

rword_t: register word type, newly added to *always* hold capability-width variables in CHERI mode (e.g., for hybrid/purecap user-space register context).
- In non-CHERI mode, this is just an integer and corresponds to unsigned long (word_t).
- In any CHERI mode, this is a capability-width type and corresponds to CHERI's __uintcap_t.

word_t: used by seL4 as a type that can hold anything (e.g., unsigned long), and is the most widely used type for mixed use cases such as pointers, integers, registers, etc. It's left as it is, to represent only "integers".

vptr_t: conventionally only holds user pointers. In order to support both hybrid and purecap CHERI userspace, any user pointers held in vptr_t need to be capabilities, hence this type is changed to __uintcap_t when in CHERI mode.

Signed-off-by: Hesham Almatary <[email protected]>

HW registers have another format and size when CHERI is enabled. This needs to be able to hold full CHERI HW registers all the time when CHERI is enabled. Signed-off-by: Hesham Almatary <[email protected]>

This commit adds core files to manipulate CHERI capabilities and does a basic port of architecture-independent code to build and run the kernel in hybrid mode. It also adds shared system support in shared CMake and C files for supported CHERI architectures.

In hybrid mode, any pointer created by the kernel and passed to the user needs to be a CHERI capability for purecap users; this includes the IPC buffer, hence it needs to be manually annotated as a CHERI capability (with the __capability keyword).

A __user annotation is added and is defined as __capability. It is suggested that __user be used for any pointer capability that holds a user address, for better readability and practice. The kernel never de-references those. This is similar to the current kernel's usage of vptr_t, where it holds integer pointers to the user but never de-references them.

When building with a CHERI toolchain in CHERI mode, the compiler defines __has_feature(capabilities). The "__has_feature" macro does not exist in some old GCC toolchains, so a macro is defined to 0 here just for userspace backward compatibility (e.g., when compiling an seL4 user program with an old non-CHERI toolchain on a CHERI-enabled seL4 kernel).

Signed-off-by: Hesham Almatary <[email protected]>
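
The fallback and annotation described in this commit message can be sketched roughly as follows. This is a minimal sketch under assumptions rather than the PR's header: the empty non-CHERI definition of `__user`, the `unsigned long` fallback for `rword_t`, and the `demo_context_t` struct are all illustrative choices.

```c
/* Old non-CHERI toolchains may lack __has_feature; make it evaluate to 0. */
#ifndef __has_feature
#define __has_feature(x) 0
#endif

#if __has_feature(capabilities)
#define __user __capability       /* user pointers are CHERI capabilities */
typedef __uintcap_t rword_t;      /* capability-width register word */
#else
#define __user                    /* assumed: annotation vanishes without CHERI */
typedef unsigned long rword_t;    /* same as word_t when CHERI is off */
#endif

typedef unsigned long word_t;     /* integers only, unchanged by CHERI */

/*
 * Hypothetical per-thread context: the kernel stores these values on behalf
 * of the user but never dereferences ipc_buffer itself.
 */
typedef struct {
    void *__user ipc_buffer;      /* user address held, not used, by the kernel */
    rword_t regs[32];             /* register file, capability-width under CHERI */
} demo_context_t;
```

On a non-CHERI build the struct has the same layout as one using plain pointers and word_t; on a CHERI build both members widen to capability width.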

This is an architectural port of the standard CHERI-RISC-V [1], from RISC-V International. The kernel is hybrid and minimal, but it enables running the following userspace:

1- Unmodified binaries
2- Unmodified C or other projects (e.g., Rust)
3- Hybrid CHERI (e.g., code that uses void *__capability)
4- Purecap CHERI, for complete spatial memory safety

This port has been tested with Microkit on Codasip's QEMU and hardware platforms (x730) [2].

[1] https://github.com/riscv/riscv-cheri
[2] https://codasip.com/solutions/riscv-processor-safety-security/cheri

Signed-off-by: Hesham Almatary <[email protected]>

This commit adds 3 new system calls if CHERI is enabled:

1- CheriWriteRegister: To construct/write a TCB's CHERI HW register from decomposed CHERI capability fields passed from the user as integer arguments.
2- CheriReadRegister: To read a TCB's CHERI HW register and return it to the user as decomposed integer fields representing CHERI capability fields.
3- CheriWriteMemoryCap: To construct/write a CHERI capability to a TCB/VSpace (e.g., Microkit protection domain) from decomposed CHERI capability fields passed from the user as integer arguments.

Rules:
- Only the kernel can construct valid CHERI caps.
- No tagged CHERI caps are passed via syscall args, IPC buffer, or syscall ret.
- Valid tagged CHERI caps are constructed only in the following conditions:
  1- If the user passes *BOTH* valid TCB and VSpace seL4 caps to these system calls. This authorises the caller to construct a new tagged CHERI cap from the kernel's RootCheriCap.
  2- For CheriWriteRegister, if the src/dest register index is tagged *and* unsealed, and no valid VSpace cap is provided. The kernel will try to derive a CHERI cap off the destination CHERI HW reg. The requested CHERI cap must not violate CHERI rules or increase permissions or bounds of the destination CHERI register, otherwise an untagged cap will be written.

CHERI-aware root task and servers (e.g., Microkit's monitor) must use these system calls if CHERI is enabled in cases like:
- Creating a new thread and writing its entry point, stack, DDC, etc.
- Passing code, data, rodata CHERI caps to a newly created thread.
- Setting up a new thread's stack that may contain valid CHERI caps. For example, a POSIX server setting up argv[], auxv[], etc.
- Setting up and writing ELF symbols that contain valid pointers. For instance, Microkit's memory regions (that have map setvar), Microkit protection domains' IPC buffer pointer address, etc.

For more details and design discussions, see [1].

[1] http://github.com/seL4/rfcs/pull/21

Signed-off-by: Hesham Almatary <[email protected]>
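
The second construction rule above (derivation must not increase bounds or permissions, otherwise the result is untagged) can be modelled in plain C. This is a software model for illustration only: on real hardware the CHERI instructions enforce this, and `demo_cap_t`, its fields, and `demo_derive` are hypothetical names that do not appear in the PR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of a decomposed CHERI capability. */
typedef struct {
    uint64_t base;    /* inclusive lower bound */
    uint64_t top;     /* exclusive upper bound */
    uint32_t perms;   /* permission bit mask */
    bool     sealed;  /* sealed caps cannot be used for derivation */
    bool     tag;     /* validity tag */
} demo_cap_t;

/*
 * Derive a new capability from 'parent' using integer fields supplied by
 * the caller. The result is tagged only if the parent is tagged and
 * unsealed and the request stays within the parent's bounds and permissions.
 */
static demo_cap_t demo_derive(demo_cap_t parent, uint64_t base,
                              uint64_t top, uint32_t perms)
{
    demo_cap_t out = { base, top, perms, false, false };
    out.tag = parent.tag && !parent.sealed
              && base <= top
              && base >= parent.base && top <= parent.top
              && (perms & ~parent.perms) == 0;
    return out;
}
```

A request that narrows the bounds and drops a permission yields a tagged result; widening either one, or deriving from a sealed register, yields an untagged one.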

Indanz reviewed May 29, 2025

midnightveil reviewed May 30, 2025

heshamelmatary force-pushed the std-cheri-riscv branch from 419ce19 to e335350 Compare June 5, 2025 11:27

heshamelmatary force-pushed the std-cheri-riscv branch from dd2d680 to 3874a67 Compare June 8, 2025 14:25

heshamelmatary added 3 commits June 11, 2025 15:03

[cheri][registers] retype HW reg variables

07f4652

heshamelmatary force-pushed the std-cheri-riscv branch 2 times, most recently from 893c635 to 7e097df Compare June 11, 2025 16:07

heshamelmatary added 2 commits June 13, 2025 11:16

heshamelmatary force-pushed the std-cheri-riscv branch from 7e097df to 11acb06 Compare June 13, 2025 11:17
