[RFC] Yield-WaitFor syscall #3577
I think we want something like this. This yield-for makes mixing async and sync code in userspace easier. Currently, it is fairly easy to write all async or all sync code in userspace, but when mixing them async callbacks can lead to unexpected call chains when using The decision to make is whether to go with this lightweight version, or something more integrated (or something else). Instead (or in addition to), we could add yield-for-blocking, which would remove the upcall entirely, and instead pass the upcall arguments to userspace via the return arguments of yield-for-blocking. Advantages of yield-for-blocking:
Disadvantages:
Having articulated this, I see the appeal of yield-for. I think Pat pointed out that the complexity in userspace code is fairly easy to hide in a library layer, so that driver code would look like the yield-for-blocking case. Perhaps the prudent move is to address the complexity of writing sync and async code in userspace with yield-for, and not try to introduce blocking semantics before we know the async version is insufficient. |
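To make the call-chain concern above concrete, here is a small host-side model, offered as a sketch rather than as Tock code. It assumes a simple array of pending upcalls; every name in it (`sim_schedule`, `sim_yield`, `sim_yield_for`, `sim_yield_wait_for`, the example callbacks) is hypothetical. It contrasts a userspace `yield_for` loop built on plain yield, which runs whatever callback is queued first, with a kernel-filtered wait that runs only the requested upcall.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical host-side model; none of these names are Tock APIs. */
typedef void (*upcall_fn)(int a0, int a1, int a2, void *userdata);

struct pending_upcall {
  int driver_num;
  int subscribe_num;
  upcall_fn fn;
  void *userdata;
};

#define QUEUE_CAP 8
static struct pending_upcall queue[QUEUE_CAP];
static size_t queue_len = 0;

/* Model of the kernel queueing an upcall for later delivery. */
bool sim_schedule(struct pending_upcall up) {
  if (queue_len == QUEUE_CAP) return false;
  queue[queue_len++] = up;
  return true;
}

/* Remove entry idx, keeping the order of the rest, then run it. */
static void deliver(size_t idx) {
  struct pending_upcall up = queue[idx];
  for (size_t i = idx + 1; i < queue_len; i++) queue[i - 1] = queue[i];
  queue_len--;
  if (up.fn != NULL) up.fn(0, 0, 0, up.userdata);
}

/* Plain yield: runs whichever upcall is oldest. A real yield would block
 * when nothing is pending; the model just returns false. */
bool sim_yield(void) {
  if (queue_len == 0) return false;
  deliver(0);
  return true;
}

/* Userspace yield_for: loops on plain yield until the flag is set, so
 * unrelated callbacks queued ahead of ours run in the middle of it. */
void sim_yield_for(bool *flag) {
  while (!*flag && sim_yield()) {}
}

/* Kernel-filtered wait: runs only the requested upcall; others stay queued. */
bool sim_yield_wait_for(int driver_num, int subscribe_num) {
  for (size_t i = 0; i < queue_len; i++) {
    if (queue[i].driver_num == driver_num &&
        queue[i].subscribe_num == subscribe_num) {
      deliver(i);
      return true;
    }
  }
  return false;
}

/* Example callbacks used to observe which upcalls ran. */
int timer_fired = 0;
bool sensor_done = false;

void timer_cb(int a0, int a1, int a2, void *userdata) {
  (void)a0; (void)a1; (void)a2; (void)userdata;
  timer_fired++;
}

void sensor_cb(int a0, int a1, int a2, void *userdata) {
  (void)a0; (void)a1; (void)a2;
  *(bool *)userdata = true;
}
```

With a timer upcall queued ahead of a sensor upcall, `sim_yield_for(&sensor_done)` runs `timer_cb` on the way to the sensor result, while `sim_yield_wait_for` for the sensor's driver and subscribe numbers leaves the timer upcall queued.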
|
A quick note,
There are probably other games that could be played to ease this, but it's all more complex than the lightweight change I think we should start with as-proposed here. |
But the arg1 arg2 arg3 are going back up (kernel to userspace) and driver num and subscribe num are going down (userspace to kernel). |
|
Oh, you were thinking some more library magic here? I guess that works so long as the lower layer does the remapping and the pointer writing. I don't think there is any way to pass all the pointers for arg1-arg3 in the syscall and have the kernel do the write, but that's not necessary.... I think the below would work?
// libtock.c
int yield_for_blocking(struct upcall_id upcall_id, int* arg1, int* arg2, int* arg3) {
  // Syscall arguments go down (userspace to kernel) in r0-r2.
  register uint32_t r0 __asm__ ("r0") = YIELD_FOR_BLOCKING_ID;
  register uint32_t r1 __asm__ ("r1") = upcall_id.drv_num;
  register uint32_t r2 __asm__ ("r2") = upcall_id.sub_num;
  // Results come back up (kernel to userspace) in r0-r3.
  register int retval __asm__ ("r0");
  register int rv1 __asm__ ("r1");
  register int rv2 __asm__ ("r2");
  register int rv3 __asm__ ("r3");
  __asm__ volatile (
    "svc 0"
    : "=r" (retval), "=r" (rv1), "=r" (rv2), "=r" (rv3)
    : "r" (r0), "r" (r1), "r" (r2)
    : "memory");
  // Only write through the pointers if upcall arguments were actually returned.
  if (retval & YIELD_FOR_ACTUALLY_YIELDED_FLAG_MASK) {
    *arg1 = rv1;
    *arg2 = rv2;
    *arg3 = rv3;
  }
  return retval;
}
|
Yes that is what I was imagining. |
|
The C wrapper could just take five arguments though instead of a struct: |
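As a rough illustration of a five-argument form, the sketch below passes the driver and subscribe numbers directly instead of wrapping them in a struct. It is an assumption-laden sketch, not the snippet originally posted: the `svc` trap is replaced by a hypothetical host-side stand-in (`sim_svc_yield_for_blocking`, `sim_next_result`), and the value of `YIELD_FOR_ACTUALLY_YIELDED_FLAG_MASK` is invented for the example.

```c
#include <stdint.h>

/* Hypothetical flag value and register snapshot; not taken from TRD 104. */
#define YIELD_FOR_ACTUALLY_YIELDED_FLAG_MASK 0x1

struct yield_regs {
  int r0, r1, r2, r3;
};

/* Stand-in for the `svc` trap so the sketch builds on a host machine.
 * It returns whatever the caller stored in sim_next_result. */
struct yield_regs sim_next_result;

static struct yield_regs sim_svc_yield_for_blocking(uint32_t drv_num,
                                                    uint32_t sub_num) {
  (void)drv_num;
  (void)sub_num;
  return sim_next_result;
}

/* Five-argument form. Under the Arm procedure call standard only the first
 * four word-sized arguments travel in r0-r3, so arg3 is passed on the stack
 * unless the call is inlined. */
int yield_for_blocking(uint32_t drv_num, uint32_t sub_num,
                       int *arg1, int *arg2, int *arg3) {
  struct yield_regs regs = sim_svc_yield_for_blocking(drv_num, sub_num);
  /* Only write through the pointers if upcall arguments were returned. */
  if (regs.r0 & YIELD_FOR_ACTUALLY_YIELDED_FLAG_MASK) {
    *arg1 = regs.r1;
    *arg2 = regs.r2;
    *arg3 = regs.r3;
  }
  return regs.r0;
}
```

The fifth argument is the reason the inlining question matters for code size: if the wrapper is not inlined, each call site has to spill `arg3` to the stack.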
|
True, though, since we're size-optimizing, if that function didn't get inlined everywhere, adding the fifth argument would require spilling to the stack [assuming |
This function seems like something that should get inlined everywhere, provided that it is being built with reasonable optimization settings, which should be the case for any project particularly concerned about size. So I would advocate for the five argument function definition, though I don't feel super strongly about it. If we used your approach, wouldn't we need to pass a pointer to upcall_id in order to not spill to the stack? That said, I think that the disadvantages Brad described are valid:
These are both disadvantages that do not exist with
|
Yes, that's what I meant by "[assuming I'm generally wary about relying on compiler optimizations for performance correctness, but I think we can shelve this side-hypothetical for now.
Why is that limitation in place? That's a pretty crippling limitation for a lot of applications: you can't have something like a periodic timer and an event-based sensor subscription live at the same time? That's not really an acceptable or realistic limitation for the long term. If the issue is concurrent execution concerns of callbacks, I'm not opposed to something of the spirit of |
|
One thing I had been ruminating on was whether there would be value in something closer to a In practice, I think the ergonomics around first |
I'm not sure it is? In my little dabble yesterday it seems like subscribe in libtock-rs is essentially the
It's really hard to discuss blocking command without something like this PR. What exactly is blocking command? Critically, how does a capsule give the blocking command its return data? Also, can someone write its It seems like the major disadvantage is going to be: when writing a capsule driver, after my interrupt fires, do I call Imagine we implement It seems like it might not be too big of a step to implement command-yield-for-blocking, which is a command immediately followed by a yield-for-blocking in a single system call (how exactly to do that tbd). I say this because that to me seems more plausible than re-writing all of our capsules to have both an async and a sync version (if in fact that is what is required for blocking_command). |
There is more context on the decision to not support independent tasks at tock/libtock-rs#334. The short version is that supporting that requires making a lot of important design decisions that I did not have the data to make at the time. I also want to re-evaluate whether futures are acceptable in |
|
That state of libtock-rs is much more sane. W.r.t. an eventual more async world, there's some more complex stuff in libtock-c apps, but really nothing that would demand more capability than what it sounds like libtock-rs already supports. I do think one of the good outcomes from TockWorld6 is a push for the core team to get more parity in libtock-c / libtock-rs examples, so if I'm wrong, I suspect we'll see soon :). |
doc/reference/trd104-syscalls.md
Outdated
_Note:_ In the case that the specified upcall is set to the Null Upcall, no upcall will execute, but Yield-WaitFor will still return. In this case, the contents of r0-r3 upon return are UNDEFINED and should not be relied on in any way.
OK, this is great except why not have r0-r3 be the values that would have been passed a non-null upcall? This would allow many cases to avoid callbacks entirely and just process the results directly, and doesn't cost anything (I don't think).
|
That is yield-for-blocking. Or a variant at least. I see what you are asking. But, given
The Tock kernel MUST NOT invoke the Null Upcall.
what would those be?
|
Okay, so, the reason I didn't do this is because I couldn't decide the right behavior for r3.
TRD104 starts by describing upcalls abstractly as something that have (up to) 4 return arguments passed in r0-r3. We have one type of upcall the kernel actually emits right now, namely the upcall produced in response to subscribe [n.b., nothing in this TRD forbids other upcall signatures in the future*]. For clarity, I will use SubscribableUpcall to refer to an upcall invoked on a pointer passed via Subscribe.
Much to my surprise, as far as I can tell, the signature of SubscribableUpcall is not expressly defined anywhere except in §5.2 of this TRD, where the C prototype for a subscribe_upcall is given as an example.
Indeed, we define the FunctionCall object passed to UKB with this note: "Struct that defines a upcall that can be passed to a process. The upcall takes four arguments that are Driver and upcall specific, so they are represented generically here."
In practice, the type signature for an upcall is established by the kernel in kernel::upcall::schedule() and an implicit assumption in the definition/creation of an Upcall object that upcalls come only from subscribes. But that's just an implementation artifact at the moment, not part of our ABI*.
Now, if YieldFor were to say something like 'sets r0-r3 to the Upcall Arguments if the current Upcall is the NullUpcall', then we would expect all Upcall Arguments to be set, right? Now, for the case of a SubscribableUpcall r0-r2 are clear, but what do we do with r3, which would be the userdata pointer? Does YieldFor need to do different things based on which type of Upcall it's passing?
We could further amend TRD104 to formally specify the function signature for all upcalls, and then define YieldFor to skip r3. Or define SubscribableUpcall and specify YieldFor behavior only for that upcall type.
I was shooting for minimum delta with first RFC, and I don't think we can do this "skip the callback" without also making more changes around Upcall definition. I lean to formalizing SubscribableUpcall as a no-overhead option for today (since there is only one Upcall type still) while leaving the door open to other Upcall signatures in the future.
I do think the "skip the callback" would be nice. I can update this RFC to capture a version of that.
*Subscribe does start like this:
The Subscribe system call class is how a userspace process registers upcalls with the kernel.
Which if we really read into the "a" there maybe suggests the 1:1 mapping, but I wouldn't blink an eye at a future section that said something like "The GetASignal system call is another way userspace processes register upcalls with the kernel". If there really is OneTrueUpcallSignature, this TRD needs to say that much more directly.
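The r3 question raised above can be made concrete with a short sketch. It assumes the upcall signature follows the `subscribe_upcall` prototype that TRD 104 gives as an example (three kernel-supplied values plus a userdata pointer); the struct and function names (`upcall_regs`, `yield_wait_for_result`, `upcall_to_result`, `invoke_upcall`, `record_cb`) are hypothetical and only illustrate the "define YieldFor to skip r3" option.

```c
#include <stddef.h>
#include <stdint.h>

/* Upcall signature modeled on the subscribe_upcall example in TRD 104:
 * three kernel-supplied values plus the userdata pointer. */
typedef void (subscribe_upcall)(int arg0, int arg1, int arg2, void *userdata);

/* Register image the kernel prepares when it invokes such an upcall. */
struct upcall_regs {
  uint32_t r0;
  uint32_t r1;
  uint32_t r2;
  void *r3_userdata;
};

/* Hypothetical "skip the callback" result: only the three kernel-supplied
 * values. The userdata pointer is dropped, since it originated in userspace
 * and has no meaning when no callback runs. */
struct yield_wait_for_result {
  uint32_t data0;
  uint32_t data1;
  uint32_t data2;
};

struct yield_wait_for_result upcall_to_result(struct upcall_regs regs) {
  struct yield_wait_for_result out = { regs.r0, regs.r1, regs.r2 };
  return out;
}

/* With a callback registered, the same image becomes the four call
 * arguments. A NULL function pointer models the Null Upcall: nothing runs. */
void invoke_upcall(subscribe_upcall *fn, struct upcall_regs regs) {
  if (fn != NULL) {
    fn((int)regs.r0, (int)regs.r1, (int)regs.r2, regs.r3_userdata);
  }
}

/* Example callback: stores the sum of its arguments through userdata. */
void record_cb(int arg0, int arg1, int arg2, void *userdata) {
  *(int *)userdata = arg0 + arg1 + arg2;
}
```

If other upcall signatures were ever added, a fixed three-field result like this would no longer cover every case, which is why the thread ties this option to formally defining the upcall signature.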
|
*Subscribe does start like this:
The Subscribe system call class is how a userspace process registers upcalls with the kernel.
Which if we really read into the "a" there maybe suggests the 1:1 mapping, but I wouldn't blink an eye at a future section that said something like "The GetASignal system call is another way userspace processes register upcalls with the kernel". If there really is OneTrueUpcallSignature, this TRD needs to say that much more directly.
But wouldn't we need to add that everywhere? A command syscall is THE ONLY way to issue a command. Yield is THE ONLY way to yield, etc. It seems clear to me that subscribe is the only way to subscribe, since that is all it does.
But yes we need to document the upcall signature.
|
I agree it's clear that "subscribe is the only way to subscribe".
What I do not think is clear is that "subscribe is the only way to get upcalls". Upcalls are defined completely independently of subscribe in the previous section. As written, subscribe is just one thing (and currently happens to be the only thing) that can generate an upcall.
|
I don't think we need to say anything about subscribe being the only way to set upcalls, any more than we need to say command is the only way to instruct the kernel to start an operation. Commands are documented elsewhere. But, we do need to define what an upcall is, in my opinion.
|
I'm strongly supportive of something very close to this. High bit
Some more details separated so they can be quoted separately
More on 1: it's worth looking through our various libtock-c examples and drivers and seeing where we would actually replace calls to the current library
More on 2: with some specialization in userspace as well as some sort of combined system-call system call ("please do a
(There are no!) Disadvantages
@bradjc points out some disadvantages. These are very well worth considering. I think an evolution of this proposal avoids these pitfalls.
Yes. As suggested later in the thread, either renaming or aliasing
Agree completely. There is a remaining question of what to do with |
Naming versions of the Yield variant
We've now described several variations of the proposed yield variant and I believe I've added to the confusion by conflating names for them above. Let's refer to them as follows:
|
|
What is the usecase for calling subscribe and using |
One simple example may be logging statistics on how often something happens. If the callback is able to return different values to the syscall site in r0-r3, it might be useful for filtering or customizing the result... Just general modularity. Indeed, I suspect this will rarely be used. |
|
I've updated this to match the There are some design choices not fully elucidated in the RFC yet, as I'd like us to decide what we want to do about #3577 (comment) before fleshing that text out completely. PoC code is updated (again, compile-tested only) to realize the proposed interface. |
Is |
doc/reference/trd104-syscalls.md
Outdated
| yield-param-2 | r2 |
| yield-param-3 | r3 |

The Yield system call class has no return value. This is because
This would no longer be true.
|
With So libtock tries to write:
int my_command_sync() {
  driver_command();
  asm("yield wait-for-callbackifpresent");
  // did an upcall execute?
  // or is r0-r3 valid?
  return ??
}
But then the user at some point called Is this possible or am I missing something? |
As currently written, this is correct, yes. It assumes that userspace tracks whether it has subscribed a callback or not correctly, and if userspace cares that a callback ran, the callback can set a flag. We could use |
I propose we pass all yield return arguments via a pointer specified by Otherwise, I don't know how we would use OR, I prefer |
It would be great if I was missing something, but it seems with Yield-WaitFor-CallbackIfPresent a userspace implementation of yield_for is unable to tell if the registers after the yield_for syscall returns are valid or not, without some other tracking state or program knowledge. Marking as blocked to resolve this, because that seems like a problematic interface we do not want.
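The ambiguity described above can be reproduced in a small host-side model. This is a sketch under assumptions, based on my reading of the thread: if a callback is subscribed it runs and the returned registers carry nothing useful, otherwise the upcall arguments come back in the registers. All names (`sim_yield_callback_if_present`, `sync_state`, `sync_cb`, `foreign_cb`) are hypothetical, and `-1` merely stands in for "undefined" register contents.

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*upcall_fn)(int a0, int a1, int a2, void *userdata);

/* Hypothetical model of one subscription slot. */
struct slot {
  upcall_fn fn;
  void *userdata;
};

struct regs {
  int r0, r1, r2, r3;
};

/* Model of the CallbackIfPresent variant: run the callback if one is
 * subscribed, otherwise hand the upcall arguments back in registers. */
struct regs sim_yield_callback_if_present(const struct slot *s,
                                          int a0, int a1, int a2) {
  struct regs out = { -1, -1, -1, -1 }; /* stands in for "undefined" */
  if (s->fn != NULL) {
    s->fn(a0, a1, a2, s->userdata);
  } else {
    out.r0 = a0;
    out.r1 = a1;
    out.r2 = a2;
  }
  return out;
}

/* State the library's own callback fills in so the wrapper knows it ran. */
struct sync_state {
  bool ran;
  int result;
};

void sync_cb(int a0, int a1, int a2, void *userdata) {
  (void)a1; (void)a2;
  struct sync_state *st = userdata;
  st->ran = true;
  st->result = a0;
}

/* A callback subscribed elsewhere in the program; it sets no flag. */
void foreign_cb(int a0, int a1, int a2, void *userdata) {
  (void)a0; (void)a1; (void)a2; (void)userdata;
}

/* Synchronous wrapper: trusts the flag if set, the registers otherwise. */
int my_command_sync(const struct slot *s, struct sync_state *st,
                    int event_value) {
  st->ran = false;
  struct regs r = sim_yield_callback_if_present(s, event_value, 0, 0);
  return st->ran ? st->result : r.r0;
}
```

The wrapper gets the right answer when no callback is subscribed or when its own flag-setting callback is subscribed. When some other part of the program has subscribed `foreign_cb`, the flag stays clear and the wrapper returns the placeholder register value, which is the failure mode the review comment is worried about.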
|
@tock/core-wg This is definitely P-Significant, and I marked it so, so we need another approval. Having said that, I think this is pretty good to go. |
I suggested adding a few comments, as it was not clear to me when reading the code that the registered callback will not be called. I had to read the TRD for this. I think this will make the code easier to understand.
Co-authored-by: Alexandru Radovici <[email protected]>
Co-authored-by: Pat Pannuto <[email protected]>
|
@alexandruradovici Comments adapted. I think all good. In general, I don't love the structure of the |
|
How did we merge a significant PR in 3 days with only 2 approvals? |
|
Did anyone else test this? I ran exactly one test on one board with one process. And nothing on riscv. |
|
We need 2 approvals for a P-Significant PR
Can we write this down somewhere? That is not my understanding nor what our documentation says
|
|
I guess this PR is a bit of a grey area because it was a draft for a year, but it wasn't clear how close that draft was to mergeable, and core members should have a week to look at significant PRs. After a week 2 approvals makes sense. |
|
Ok, sorry. My sense was that it had sat since your rebase and testing, I confirmed with similar testing, and there has been for a while either consensus or apathy about the design (I think I'm the most dissenting voice about the design specifically). And we'll do more testing for release and now that it's merged. We can/should clarify that language. Maybe your interpretation is right and I was wrong, maybe my interpretation was right, but either way it's not clear |
|
We should also update the draft version of this TRD in a follow-up PR.
|
Created a PR: #4032
I don't know whether requiring approvals from all core team members is feasible in practice, but we should definitely allow for more time from the point where reviews from the WG are requested, to when someone is able to hit merge. |
…hing` `Queue`'s `dequeue_specific` was renamed to `remove_first` in ab293c5 ("Rename dequeue_specific to remove_first"). While I don't disagree with the fact that `dequeue_specific` is an oxymoron and a bad name, arguably `remove_first` carries its own baggage. We're talking about collections where "first" has meaning, often used interchangeably with "head". Looking over the code in tock#3577 I was quite surprised with the complexity of this method's implementation in `RingBuffer`, for what I thought would just be a `dequeue` operation---removing the _first_ / _head_ element. This is an effort to clear up that confusion. This is not a hill worth dying on for me, so if anyone has strong feelings otherwise, feel free to close this.
It's also still an |
Pull Request Overview
Following the discussion at TockWorld6, this describes the proposed Yield-WaitFor and provides an (untested) rough implementation of how the kernel could easily implement it.
For ease of viewing, this draft PR edits TRD 104 directly so it can be seen as a diff. A final PR would follow the proper, full TRD process.
The primary motivation to move this functionality from a userspace yield_for into a specialized system call is to simplify correctness for userspace applications. Userspace upcall handlers do not have to worry about reentrancy if the kernel guarantees that exactly one and only one specific one of userspace's choosing will be called. It becomes an opt-in synchronous API for userspace without reducing the fundamental asynchronous design of Tock.
Testing Strategy
Compiling (and no more!).
TODO or Help Wanted
This is currently designed and architected as a minimal-impact change. In particular, if you want the actual status or return value from the upcall, you still need to have supplied a callback function. If (e.g., often in the case of prints) you don't care about the return success/failure, then you can just leave the default Null Upcall in place and this will do what you want.
The implementation is not intended as final. Rather, it's just trying to demonstrate how this can be a very lightweight change in the kernel (indeed, one should not write careful code while also trying to participate in a meeting 😅 ).
Documentation Updated
/docs, or no updates are required.
Formatting
make prepush.
Rendered