Make unbounded sends unbounded and preserve events order in network #178
Conversation
```
//! ┌───────────────────────┬────┬─────────────────────┐
//! │ size of whole frame   │ 32 │                     │
//! ├───────────────────────┼────┤                     │
//! │ flags                 │ 4  │                     │
```
Please add the new flag to this doc comment.
Anyway, the flag seems to be useless according to https://github.com/elfo-rs/elfo/pull/178/changes#r2645396209. It will be needed here eventually, but in the current implementation it's a bit confusing.
> but in the current implementation, it's a bit confusing

Why so? First of all, it's only logical for unbounded sends to be unbounded. Honestly, before diving deeper into elfo-network, I assumed that unbounded sends over the network were also unbounded, and realizing that they aren't brought more confusion 🤷. The current algorithm roughly matches what you observe locally: bounded sends also postpone succeeding unbounded sends.
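To make that local behavior concrete, here is a minimal std-channel model (not the elfo API; a bounded channel stands in for the mailbox). It shows that a bounded send blocked on a full mailbox postpones a later unbounded send purely by program order:

```rust
use std::sync::mpsc::{channel, sync_channel};
use std::thread;
use std::time::Duration;

fn main() {
    // A bounded "mailbox" with capacity 1 and an unbounded side channel.
    let (bounded_tx, bounded_rx) = sync_channel::<&str>(1);
    let (unbounded_tx, unbounded_rx) = channel::<&str>();

    bounded_tx.send("fill").unwrap(); // the mailbox is now full

    let sender = thread::spawn(move || {
        // Blocks until the receiver frees a slot...
        bounded_tx.send("bounded").unwrap();
        // ...so this unbounded send is postponed by program order,
        // even though the unbounded channel itself never blocks.
        unbounded_tx.send("unbounded").unwrap();
    });

    // Timing-based demonstration, not a proper test: after 100 ms the
    // sender thread is still stuck on the bounded send.
    thread::sleep(Duration::from_millis(100));
    assert!(unbounded_rx.try_recv().is_err());

    bounded_rx.recv().unwrap(); // free a slot; the blocked send completes
    sender.join().unwrap();
    assert_eq!(unbounded_rx.recv().unwrap(), "unbounded");
}
```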
```
    return None;
};

if let Err((token, envelope)) = flow.try_enqueue_response(token, envelope) {
```
I don't like the composition of all this code. Even calling `object.respond` in `make_envelope()` smells bad, but the enqueueing is much worse because of the other branch points later in the code (in `do_handle_message`). It makes things more error-prone: logic added to `do_handle_message` later will be forgotten here.
```
}

async fn push(&self, event: RxFlowEvent, routed: bool) -> bool {
    // Sadly we live in rust, `EbrGuard: !Send`, thus writing
```
`EbrGuard: !Send` is the main reason why fast and safe EBR is possible =)
The comment here is more about how much code we have to move around to avoid accidentally capturing `EbrGuard`, which would make the returned `Future` `!Send`. But it doesn't really matter, it's just a joke; I can remove it if you consider it inappropriate for the codebase.
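A toy illustration of the constraint, using only std (a raw pointer stands in for `EbrGuard`, since raw pointers are `!Send`): the guard must be dropped before any `.await`, otherwise the returned future stops being `Send`.

```rust
use std::future::Future;

// Stand-in for an EBR guard: raw pointers are `!Send`, so holding this
// across an `.await` would make the whole future `!Send`.
struct Guard(*const ());

impl Guard {
    fn new() -> Self {
        Guard(std::ptr::null())
    }

    fn load(&self) -> u32 {
        42
    }
}

// Compiles as `Send` only because the guard is dropped before the
// suspension point; hoisting `guard` out of the inner block breaks it.
fn push() -> impl Future<Output = u32> + Send {
    async {
        let value = {
            let guard = Guard::new();
            guard.load()
        }; // `guard` is dropped here, before the await
        std::future::ready(()).await;
        value
    }
}

fn main() {
    let _fut = push(); // just prove it type-checks as a `Send` future
}
```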
```
let guard = EbrGuard::new();
let object = ward!(self.ctx.book().get(self.actor_addr, &guard), return false);

object.unbounded_send(Addr::NULL, envelope)
```
Actually, I don't see the point of doing it here (and in `do_handle_message`). Any boundedly sent message before this one postpones this code anyway, so the difference between calling `unbounded_send` and `send` here seems negligible. I do understand why it would be helpful if we took the sender's address into account (in order to preserve the guarantees of `ctx.send(A); ctx.unbounded_send(B);` on the sender side).
The way `push` is written here is simply the way that unclogs the pusher queue fastest. The fact that any bounded send postpones succeeding unbounded sends is exactly what makes the original problem more visible, isn't it? It just increases latencies (i.e. the time a message stays in the pusher queue) for unbounded sends, which somewhat ruins the author's intended concurrency, since now every unbounded send takes someone's place in the mailbox.

IMO, it's reasonable to keep this distinction, since we buffer messages on the receiver side; that makes a clogged pusher queue more of a problem. This way, unbounded sends at least have a chance to be handled, in contrast with them staying in the pusher queue longer 🤷.
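To illustrate only the queueing effect (a toy FIFO model, not the actual pusher implementation): a bounded entry stuck at the head of the queue delays everything behind it, while unbounded entries never wait for mailbox capacity themselves.

```rust
use std::collections::VecDeque;

enum Delivery {
    Bounded,
    Unbounded,
}

struct Mailbox {
    queue: VecDeque<&'static str>,
    capacity: usize,
}

impl Mailbox {
    fn try_push(&mut self, msg: &'static str) -> bool {
        if self.queue.len() < self.capacity {
            self.queue.push_back(msg);
            true
        } else {
            false
        }
    }
}

// One "unclogging" pass over a FIFO pusher queue.
fn drain(pusher: &mut VecDeque<(Delivery, &'static str)>, mailbox: &mut Mailbox) {
    while let Some((delivery, msg)) = pusher.pop_front() {
        match delivery {
            // Unbounded deliveries ignore capacity and always make progress.
            Delivery::Unbounded => mailbox.queue.push_back(msg),
            Delivery::Bounded => {
                if !mailbox.try_push(msg) {
                    // The mailbox is full: put the entry back and stop.
                    // Everything behind it, including unbounded sends,
                    // keeps waiting: this is the latency argument above.
                    pusher.push_front((delivery, msg));
                    break;
                }
            }
        }
    }
}

fn main() {
    let mut mailbox = Mailbox { queue: VecDeque::new(), capacity: 0 };
    let mut pusher: VecDeque<_> =
        [(Delivery::Bounded, "first"), (Delivery::Unbounded, "second")].into();

    drain(&mut pusher, &mut mailbox);
    // The unbounded "second" is still stuck behind the bounded "first".
    assert_eq!(pusher.len(), 2);
    assert!(mailbox.queue.is_empty());
}
```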
```
#[derive(Debug)]
pub(crate) struct NetworkEnvelope {
    pub(crate) sender: NetworkAddr,
    pub(crate) bounded: bool,
```
Let's move it somewhere below, not between related fields.
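Something like this, perhaps (a stub sketch: `NetworkAddr` is stubbed so it stands alone, and the elided fields stay elided; only the position of `bounded` changes):

```rust
#[derive(Debug)]
struct NetworkAddr; // stub, just so this sketch compiles on its own

#[derive(Debug)]
struct NetworkEnvelope {
    sender: NetworkAddr,
    // ... the remaining related fields of the actual struct ...
    bounded: bool, // moved down, after the related fields
}

fn main() {
    let _ = NetworkEnvelope { sender: NetworkAddr, bounded: true };
}
```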
```
}

#[derive(Debug)]
pub(super) enum RxFlowEvent {
```
I would prefer to avoid using "Event", because events are messages (on a par with commands). The "Message" variant is also confusing, because "Response" is also a message. You mean regular messages and requests here, right?
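For instance, a hypothetical renaming along those lines (the name and doc comments are only a suggestion; variant payloads are omitted and are not the actual fields):

```rust
#[derive(Debug)]
enum RxItem {
    /// A regular message or a request.
    Regular,
    /// A response to a previously issued request.
    Response,
}
```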
```
    },
    routed,
));
self.acquire_direct(!routed);
```
I don't understand what's happening with flow control in these methods, or why it differs between them.
Why is it called here and not on the caller's side, like `acquire_routed`? Don't you think it's confusing to have several places responsible for calling `acquire_routed` and `acquire_direct`?
```
() = pruned.request_to(master, Ack).resolve().await.unwrap();
_ = pruned.unbounded_send_to(pruned.addr(), TheResponse);
}));
// ^^ Pusher([MasterFill, BeforeResponse, Respond(TheResponse)])
```
It's wrong: `TheResponse` never reaches the pusher's queue, only the mailbox.
Yeah, I just mixed up `Ack` and `TheResponse` here; it should actually be `Respond(Ack)` instead.
I think there is a problem with the suggested solution. Let's consider the following example:

```rust
// A
Start => {
    ctx.send_to(B, StartSpamming).await;
    ctx.request_to(C, FetchData).await
}

// B
StartSpamming => {
    loop {
        ctx.send_to(A, SpamMessage).await;
    }
}

// C
(FetchData, token) => {
    ctx.respond(token, Data);
}
```

Now, locally or remotely, it never freezes. With the suggested patch, it most likely will freeze if A and C are located on different nodes (presumably because A, while awaiting the request, never drains its mailbox, which B keeps full, so a response that now has to wait for mailbox capacity never arrives).
Currently elfo-network screws up the order of events when the destination mailbox is full and responses are involved: the response can arrive before the `First` message gets into the target's mailbox. While this is mostly unnoticeable and even correct (?; we probably need to specify explicitly which guarantees elfo makes), it can make various patterns hard or even impossible to implement.
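A sketch of the kind of pattern that breaks, in the same handler-arm style as the example above (hypothetical handler; the message names are illustrative and not from the codebase):

```rust
// A handler on one node talking to an actor on another node.
(SomeRequest, token) => {
    // If the remote mailbox is full, `First` waits in the pusher queue...
    ctx.send_to(remote_addr, First).await;
    // ...while the response could previously overtake it on the wire and
    // arrive before `First` reaches the target's mailbox.
    ctx.respond(token, Done);
}
```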
Additionally, this PR introduces the `UNBOUNDED` flag to `NetworkEnvelope` and makes use of it to send envelopes unboundedly; the previous implementation always sent boundedly. The change is backward and forward compatible.
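For illustration, one way such a flag could be encoded in the 4-bit flags field of the frame header (the bit position and helper names here are assumptions, not the actual wire format):

```rust
// Assumed bit position within the 4-bit flags field of the frame header.
const FLAG_UNBOUNDED: u8 = 1 << 0;

fn encode_flags(bounded: bool) -> u8 {
    if bounded { 0 } else { FLAG_UNBOUNDED }
}

fn is_unbounded(flags: u8) -> bool {
    flags & FLAG_UNBOUNDED != 0
}

fn main() {
    assert!(is_unbounded(encode_flags(false)));
    assert!(!is_unbounded(encode_flags(true)));
}
```

Encoding the default (bounded) as an unset bit is what would make such a change compatible in both directions: older nodes ignore the unknown bit, and frames from older nodes read as bounded.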