|
| 1 | +# ZeroRPC Protocol |
| 2 | + |
| 3 | +THIS DOCUMENT IS INCOMPLETE, WORK IN PROGRESS! |
| 4 | + |
| 5 | +This document attempts to define zerorpc's protocol. We will see how the protocol |
| 6 | +can be divided into different layers. |
| 7 | + |
| 8 | +Note that many parts of the protocol are either not elegant, could be more |
| 9 | +efficient, or just "curious". This is because zerorpc started as a little tool |
| 10 | +to solve a little problem. Before anybody knew it, it was the core of all |
| 11 | +inter-service communication at dotCloud (<http://www.dotcloud.com>). |
| 12 | + |
| 13 | +Keep in mind that ZeroRPC keeps evolving, as we use it to solve new problems |
| 14 | +every day. There is no doubt that all discrepancies in the protocol will |
| 15 | +disappear (slowly) over time, but backward compatibility is a bitch ;) |
| 16 | + |
| 17 | +> The python implementation of zerorpc acts as a reference for the whole |
| 18 | +> protocol. New features and experiments are implemented and tested in this |
| 19 | +> version first. It is also the implementation that powers dotCloud's |
| 20 | +> infrastructure. |
| 21 | +
|
| 22 | +Before diving into any details, let's divide ZeroRPC's protocol into three |
| 23 | +different layers: |
| 24 | + |
| 25 | + - wire (or transport) layer, a combination of ZMQ <http://www.zeromq.org/> |
| 26 | + and msgpack <http://msgpack.org/>. |
| 27 | + - event (or message) layer, the complex part handling heartbeat, multiplexing |
| 28 | + and the notion of events. |
| 29 | + - "RPC" layer, where you can find the notion of request/response for example. |
| 30 | + |
| 31 | +## Wire layer |
| 32 | + |
| 33 | +The wire layer is a combination of ZMQ and msgpack. |
| 34 | + |
| 35 | +Here are the basics: |
| 36 | + |
| 37 | + - A zerorpc Server can listen on as many ZMQ sockets as desired. |
| 38 | + - A zerorpc Client can connect to as many zerorpc Servers as needed, but should |
| 39 | + create a new ZMQ socket for each connection. It is assumed that all servers |
| 40 | + share the same API. |
| 41 | + |
| 42 | +Because zerorpc uses heartbeats and other streaming-like features, all |
| 43 | +multiplexed on top of one ZMQ socket, we can't use any of the |
| 44 | +load-balancing features of ZMQ (ZMQ lets you choose between round-robin balancing |
| 45 | +or routing, but not both). It also means that a zerorpc Client can't listen: |
| 46 | +it needs to connect to the server, not the opposite (but really, it shouldn't |
| 47 | +affect you too much). |
| 48 | + |
| 49 | +In short: each connection from one zerorpc Client requires its own ZMQ socket. |
| 50 | + |
| 51 | +> Note that the current implementation of zerorpc for python doesn't implement |
| 52 | +> its own load-balancing (yet), and still uses one ZMQ socket for connecting to |
| 53 | +> many servers. You can still use ZMQ load-balancing if you are willing to disable |
| 54 | +> heartbeats and give up streaming responses. |
| 55 | +
|
| 56 | +Every event from the event layer will be serialized with msgpack. |
| 57 | + |
| 58 | +## Event layer |
| 59 | + |
| 60 | +The event layer is the most complex of all three layers. (It is also where |
| 61 | +most of the code of the python implementation lives.) |
| 62 | + |
| 63 | +This layer provides: |
| 64 | + |
| 65 | + - The basic event mechanics. |
| 66 | + - A multiplexed channel system, to allow concurrency. |
| 67 | + |
| 68 | +### Event mechanics |
| 69 | + |
| 70 | +An event is a tuple (or array in json), containing in the following order: |
| 71 | + |
| 72 | + 1. the event's header -> dictionary (or object in json) |
| 73 | + 2. the event's name -> string |
| 74 | + 3. the event's arguments -> can be any valid json, but in practice it's often a |
| 75 | + tuple, even an empty one, for some odd backward compatibility reasons. |
| 76 | + |
| 77 | +Every event's header must contain a unique message id and the protocol version: |
| 78 | + |
| 79 | + { |
| 80 | + "message_id": "6ce9503a-bfb8-486a-ac79-e2ed225ace79", |
| 81 | + "v": 3 |
| 82 | + } |
| 83 | + |
| 84 | +The message id should be unique for the lifetime of the connection between a |
| 85 | +client and a server. |
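As a rough illustration of the structure described above, the sketch below builds a version 3 event as a three-element list. The helper name `make_event` is hypothetical, and JSON stands in for msgpack only so that the example runs with the standard library; on the wire, events are serialized with msgpack.

```python
import json
import uuid


def make_event(name, args):
    """Build an event as [header, name, args] (hypothetical helper)."""
    # A full UUID is used here for simplicity; any id that stays unique
    # for the lifetime of the connection would do.
    header = {"message_id": str(uuid.uuid4()), "v": 3}
    return [header, name, list(args)]


event = make_event("add", (1, 2))

# JSON is used only for readability; the real wire format is msgpack.
encoded = json.dumps(event)
header, name, args = json.loads(encoded)
```

After the round trip, `header`, `name` and `args` hold the three parts of the event in the order listed above.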
| 86 | + |
| 87 | +> It doesn't need to be a UUID, but again for some backward compatibility reason |
| 88 | +> at dotCloud, we are keeping it UUID style, even if we technically don't |
| 89 | +> generate a UUID for every new event anymore, but only a part of it to keep |
| 90 | +> it fast. |
| 91 | +
|
| 92 | +This document only covers version 3 of the protocol. |
| 93 | + |
| 94 | +> The python implementation has lots of weird pieces of code to handle all |
| 95 | +> three versions of the protocol running at dotCloud. |
| 96 | +
|
| 97 | +### Multiplexed channels |
| 98 | + |
| 99 | + - Every new event opens a new channel implicitly. |
| 100 | + - The id of the new event will represent the channel id for the connection. |
| 101 | + - Every subsequent event on a channel will have the header field "reply_to" |
| 102 | + set to the channel id: |
| 103 | + |
| 104 | + { |
| 105 | + "message_id": "6ce9503a-bfb8-486a-ac79-e2ed225ace79", |
| 106 | + "reply_to": "6636fb60-2bca-4ecb-8cb4-bbaaf174e9e6" |
| 107 | + } |
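The channel rules above can be sketched as a small router that groups events by channel id. The class `ChannelMultiplexer` and its behaviour for unknown channels are assumptions made for illustration, not a description of the python implementation.

```python
class ChannelMultiplexer:
    """Groups incoming events per channel (illustrative only)."""

    def __init__(self):
        self.channels = {}  # channel id -> list of events seen on it

    def dispatch(self, event):
        header, name, args = event
        reply_to = header.get("reply_to")
        if reply_to is None:
            # No "reply_to": this event implicitly opens a new channel,
            # and its own message id becomes the channel id.
            channel_id = header["message_id"]
            self.channels[channel_id] = [event]
            return channel_id
        if reply_to not in self.channels:
            # Assumption: events for unknown channels are simply ignored.
            return None
        self.channels[reply_to].append(event)
        return reply_to


mux = ChannelMultiplexer()
first = [{"message_id": "a1", "v": 3}, "add", [1, 2]]
reply = [{"message_id": "b7", "v": 3, "reply_to": "a1"}, "OK", [3]]
opened = mux.dispatch(first)
routed = mux.dispatch(reply)
```

Both calls return the same channel id, since the second event carries the first event's id in "reply_to".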
| 108 | + |
| 109 | +#### Heartbeat |
| 110 | + |
| 111 | +There is one active heartbeat for every active channel. |
| 112 | + |
| 113 | +> At some point it will be changed to a saner single heartbeat per connection: |
| 114 | +> backward compatibility says hi again. |
| 115 | +
|
| 116 | + - Event's name: '\_zpc\_hb' |
| 117 | + - Event's args: null |
| 118 | + |
| 119 | +Default heartbeat interval: 5 seconds. |
| 120 | + |
| 121 | +The remote part of a channel is considered lost when no heartbeat event is |
| 122 | +received for twice the heartbeat interval (so 10s by default). |
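A minimal sketch of the lost-remote rule, assuming one monitor object per channel. The class name `HeartbeatMonitor` and the injectable `clock` parameter are hypothetical and only there to make the timing logic easy to follow and test.

```python
import time

HEARTBEAT_PERIOD = 5.0  # seconds, the default stated above


class HeartbeatMonitor:
    """Tracks the last heartbeat seen on one channel (illustrative only)."""

    def __init__(self, period=HEARTBEAT_PERIOD, clock=time.monotonic):
        self.period = period
        self.clock = clock
        self.last_seen = clock()

    def on_event(self, event):
        header, name, args = event
        if name == "_zpc_hb":
            # Only heartbeat events refresh the timer in this sketch.
            self.last_seen = self.clock()

    def remote_lost(self):
        # Lost once no heartbeat arrived for twice the period (10s by default).
        return self.clock() - self.last_seen > 2 * self.period
```

A caller would feed every incoming event of the channel to `on_event` and poll `remote_lost()` periodically; what happens next (raising an error, cancelling a task) is left to the implementation.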
| 123 | + |
| 124 | +> The python implementation raises the exception LostRemote, and even |
| 125 | +> manages to cancel a long-running task on a LostRemote. |
| 126 | +
|
| 127 | +#### Buffering (or congestion control) on channels |
| 128 | + |
| 129 | +Both sides have a buffer for incoming messages on a channel. |
| 130 | + |
| 131 | + - Event's name: '\_zpc\_more' |
| 132 | + - Event's args: integer representing how many entries are available in the client's buffer. |
| 133 | + |
| 134 | +WIP |
| 135 | + |
| 136 | +## Pattern layer |
| 137 | + |
| 138 | +WIP |
| 139 | + |
| 140 | +Request: |
| 141 | + |
| 142 | + - Event's name: string with the name of the method to call. |
| 143 | + - Event's args: tuple of arguments for the method. |
| 144 | + |
| 145 | +Response: |
| 146 | + |
| 147 | + - Event's name: string "OK" |
| 148 | + - Event's args: tuple containing the returned value |
| 149 | + |
| 150 | +> Note that if the method returns a tuple, it is still wrapped inside a tuple |
| 151 | +> to contain the returned tuple (backward compatibility man!). |
| 152 | +
|
| 153 | +In case of any fatal error (an exception, or the requested method name can't |
| 154 | +be found): |
| 155 | + |
| 156 | + - Event's name: string "ERR" |
| 157 | + - Event's args: tuple of 3 strings: |
| 158 | + - Name of the error (Exception's class, or a meaningful word for example). |
| 159 | + - Human representation of the error (prefer English please). |
| 160 | + - If possible a pretty printed traceback of the call stack when the error occurred. |
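The request/response rules above could be handled on the server side roughly as follows. The function `handle_request`, the `methods` dictionary and the use of `NameError` for an unknown method are assumptions made for this sketch.

```python
import traceback
import uuid


def handle_request(methods, event):
    """Call the requested method and build the "OK" or "ERR" reply event."""
    header, name, args = event
    reply_header = {
        "message_id": str(uuid.uuid4()),
        "v": 3,
        "reply_to": header["message_id"],  # reply on the request's channel
    }
    try:
        if name not in methods:
            # Assumption: an unknown method is reported like any other error.
            raise NameError(name)
        result = methods[name](*args)
    except Exception as exc:
        # Three strings: error name, human-readable message, traceback.
        error = [type(exc).__name__, str(exc), traceback.format_exc()]
        return [reply_header, "ERR", error]
    # The returned value is always wrapped, even if it is itself a tuple.
    return [reply_header, "OK", [result]]


methods = {"add": lambda a, b: a + b}
ok = handle_request(methods, [{"message_id": "a1", "v": 3}, "add", [1, 2]])
missing = handle_request(methods, [{"message_id": "a2", "v": 3}, "nope", []])
```

Here `ok` is an "OK" event carrying the wrapped result, while `missing` is an "ERR" event whose arguments are the three strings described above.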
| 161 | + |
| 162 | +### Default methods |
| 163 | + |
| 164 | +WIP |
| 165 | + |
| 166 | + - \_zerorpc\_ping |
| 167 | + - \_zerorpc\_inspect |
| 168 | + |
| 169 | +### Request/Stream |
| 170 | + |
| 171 | +Response: |
| 172 | + |
| 173 | + - Event's name: string "STREAM" |
| 174 | + - Event's args: tuple containing the streamed value |
| 175 | + |
| 176 | +When the STREAM reaches its end: |
| 177 | + |
| 178 | + - Event's name: string "STREAM\_DONE" |
| 179 | + - Event's args: null |
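One way to picture a streamed response is a generator that turns any iterable into "STREAM" events followed by a final "STREAM_DONE". The function `stream_events` is a hypothetical helper, and the full UUID message ids are a simplification.

```python
import uuid


def stream_events(channel_id, values):
    """Yield one "STREAM" event per value, then a final "STREAM_DONE"."""

    def new_header():
        return {
            "message_id": str(uuid.uuid4()),
            "v": 3,
            "reply_to": channel_id,
        }

    for value in values:
        # Each streamed value is wrapped in a one-element tuple.
        yield [new_header(), "STREAM", [value]]
    # The end of the stream carries no arguments (null).
    yield [new_header(), "STREAM_DONE", None]


events = list(stream_events("a1", range(3)))
names = [name for _, name, _ in events]
```

The receiving side can rebuild an iterator by yielding the wrapped value of each "STREAM" event until "STREAM_DONE" arrives.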
| 180 | + |
| 181 | +> The python implementation represents a stream by an iterator on both sides. |
| 182 | +
|
| 183 | +-- END OF DOCUMENT -- |