
Commit 7c8642b

committed
beefed up the doc a little bit
1 parent b2b2ed5 commit 7c8642b

1 file changed

+128
-80
lines changed

doc/protocol.md

Lines changed: 128 additions & 80 deletions
@@ -2,79 +2,94 @@
22

33
THIS DOCUMENT IS INCOMPLETE, WORK IN PROGRESS!
44

5-
This document attempt to define zerorpc's protocol. We will see how the protocol
6-
can be divided in different layers.
5+
This document attempts to define the ZeroRPC protocol.
76

8-
Note that many part of the protocol are either not elegant, could be more
9-
efficient, or just "curious". This is because zerorpc started as a little tool
10-
to solve a little problem. Before anybody knew, it was the core of all
11-
inter-services communications at dotCloud (<http://www.dotcloud.com>).
7+
## Introduction & History
128

13-
Keep in mind that ZeroRPC keeps evolving, as we use it to solve new problems
14-
everyday. There is no doubt that all discrepancy in the protocol will
15-
disappear (slowly) over time, but backward compatibly is a bitch ;)
9+
A short warning: we know that some parts of the protocol are not very elegant.
10+
Some things can certainly be optimized. You will certainly think that some
11+
parts are... weird. This is because ZeroRPC started as a simple tool to
12+
solve a simple problem. And progressively, it became the core of all
13+
inter-services communications at dotCloud (<http://www.dotcloud.com>),
14+
and was refined, improved, enriched, to satisfy the needs of the dotCloud
15+
platform.
16+
17+
Keep in mind that ZeroRPC keeps evolving: we add new features to solve
18+
the new problems that we meet every day.
19+
20+
Of course, we want to weed out the discrepancies and "weird" behaviors
21+
of the protocol; but we also need to keep some backward compatibility,
22+
and that's painful indeed.
1623

1724
> The python implementation of zerorpc act as a reference for the whole
1825
> protocol. New features and experiments are implemented and tested in this
1926
> version first. This is also this implementation that is powering dotCloud's
2027
> infrastructure.
2128
29+
## Layers
30+
2231
Before diving into any details, let's divide ZeroRPC's protocol in three
2332
different layers:
2433

25-
- wire (or transport) layer, a combination of ZMQ <http://www.zeromq.org/>
26-
and msgpack <http://msgpack.org/>.
27-
- event (or message) layer, the complex part handling heartbeat, multiplexing
28-
and the notion of events.
29-
- "RPC" layer, where you can find the notion of request/response for example.
34+
1. Wire (or transport) layer; a combination of ZMQ <http://www.zeromq.org/>
35+
and msgpack <http://msgpack.org/>.
36+
2. Event (or message) layer; this is probably the most complex part, since
37+
it handles heartbeat, multiplexing, and events.
38+
3. RPC layer; that's where you can find the notion of request and response.
3039

3140
## Wire layer
3241

3342
The wire layer is a combination of ZMQ and msgpack.
3443

35-
Here's the basics:
44+
The basics:
3645

37-
- A zerorpc Server can listen on many ZMQ socket as wished.
38-
- A zerorpc Client can connect on many zerorpc Server as needed, but should
39-
create a new ZMQ socket for each connection. It is assumed that every servers
40-
share the same API.
46+
- A ZeroRPC server can listen on as many ZMQ sockets as you like. Actually,
47+
a ZMQ socket can bind to multiple addresses. It can even *connect* to the
48+
clients (think about it as a worker connecting to a hub), but there are
49+
some limitations in that case (see below). ZeroRPC doesn't
50+
have to do anything specific for that: ZMQ handles it automatically.
51+
- A ZeroRPC client can connect to multiple ZeroRPC servers. However, it should
52+
create a new ZMQ socket for each connection.
4153

42-
Because zerorpc make use of heartbeat and other streaming-like features, all of
43-
that multiplexed on top of one ZMQ socket, we can't use any of the
44-
load-balancing features of ZMQ (ZMQ let choose between round-robin balancing or
45-
routing, but not both). It also mean that a zerorpc Client can't listen either,
46-
it need to connect to the server, not the opposite (but really, it shouldn't
47-
affect you too much).
54+
Since ZeroRPC implements heartbeat and streaming, it expects a kind of
55+
persistent, end-to-end, connection between the client and the server.
56+
It means that we cannot use the load-balancing features built into ZMQ.
57+
Otherwise, the various messages composing a single conversation could
58+
end up in different places.
4859

49-
In short: Each connection from one zerorpc Client require its own ZMQ socket.
60+
That's why there are limitations when the server connects to the client:
61+
if there are multiple servers connecting to the same client, bad things
62+
will happen.
5063

51-
> Note that the current implementation of zerorpc for python doesn't implement
52-
> its own load-balancing (yet), and still use one ZMQ socket for connecting to
64+
> Note that the current implementation of ZeroRPC for Python doesn't implement
65+
> its own load-balancing (yet), and still uses one ZMQ socket for connecting to
5366
> many servers. You can still use ZMQ load-balancing if you accept to disable
54-
> heartbeat and forget using any streaming responses.
67+
> heartbeat and don't use streamed responses.
5568
5669
Every event from the event layer will be serialized with msgpack.
5770

71+
5872
## Event layer
5973

60-
The event layer is the most complex of all three layers. (And it is where lie
61-
the majority of the code for the python implementation).
74+
The event layer is the most complex of all three layers. The majority of the
75+
code for the Python implementation deals with this layer.
76+
77+
This layer provides:
6278

63-
This layer provide:
79+
- basic events;
80+
- multiplexed channels, allowing concurrency.
6481

65-
- The basics events mechanics.
66-
- A multiplexed channels system, to allow concurrency.
6782

68-
### Event mechanics
83+
### Basic Events
6984

70-
An event is a tuple (or array in json), containing in the following order:
85+
An event is a tuple (or array in JSON), containing in the following order:
7186

72-
1. the event's header -> dictionary (or object in json)
87+
1. the event's header -> dictionary (or object in JSON)
7388
2. the event's names -> string
74-
3. the event's arguments -> Can be any valid json, but in practice, its often a
75-
tuple, even empty, for some odd backward compability reasons.
89+
3. the event's arguments -> any kind of value; but in practice, for backward
90+
compatibility, it is recommended that this is a tuple (an empty one is OK).
7691

77-
All events' header must contain an unique message id and the protocol version:
92+
All event headers must contain a unique message id and the protocol version:
7893

7994
{
8095
"message_id": "6ce9503a-bfb8-486a-ac79-e2ed225ace79",
@@ -84,100 +99,133 @@ All events' header must contain an unique message id and the protocol version:
8499
The message id should be unique for the lifetime of the connection between a
85100
client and a server.
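
As a rough illustration of the event shape described above, here is a minimal Python sketch. The helper name `make_event` is hypothetical, and the header key `"v"` for the protocol version is an assumption, since the header example is cut off in this diff. Serialization with msgpack is left out on purpose.

```python
import uuid

PROTOCOL_VERSION = 3  # this document only covers version 3


def make_event(name, args=(), version_key="v"):
    """Build an event as [header, name, args].

    `version_key` is an assumed key name for the protocol version.
    """
    header = {
        # UUID-formatted id, as recommended for backward compatibility
        "message_id": str(uuid.uuid4()),
        version_key: PROTOCOL_VERSION,
    }
    # args is kept as a sequence (possibly empty), as recommended above
    return [header, name, list(args)]


event = make_event("add", (1, 2))
```

The resulting list is what the wire layer would serialize.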
86101

87-
> It doest need to be an UUID, but again for some backward compatibility reason
88-
> at dotCloud, we are keeping it UUID style, even if we technically don't
89-
> generate an UUID for every new events anymore, but only a part of it to keep
90-
> it fast.
102+
> It doesn't need to be a UUID, but again, for backward compatibility reasons,
103+
> it is better if it follows the UUID format.
91104
92-
This document talk only about the version 3 of the protocol.
105+
This document covers only version 3 of the protocol.
93106

94-
> The python's implementation has lots of weird pieces of code to handle all
95-
> three versions of the protocol running at dotCloud.
107+
> The Python implementation has a lot of backward compatibility code, to handle
108+
> communication between all three versions of the protocol.
96109
97-
### Multiplexed channels
98110

99-
- Every new events open a new channel implicitly.
111+
### Multiplexed Channels
112+
113+
- Each new event opens a new channel implicitly.
100114
- The id of the new event will represent the channel id for the connection.
101-
- Every consecutive events on a channel will have the header field "reply_to"
115+
- Each consecutive event on a channel will have the header field "reply_to"
102116
set to the channel id:
103117

104118
{
105119
"message_id": "6ce9503a-bfb8-486a-ac79-e2ed225ace79",
106-
"reply_to": 6636fb60-2bca-4ecb-8cb4-bbaaf174e9e6
120+
"reply_to": "6636fb60-2bca-4ecb-8cb4-bbaaf174e9e6"
107121
}
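
The channel rules above could be sketched as follows. This is only an illustration: `channel_id_of` and `route` are hypothetical names, and events are assumed to be `[header, name, args]` lists.

```python
def channel_id_of(event):
    """Return the channel an event belongs to.

    The first event of a channel has no "reply_to", so its own
    message id becomes the channel id.
    """
    header = event[0]
    return header.get("reply_to", header["message_id"])


def route(channels, event):
    """Append the event to its channel, opening the channel if needed."""
    channels.setdefault(channel_id_of(event), []).append(event)
    return channels


channels = {}
route(channels, [{"message_id": "a1"}, "add", [1, 2]])
route(channels, [{"message_id": "b7", "reply_to": "a1"}, "OK", [3]])
```

After these two calls, both events sit in the single channel `"a1"`.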
108122

109123
#### Heartbeat
110124

111-
There is one active heartbeat for every active channel.
125+
Each part of a channel must send a heartbeat at regular intervals.
126+
127+
The default heartbeat interval is 5 seconds.
128+
129+
> Note that technically, the heartbeat could be sitting on the connection level
130+
> instead of the channel level; but again, backward compatibility requires
131+
> to run it per channel at this point.
112132
113-
> At some point it will be changed to a saner one heartbeat per connection only:
114-
> backward compatibility says hi again.
133+
The heartbeat is defined as follows:
115134

116135
- Event's name: '\_zpc\_hb'
117136
- Event's args: null
118137

119-
Default heartbeat frequency: 5 seconds.
138+
When no heartbeat event is received after 2 heartbeat intervals (so, 10s by default),
139+
we consider that the remote is lost.
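
A minimal sketch of the lost-remote rule, assuming one monitor object per channel. The class name `HeartbeatMonitor` and the injectable `clock` parameter are illustrative choices, not part of the Python implementation.

```python
import time

HEARTBEAT_INTERVAL = 5.0  # default interval, in seconds


class HeartbeatMonitor:
    """Tracks the last heartbeat seen on one channel."""

    def __init__(self, interval=HEARTBEAT_INTERVAL, clock=time.monotonic):
        self.interval = interval
        self.clock = clock
        self.last_seen = clock()

    def on_event(self, event):
        # Only heartbeat events refresh the timer in this sketch.
        if event[1] == "_zpc_hb":
            self.last_seen = self.clock()

    def remote_lost(self):
        # Lost once more than 2 heartbeat intervals have elapsed.
        return self.clock() - self.last_seen > 2 * self.interval
```

Passing a fake `clock` makes the timeout easy to exercise without sleeping.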
120140

121-
The remote part of a channel is considered lost when no heartbeat event is
122-
received after twice the heartbeat frequency (so 10s by default).
123-
124-
> The python implementation raise the exception LostRemote, and even
125-
> manage to cancel a long-running task on a LostRemote).
141+
> The Python implementation raises the LostRemote exception, and even
142+
> manages to cancel a long-running task on a LostRemote. FIXME what does that mean?
126143
127144
#### Buffering (or congestion control) on channels
128145

129-
Both sides have a buffer for incoming messages on a channel.
146+
Both sides have a buffer for incoming messages on a channel. A peer can
147+
send an advisory message to the other end of the channel, to inform it of the
148+
size of its local buffer. This is a hint for the remote, to tell it "send me
149+
more data!"
130150

131151
- Event's name: '\_zpc\_more'
132152
- Event's args: integer representing how many entries are available in the client's buffer.
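
Since this part of the protocol is still a work in progress, the following is only one possible reading of the `_zpc_more` hint: the sender treats the advertised number as a credit and never sends more than that many entries. The class `CreditedSender` and its methods are hypothetical.

```python
from collections import deque


class CreditedSender:
    """Sends queued items only while the remote advertised free slots."""

    def __init__(self):
        self.credit = 0          # free entries advertised by the remote
        self.pending = deque()   # items waiting to be sent

    def on_event(self, event):
        # Assumes the args field is the bare integer described above.
        if event[1] == "_zpc_more":
            self.credit = int(event[2])

    def queue(self, item):
        self.pending.append(item)

    def flush(self):
        """Return the items that may be sent now, consuming credit."""
        sent = []
        while self.credit > 0 and self.pending:
            sent.append(self.pending.popleft())
            self.credit -= 1
        return sent
```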
133153

134-
WIP
154+
FIXME WIP
135155

136-
## RPC layer
156+
## RPC Layer
137157

138-
WIP
158+
In the first version of ZeroRPC, this was the main (and only) layer.
159+
Three kinds of events can occur at this layer: request (=function call),
160+
response (=function return), error (=exception).
139161

140162
Request:
141163

142164
- Event's name: string with the name of the method to call.
143165
- Event's args: tuple of arguments for the method.
144166

167+
Note: keyword arguments are not supported, because some languages don't
168+
support them. If you absolutely want to call functions with keyword
169+
arguments, you can use a wrapper; e.g. expose a function like
170+
"call_with_kwargs(function_name, args, kwargs)", where args is a list,
171+
and kwargs a dict. It might be an interesting idea to add such a
172+
helper function into ZeroRPC default methods (see below for definitions
173+
of existing default methods).
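
The wrapper idea could look like the sketch below. It is not an existing ZeroRPC method; `make_call_with_kwargs` and the `methods` dictionary are assumptions made for the example.

```python
def make_call_with_kwargs(methods):
    """Build a wrapper exposing keyword arguments over positional-only RPC.

    `methods` maps exposed function names to callables.
    """
    def call_with_kwargs(function_name, args, kwargs):
        # args is a list, kwargs a dict, as suggested in the note above
        return methods[function_name](*args, **kwargs)
    return call_with_kwargs


def greet(name, punctuation="!"):
    return "hello " + name + punctuation


call_with_kwargs = make_call_with_kwargs({"greet": greet})
result = call_with_kwargs("greet", ["bob"], {"punctuation": "?"})
```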
174+
145175
Response:
146176

147177
- Event's name: string "OK"
148178
- Event's args: tuple containing the returned value
149179

150-
> Note that if the method return a tuple, it is still wrapped inside a tuple
151-
> to contain the returned tuple (backward compatibility man!).
180+
> Note that if the return value is a tuple, it is itself wrapped inside a
181+
> tuple - again, for backward compatibility reasons.
152182
153-
In case of any fatal errors (an exception or the method's name requested can't
154-
be found):
183+
FIXME - is [] equivalent to [null]?
184+
185+
If an error occurs (either at the transport level, or if an uncaught
186+
exception is raised), we use the ERR event.
155187

156188
- Event's name: string "ERR"
157189
- Event's args: tuple of 3 strings:
158-
- Name of the error (Exception's class, or a meanfull word for example).
159-
- Human representation of the error (prefer english please).
190+
- Name of the error (it should be the exception class name, or another
191+
meaningful keyword).
192+
- Human representation of the error (preferably in English).
160193
- If possible a pretty printed traceback of the call stack when the error occurred.
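
To make the request, response and error events concrete, here is a hedged sketch of a server-side dispatcher. `handle_request` and the `methods` dictionary are hypothetical, the protocol version field is left out of the reply header for brevity, and an unknown method name simply surfaces as a `KeyError` in this sketch.

```python
import traceback
import uuid


def handle_request(methods, request):
    """Turn a request event into an "OK" or "ERR" event."""
    header, name, args = request
    reply_header = {
        "message_id": str(uuid.uuid4()),
        "reply_to": header["message_id"],  # the request opened the channel
    }
    try:
        result = methods[name](*args)
    except Exception as exc:
        # Three strings: error name, human-readable message, traceback
        return [reply_header, "ERR",
                [type(exc).__name__, str(exc), traceback.format_exc()]]
    # The return value is always wrapped, even if it is itself a tuple
    return [reply_header, "OK", [result]]


methods = {"add": lambda a, b: a + b, "div": lambda a, b: a / b}
ok = handle_request(methods, [{"message_id": "m1"}, "add", [1, 2]])
err = handle_request(methods, [{"message_id": "m2"}, "div", [1, 0]])
```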
161194

162-
### Default methods
195+
> A future version of the protocol will probably add a structured version of the
196+
> traceback, allowing machine-to-machine stack walking and better cross-language
197+
> exception representation.
163198
164-
WIP
165199

166-
- \_zerorpc\_ping
167-
- \_zerorpc\_inspect
200+
### Default calls
168201

169-
### Request/Stream
202+
When exposing some code with ZeroRPC, a number of methods/functions are
203+
automatically added, to provide useful debugging and development tools.
170204

171-
Response:
205+
- \_zerorpc\_ping() just answers with a pong message.
206+
- \_zerorpc\_inspect() returns all the available calls, with their
207+
signature and documentation.
208+
209+
FIXME we should rather standardize about the basic introspection calls.
210+
211+
212+
### Streaming
213+
214+
At the protocol level, streaming is straightforward. When a server wants
215+
to stream some data, instead of sending a "OK" message, it sends a "STREAM"
216+
message. The client will know that it needs to keep waiting for more.
217+
At the end of the stream, a "STREAM_DONE" message is expected.
218+
219+
Formal definitions follow.
220+
221+
Messages part of a stream:
172222

173223
- Event's name: string "STREAM"
174224
- Event's args: tuple containing the streamed value
175225

176-
When the STREAM reach it's end:
226+
When the STREAM reaches its end:
177227

178228
- Event's name: string "STREAM\_DONE"
179229
- Event's args: null
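
The streaming exchange can be sketched with two generators, one per side. Headers are omitted here and events are reduced to `(name, args)` pairs, which is a simplification for the example; `stream_events` and `consume_stream` are hypothetical names.

```python
def stream_events(values):
    """Server side: one "STREAM" event per value, then "STREAM_DONE"."""
    for value in values:
        yield ("STREAM", [value])
    yield ("STREAM_DONE", None)


def consume_stream(events):
    """Client side: yield streamed values until "STREAM_DONE" arrives."""
    for name, args in events:
        if name == "STREAM_DONE":
            return
        if name == "STREAM":
            yield args[0]


received = list(consume_stream(stream_events(range(3))))
```

Using generators on both ends mirrors the idea of representing a stream as an iterator.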
180230

181-
> The python's implementation represent a stream by an iterator on both sides.
182-
183-
-- END OF DOCUMENT --
231+
> The Python implementation represents a stream by an iterator on both sides.
