@@ -245,6 +245,120 @@ def __getattr__(self, method):
        return lambda *args, **kargs: self(method, *args, **kargs)
+class Worker(object):
+    """
+    Connect to a Switch as a worker: declare its service_name, nickname,
+    hostname, process id, stuff like that, as well as how many requests it
+    is willing to handle. If the Worker wants more work, it can
+    periodically request more work in this way.
+
+    A simple worker would say: I can accept one job, then request another
+    one.
+
+    To reduce latency, another worker could say: I can accept 100 requests.
+    After receiving 50 requests, it can say: hey, I can accept 50 more
+    requests, thus keeping the number of requests as high as possible on
+    the Switch.
+
+    (Would it make any sense to reuse the buffered channel for this?)
+
+    Each connection == a new zmq socket.
+    """
+    pass
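# --- Illustrative sketch, not part of this commit ---------------------------
# A rough idea of how the credit-based flow control described in the Worker
# docstring could look. Every name below (the `switch` connection object,
# announce(), request_more()) is an assumption, not zerorpc API.
import os
import socket


class CreditWorker(object):

    def __init__(self, switch, service_name, credits=100):
        self._switch = switch        # assumed: some connection to the Switch
        self._credits = credits      # how many requests we are willing to take
        self._handled = 0
        # declare ourselves: service name, nickname, hostname, pid, capacity
        self._switch.announce(
            service_name=service_name,
            nickname='worker-%d' % os.getpid(),
            hostname=socket.gethostname(),
            pid=os.getpid(),
            credits=credits,
        )

    def handle(self, request):
        self._handled += 1
        # once half of the announced capacity has been consumed, ask for more
        # work, so the Switch always keeps a healthy reserve of credits for us
        if self._handled >= self._credits // 2:
            self._switch.request_more(self._handled)
            self._handled = 0
# -----------------------------------------------------------------------------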
+
+
+class Switch(object):
+    """
+    Bind a port and wait for both clients and workers.
+
+    When a worker declares itself, it is added to the list of available
+    workers under its service's name.
+
+    So for each service name there is a set of workers (zmqid as the key,
+    plus all the meta infos).
+
+    Each worker has a counter of how many requests can be sent to it. As
+    soon as a request is forwarded, this value is decremented. When it
+    reaches 0, well, you simply don't talk to this worker anymore until the
+    value bumps again (or a new worker arrives to help).
+
+    When selecting which worker to forward a request to, simply take the
+    one with the biggest reserve of requests.
+
+    When a worker declares itself to the Switch, a heartbeat starts running.
+
+    If the heartbeat is lost, the worker is considered faulty (its request
+    counter is reset to 0, and its status is updated accordingly).
+
+    Workers stuck at 0 available requests for more than 7 days (by default)
+    are removed.
+
+    When a client connects, it declares which service it wants to talk to.
+    It also provides its name, hostname, pid... and other little intimate
+    details for debugging purposes.
+
+    A heartbeat is kept between the client and the switch. If the heartbeat
+    is lost, the switch cleans up its information about the client. For
+    debugging purposes, the switch keeps the details about dead clients for
+    an hour, but no more than 500 of them.
+
+    Thereafter, each time the client sends a request, the switch forwards
+    it to the next available worker for the service the client registered
+    for.
+
+    If a client sends a request without registering, it is then talking
+    directly to the Switch itself. The switch can expose whatever it wants.
+
+    One problem is that there are heartbeats between workers/clients and
+    the switch, but also a heartbeat between the client and the worker when
+    a channel is opened. So if heartbeats are stacked, you never know which
+    one was dropped.
+
+    The switch has to maintain switching tables for channels between
+    clients and workers. A simple LRU with a TTL per entry of 3 times the
+    heartbeat interval should be enough.
+
+    Technical note:
+        I feel like client/worker registration shouldn't be implemented as
+        a zerorpc context. On the worker side it wouldn't be too much of a
+        problem: after all, if you get disconnected, reconnect and
+        re-register. But on the client side it would be really annoying to
+        get a heartbeat exception and have to guess whether it comes from a
+        loss of connectivity with the switch or with the worker at the
+        other end, and in case of a switch loss, having to re-request a
+        context from the switch and make sure that every piece of code uses
+        the new context. The context would have to be encapsulated anyway.
+
+        Thinking about it, maybe it would be great in fact. But heartbeat
+        failures need to be redesigned to be handled in a better way. For
+        example a channel should be closed in case of a heartbeat error.
+        This part of the code needs to be re-designed. Then SwitchClient
+        would just be a little convenience wrapper. It could even be
+        possible to add an optional parameter when opening a context to ask
+        zerorpc to generate an 'auto reconnect' context, meaning that the
+        context would simply re-play the request whenever the connection is
+        dropped (that sounds really exciting in fact).
+    """
+    pass
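# --- Illustrative sketch, not part of this commit ---------------------------
# The per-service worker registry with the "pick the worker with the biggest
# reserve of requests" rule, plus a TTL'd LRU switching table (3 heartbeats
# per entry), as described in the Switch docstring above. Class and method
# names are assumptions, not zerorpc API.
import time
from collections import OrderedDict


class WorkerEntry(object):

    def __init__(self, zmqid, meta, credits):
        self.zmqid = zmqid        # identity of the worker's zmq socket
        self.meta = meta          # nickname, hostname, pid, ...
        self.credits = credits    # how many requests we may still forward


class ServiceRegistry(object):

    def __init__(self):
        self._services = {}       # service name -> {zmqid: WorkerEntry}

    def register(self, service_name, entry):
        self._services.setdefault(service_name, {})[entry.zmqid] = entry

    def pick_worker(self, service_name):
        workers = self._services.get(service_name, {})
        # only workers with credits left are eligible
        candidates = [w for w in workers.values() if w.credits > 0]
        if not candidates:
            return None
        # take the one with the biggest reserve of requests...
        best = max(candidates, key=lambda w: w.credits)
        best.credits -= 1         # ...and decrement as soon as we forward to it
        return best


class SwitchingTable(object):

    def __init__(self, heartbeat=5, max_entries=10000):
        self._ttl = 3 * heartbeat       # an entry expires after 3 heartbeats
        self._max = max_entries
        self._routes = OrderedDict()    # channel id -> (worker zmqid, last use)

    def set(self, channel_id, worker_zmqid):
        self._routes.pop(channel_id, None)
        self._routes[channel_id] = (worker_zmqid, time.time())
        if len(self._routes) > self._max:
            self._routes.popitem(last=False)    # drop the least recently used

    def get(self, channel_id):
        entry = self._routes.get(channel_id)
        if entry is None:
            return None
        worker_zmqid, last_used = entry
        if time.time() - last_used > self._ttl:
            del self._routes[channel_id]        # expired: forget the route
            return None
        self.set(channel_id, worker_zmqid)      # refresh recency and timestamp
        return worker_zmqid
# -----------------------------------------------------------------------------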
+
+
+class SwitchClient(object):
+    """
+    Connect to a Switch as a client, registering to talk to a given
+    service.
+
+    When connecting, it registers its little info and declares which
+    service it wants to talk to.
+
+    If the heartbeat with the switch drops, it re-registers again and
+    again.
+
+    While waiting for a successful re-registration, every request should
+    be blocked (so that everything like timeouts & heartbeats can still
+    kick in).
+    """
+    pass
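# --- Illustrative sketch, not part of this commit ---------------------------
# One way to block every request while the client re-registers with the
# Switch, using a gevent Event (zerorpc is gevent based). The `register`
# callable and the method names are assumptions, not zerorpc API.
import gevent
import gevent.event


class ReconnectingClient(object):

    def __init__(self, register):
        self._register = register     # assumed: re-registers us with the Switch
        self._registered = gevent.event.Event()
        self._registered.set()        # we start out registered

    def on_heartbeat_lost(self):
        # the heartbeat with the switch dropped: block new requests and keep
        # re-registering in the background until it works again
        self._registered.clear()
        gevent.spawn(self._reregister_forever)

    def _reregister_forever(self):
        while not self._registered.is_set():
            try:
                self._register()
                self._registered.set()
            except Exception:
                gevent.sleep(1)       # retry again and again

    def request(self, do_request):
        # every request waits here until re-registration succeeded, so the
        # usual timeout/heartbeat machinery can still kick in upstream
        self._registered.wait()
        return do_request()
# -----------------------------------------------------------------------------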
+
+
class Server(SocketBase, ServerBase):

    def __init__(self, methods=None, name=None, context=None, pool_size=None,