Improving fjåge.js timeouts #363
0d790f4 to
997659a
Compare
|
Re timeouts: It is already quite difficult to keep track of which timeout is being used if I don't specify one myself. This PR is going to make this problem even worse. That's not really an objection to the specific changes here, just voicing out some bigger-picture concerns I have. Re metrics: The new metrics system touches many lines yet its inner workings and intended usage aren't documented very clearly. I'm afraid that over time this will result in quite a bit of code where no one quite understands what it does any more, and in particular the metrics feature itself may break soon if future maintainers (including ourselves) forget to maintain it properly. Maybe we should have a discussion around what problem this feature is trying to solve and what's the best way to achieve that goal. |
4312a90 to
9e45afd
Compare
|
I removed all the bits about metrics. Just the changes to timeouts now.
The idea here is to allow the user of fjage.js to specify the timeout instead of relying on magic numbers internally. If anything, I would expect this to make things clearer. The other option would be to change the API and force the user to define a timeout for each transaction. This would break everything though. :( |
9e45afd to
b270bb3
Compare
My concern is a scenario where we get some logs which indicate that a request timed out, and now we have to go figure out what timeout was used for this request. This would be straightforward if we used a single, global, immutable default timeout but can be tricky once you have gateway-level and AgentID-level timeout overrides that can be changed by anyone who has access to the respective objects. |
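One way to picture the traceability concern above is a resolver that returns not just the effective timeout but also the level that supplied it, so a log line can say where the value came from. This is only a minimal sketch: `resolveTimeout`, `describeTimeout`, the `GLOBAL_DEFAULT_TIMEOUT` value and the precedence order (per-call, then AgentID, then Gateway) are assumptions for illustration, not fjage.js code.

```javascript
// Hypothetical helper, not part of fjage.js: resolve the effective timeout
// and remember which level supplied it.
const GLOBAL_DEFAULT_TIMEOUT = 10000; // assumed fallback, in milliseconds

function resolveTimeout({ perCall, agentLevel, gatewayLevel } = {}) {
  // Assumed precedence: the most specific defined value wins.
  const candidates = [
    ["per-call argument", perCall],
    ["AgentID override", agentLevel],
    ["Gateway option", gatewayLevel],
  ];
  for (const [source, value] of candidates) {
    if (value !== undefined && value !== null) {
      return { timeout: value, source };
    }
  }
  return { timeout: GLOBAL_DEFAULT_TIMEOUT, source: "global default" };
}

// Build a log message that records both the value and its origin.
function describeTimeout(requestName, levels) {
  const { timeout, source } = resolveTimeout(levels);
  return `${requestName}: timed out after ${timeout} ms (from ${source})`;
}

console.log(describeTimeout("agentForService", { gatewayLevel: 8000 }));
console.log(describeTimeout("aid.get", { agentLevel: 2000, gatewayLevel: 8000 }));
```

With something like this, a timed-out request in the logs would carry its own answer to "which timeout was used", even when overrides exist at several levels.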
|
Another approach to solve the arbitrary multipliers on timeouts (which frankly I also don't like) could be this. We expand the Gateway constructor to have 2 options instead: We can set the defaults such that it works with the fjage tests. A user can set them appropriately for their setup. These are then used as the default timeout for
But each of the methods also gets a This however would break the external API for fjage.js. :( There are ways to make it backward compatible though. The But this part can be considered separately from the part above, since they're not really related. |
I actually prototyped this. A default of 1000ms for both seems enough for all the tests. I also tried it with https://github.com/org-arl/unetsockets and all of its tests also pass at 1000ms. |
|
Timeouts are anyway supposed to be a last-resort fix to prevent the system from deadlocking, no? So maybe the solution is to just use the max of all the individual timeouts everywhere?
I've switched to the latter approach in most of my code. Maybe not so much yet in the context of fjage, but definitely in contexts like when you have a multi-layer simulation code that depends on many parameters. It's a bit tedious to spell out at every layer all the parameters that are relevant for that layer, but it makes tracing the origin of parameter values much easier. Also, when you want to change a parameter somewhere, explicit parameters force you to go and propagate that change throughout the entire stack, which seems annoyingly tedious at first but is actually a good thing because it requires you to explicitly go through everything that could potentially have been broken by that change. My experience has been that ultimately that's the less frustrating process compared to making the change and praying really hard that everything else will just work. So the TL;DR is, I'm actually quite inclined towards passing through explicitly all the timeouts to the extent that that's needed (see my earlier point).
I'm assuming that's fully locally on your fairly performant laptop, though, right? There's no guarantee that by changing the default timeouts we won't break things all over the place once we push this to less performant and distributed systems, no? |
Good point. I was looking at reducing them, but all that does is reduce the latency of catching and dealing with errors. So instead we can set the default to something like 10000ms, and then we don't have to use any of these multipliers. I'll update the PR accordingly.
The issue here is always forgetting it in one place. But let's leave this change aside for now. We can revisit it later if we need it. |
b270bb3 to
f267b21
Compare
…tsForService, etc)
f267b21 to
1cc6e98
Compare
Yeah, to optimise for this particular purpose the timeout argument would have to be mandatory. But that's of course breaking and annoying, so I'm definitely not proposing this seriously. This whole discussion is just a tangent to this PR anyway, so I'm happy to leave this for now as you suggested. |
This pull request improves the timeout handling of various Gateway and AgentId methods in fjage.js.
Gateway Improvements
- Previously, new Gateway({timeout: 1000}) set the default timeout for all requests, including the fjage container-level queries like agentForService, agents, etc. This is still supported, but we now add the ability to set a timeout for each request individually using an argument to the agentForService, agentsForService, agents and containsAgent methods.
- gw.agents(10000) will set the timeout for that request to 10 seconds instead of the default 8 seconds. You can even set it to -1 to disable the timeout for that request, e.g. gw.agents(-1) will wait indefinitely for the response.
AgentID Improvements
- AgentId methods (aid.get(), aid.set(), aid.request(), etc.) now default to the Gateway's timeout. Previously, this was set to 1000ms for request and 5000ms for set and get. By copying the Gateway's timeout and scaling it according to the previous values, we keep the default behavior consistent.
- A setTimeout method is added to the AgentId class. This allows you to set a default timeout for all requests sent to that AgentId.
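The per-request timeout semantics described above, including -1 to wait indefinitely, can be pictured with a small promise wrapper. This is a rough sketch under stated assumptions, not the fjage.js implementation: `withTimeout` and `slowReply` are hypothetical names, and the actual library may handle expiry differently (for example by resolving with an empty result instead of rejecting).

```javascript
// Hypothetical sketch; names are illustrative, not fjage.js internals.

// Race a promise against a timer. A negative timeout disables the timer,
// so the caller waits for the promise indefinitely.
function withTimeout(promise, timeoutMs) {
  if (timeoutMs < 0) return promise;
  let timer;
  const expiry = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`timed out after ${timeoutMs} ms`)),
      timeoutMs
    );
  });
  // Clear the timer once either side settles so it does not linger.
  return Promise.race([promise, expiry]).finally(() => clearTimeout(timer));
}

// Stand-in for a response that takes 50 ms to arrive.
const slowReply = new Promise((resolve) => setTimeout(() => resolve("reply"), 50));

// A 10 ms timeout expires before the reply arrives, so this rejects.
withTimeout(slowReply, 10).catch((err) => console.log(err.message));

// A timeout of -1 waits for the reply however long it takes.
withTimeout(slowReply, -1).then((rsp) => console.log(rsp));
```

Treating any negative value as "no timeout" keeps the argument a plain number, which matches the gw.agents(-1) usage in the description.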