Conversation

@notthetup
Collaborator

@notthetup notthetup commented Jul 28, 2025

This pull request improves the timeout handling of various Gateway and AgentID methods in fjage.js.

Gateway Improvements

  • Previously, the Gateway constructor accepted an options argument, new Gateway({timeout: 1000}), which set the default timeout for all requests, including fjage container-level queries such as agentForService, agents, etc. This is still supported, but we now add the ability to set a timeout for each request individually through an argument to the agentForService, agentsForService, agents and containsAgent methods.
  • The default timeout for these queries remains the same: 8 times the Gateway's timeout, which itself defaults to 1000ms.
  • With the new argument, you can specify a different timeout for each request, e.g. gw.agents(10000) sets the timeout for that request to 10 seconds instead of the default 8 seconds. You can even set it to -1 to disable the timeout for that request, e.g. gw.agents(-1) waits indefinitely for the response.
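
As a rough illustration of the per-request behavior described above, here is a minimal, self-contained sketch. It is not the fjage.js implementation: effectiveTimeout and withTimeout are hypothetical helper names, and rejecting on expiry is an assumption made only for this example (how the real methods signal a timeout is not covered here).

```javascript
// Hypothetical helpers for illustration only; not the fjage.js internals.
const DEFAULT_GATEWAY_TIMEOUT = 1000;  // ms, Gateway default per the description above
const CONTAINER_QUERY_MULTIPLIER = 8;  // container queries default to 8x the Gateway timeout

// Pick the timeout for a container-level query.
// `requested` is the optional per-request argument; undefined means "use the default".
function effectiveTimeout(requested, gatewayTimeout = DEFAULT_GATEWAY_TIMEOUT) {
  if (requested === undefined) return CONTAINER_QUERY_MULTIPLIER * gatewayTimeout;
  return requested;
}

// Race a promise against a timer. A negative timeout disables the timer,
// so the caller waits indefinitely. Rejecting on expiry is an assumption.
function withTimeout(promise, timeout) {
  if (timeout < 0) return promise;
  let timer;
  const expiry = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${timeout} ms`)), timeout);
  });
  return Promise.race([promise, expiry]).finally(() => clearTimeout(timer));
}

// Mirrors gw.agents(), gw.agents(10000) and gw.agents(-1) from the description.
console.log(effectiveTimeout());       // 8000
console.log(effectiveTimeout(10000));  // 10000
console.log(effectiveTimeout(-1));     // -1
```
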

AgentID Improvements

  • Requests sent to an Agent using the AgentID methods (aid.get(), aid.set(), aid.request(), etc.) now derive their default timeout from the Gateway's timeout. Previously, this was hard-coded to 1000ms for request and 5000ms for set and get. By copying the Gateway's timeout and multiplying it to match the previous values, we keep the default behavior consistent.
  • We now add the ability to change the default timeout for any requests sent to a specific Agent by adding a setTimeout method to the AgentID class. This allows you to set a default timeout for all requests sent to that AgentID.
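
The AgentID defaults above can be sketched roughly as follows. AgentIDSketch is a hypothetical stand-in that models only the timeout bookkeeping, not the real AgentID class. The 1x and 5x multipliers are inferred from the previous 1000ms and 5000ms values, and whether the 5x multiplier still applies after setTimeout is an assumption of this sketch.

```javascript
// Hypothetical stand-in for AgentID; only the timeout bookkeeping is modeled.
class AgentIDSketch {
  constructor(name, gatewayTimeout = 1000) {
    this.name = name;
    this.timeout = gatewayTimeout; // base timeout copied from the Gateway
  }

  // Change the default timeout for every later request to this agent.
  setTimeout(timeout) {
    this.timeout = timeout;
  }

  // Timeout used for request(): the base timeout unless the caller passes one.
  requestTimeout(timeout) {
    return timeout === undefined ? this.timeout : timeout;
  }

  // Timeout used for get()/set(): 5x the base (assumed), unless overridden.
  parameterTimeout(timeout) {
    return timeout === undefined ? 5 * this.timeout : timeout;
  }
}

const aid = new AgentIDSketch('slow-agent');
console.log(aid.requestTimeout());    // 1000
console.log(aid.parameterTimeout());  // 5000
aid.setTimeout(3000);                 // this agent is known to be slow
console.log(aid.requestTimeout());    // 3000
```
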

@notthetup notthetup changed the title chore(fjagejs): refactoring to split various functionality into indiv… Improving fjåge.js timeouts Jul 28, 2025
@notthetup notthetup force-pushed the fjage-js-timeout-improvements branch from 0d790f4 to 997659a Compare July 28, 2025 09:33
@ettersi
Collaborator

ettersi commented Jul 29, 2025

Re timeouts: It is already quite difficult to keep track of which timeout is being used if I don't specify one myself. This PR is going to make this problem even worse. That's not really an objection to the specific changes here, just voicing out some bigger-picture concerns I have.

Re metrics: The new metrics system touches many lines yet its inner workings and intended usage aren't documented very clearly. I'm afraid that over time this will result in quite a bit of code where no one quite understands what it does any more, and in particular the metrics feature itself may break soon if future maintainers (including ourselves) forget to maintain it properly. Maybe we should have a discussion around what problem this feature is trying to solve and what's the best way to achieve that goal.

@notthetup notthetup force-pushed the fjage-js-timeout-improvements branch 5 times, most recently from 4312a90 to 9e45afd Compare July 30, 2025 09:45
@notthetup notthetup self-assigned this Jul 30, 2025
@notthetup notthetup requested a review from ettersi July 30, 2025 09:57
@notthetup
Collaborator Author

notthetup commented Jul 30, 2025

I removed all the bits about metrics. Just the changes to timeouts now.

> It is already quite difficult to keep track of which timeout is being used if I don't specify one myself. This PR is going to make this problem even worse. That's not really an objection to the specific changes here, just voicing out some bigger-picture concerns I have.

The idea here is to allow the user of fjage.js to specify the timeout instead of using some magic numbers internally. If anything, I would expect this to make things clearer.

The other option would be to change the API and force the user to define a timeout for each transaction. This would break everything though. :(

@notthetup notthetup marked this pull request as ready for review July 30, 2025 09:57
@notthetup notthetup force-pushed the fjage-js-timeout-improvements branch from 9e45afd to b270bb3 Compare July 30, 2025 09:58
@ettersi
Collaborator

ettersi commented Jul 31, 2025

> > It is already quite difficult to keep track of which timeout is being used if I don't specify one myself. This PR is going to make this problem even worse. That's not really an objection to the specific changes here, just voicing out some bigger-picture concerns I have.

> The idea here is to allow the user of fjage.js to specify the timeout instead of using some magic numbers internally. If anything, I would expect this to make things clearer.

My concern is a scenario where we get some logs which indicate that a request timed out, and now we have to go figure out what timeout was used for this request. This would be straightforward if we used a single, global, immutable default timeout but can be tricky once you have gateway-level and AgentID-level timeout overrides that can be changed by anyone who has access to the respective objects.

@ettersi ettersi self-requested a review July 31, 2025 07:11
@notthetup
Collaborator Author

Another approach to solve the arbitrary multipliers on timeouts (which frankly I also don't like) could be this.

We expand the Gateway constructor to take two timeout options instead:

new Gateway(..., { containerQueryTimeout: 8000, messageTimeout: 5000 }) 

We can set the defaults such that it works with the fjage tests. A user can set them appropriately for their setup.

These are then used as the default timeout for

  • containerQueryTimeout: agents, containsAgent, agentForService and agentsForService
  • messageTimeout: request

But each of the methods also gets a timeout argument to override the default value if required.

This however would break the external API for fjage.js. :( There are ways to make it backward compatible though.
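
One possible way to keep the old option working is sketched below. normalizeTimeoutOptions is a hypothetical helper, not part of fjage.js. The option names follow the proposal above, the 8000 and 5000 fallbacks are just the example values from the constructor call, and the rules for mapping the legacy timeout option (8x for container queries, 1x for messages) are assumptions based on the earlier defaults.

```javascript
// Hypothetical option normalizer; the fallback rules are assumptions of this sketch.
function normalizeTimeoutOptions(opts = {}) {
  const legacy = opts.timeout; // old single option, e.g. new Gateway({timeout: 1000})
  const hasLegacy = legacy !== undefined;
  return {
    // Explicit new option wins, then the legacy option scaled as before, then the example default.
    containerQueryTimeout: opts.containerQueryTimeout ?? (hasLegacy ? 8 * legacy : 8000),
    messageTimeout: opts.messageTimeout ?? (hasLegacy ? legacy : 5000),
  };
}

// Old-style callers keep their previous effective timeouts ...
console.log(normalizeTimeoutOptions({ timeout: 2000 }));
// ... while new-style callers can set each timeout independently.
console.log(normalizeTimeoutOptions({ containerQueryTimeout: 10000, messageTimeout: 10000 }));
```

A per-call timeout argument, as described above, would still take precedence over whatever this returns.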


The AgentID level timeout configuration (applicable to aid.set, aid.get and aid.request), is more targeted towards a use case where a known Agent is slow to respond (maybe because it's behind a slow network connection, etc). Being able to set that on an AgentID makes sense instead of changing it at every aid.set, aid.get and aid.request for that AgentID.

But this part can be considered separately from the part above, since they're not really related.

@notthetup
Collaborator Author

> We can set the defaults such that it works with the fjage tests. A user can set them appropriately for their setup.

I actually prototyped this. A default of 1000ms for both seems to be enough for all the tests. I also tried it with https://github.com/org-arl/unetsockets and all of its tests pass at 1000ms as well.

@ettersi
Collaborator

ettersi commented Aug 1, 2025

Timeouts are anyway supposed to be a last-resort fix to prevent the system from deadlocking, no? So maybe the solution is to just use the max of all the individual timeouts everywhere?


> The AgentID level timeout configuration (applicable to aid.set, aid.get and aid.request), is more targeted towards a use case where a known Agent is slow to respond (maybe because it's behind a slow network connection, etc). Being able to set that on an AgentID makes sense instead of changing it at every aid.set, aid.get and aid.request for that AgentID.

I've switched to the latter approach in most of my code. Maybe not so much yet in the context of fjage, but definitely in contexts like when you have a multi-layer simulation code that depends on many parameters. It's a bit tedious to spell out at every layer all the parameters that are relevant for that layer, but it makes tracing the origin of parameter values much easier. Also, when you want to change a parameter somewhere, explicit parameters force you to go and propagate that change throughout the entire stack, which seems annoyingly tedious at first but is actually a good thing because it requires you to explicitly go through everything that could potentially have been broken by that change. My experience has been that ultimately that's the less frustrating process compared to making the change and praying really hard that everything else will just work. So the TL;DR is, I'm actually quite inclined towards passing through explicitly all the timeouts to the extent that that's needed (see my earlier point).


> I actually prototyped this. A default of 1000ms for both seems to be enough for all the tests. I also tried it with https://github.com/org-arl/unetsockets and all of its tests pass at 1000ms as well.

I'm assuming that's fully locally on your fairly performant laptop, though, right? There's no guarantee that by changing the default timeouts we won't break things all over the place once we push this to less performant and distributed systems, no?

@notthetup
Collaborator Author

> Timeouts are anyway supposed to be a last-resort fix to prevent the system from deadlocking, no? So maybe the solution is to just use the max of all the individual timeouts everywhere?

Good point. I was looking at reducing them, but all that does is reduce the latency of catching and dealing with errors. So instead we can set the default to something like 10000ms, and then we don't need any of these multipliers.

I'll update the PR accordingly.


> Also, when you want to change a parameter somewhere, explicit parameters force you to go and propagate that change throughout the entire stack, which seems annoyingly tedious at first but is actually a good thing because it requires you to explicitly go through everything that could potentially have been broken by that change.

The issue here is always forgetting it in one place.

But let's leave this change aside for now. We can revisit it later if we need it.

@notthetup notthetup force-pushed the fjage-js-timeout-improvements branch from b270bb3 to f267b21 Compare August 1, 2025 11:13
@notthetup notthetup force-pushed the fjage-js-timeout-improvements branch from f267b21 to 1cc6e98 Compare August 1, 2025 11:14
@notthetup notthetup requested a review from ettersi August 1, 2025 12:41
@ettersi
Collaborator

ettersi commented Aug 1, 2025

> > Also, when you want to change a parameter somewhere, explicit parameters force you to go and propagate that change throughout the entire stack, which seems annoyingly tedious at first but is actually a good thing because it requires you to explicitly go through everything that could potentially have been broken by that change.

> The issue here is always forgetting it in one place.

Yeah, to optimise for this particular purpose the timeout argument would have to be mandatory. But that's of course breaking and annoying, so I'm definitely not proposing this seriously. This whole discussion is just a tangent to this PR anyway, so I'm happy to leave this for now as you suggested.

@notthetup notthetup merged commit 2ea6566 into master Aug 1, 2025
2 checks passed
@notthetup notthetup deleted the fjage-js-timeout-improvements branch August 1, 2025 14:05