[Scheduler] Intermittent Runs #51646
Comments
Difficult to debug indeed...
Hopefully we can confirm it is the caching issue. And if so, #51384 seems like it would be related. |
Thanks for taking the time to respond!
It's unlikely; we don't currently have anything set up to clear the cache, on deploy or otherwise. It would take manual action.
That's a good suggestion. I think it's fairly unlikely that the ECS task won't be running anyway, so it should not affect testing this. We'll give it a try and see how it goes. As you said, this might take a week or two before we have useful results, but I'll come back to this ticket when I have enough data to share. Thanks again. |
Hi - thanks for the suggestion. Unfortunately, I'm not sure it'll be possible to include your PR in our build process to test it. I'll let you know if I am able to, though. |
If you confirm that the problem is the cache, as suggested above, that would strongly suggest that #51651 is the fix. |
This bug has been fixed in 6.4, not 6.3, as we had to introduce a BC break to fix it properly. |
Without cache, there are several possibilities where a run may be skipped, for example:
If your case falls under any of those scenarios, I don't think it's a bug in the scheduler component. If you're sure that there was a scheduler worker alive at the time when the run was meant to happen, then there could be an issue indeed 🤔 |
You're absolutely correct @valtzu. I did note in my previous comment that, while there is a small chance the worker might not be running, I thought it unlikely enough not to affect these results. But it seems from our logs that in this case the worker happened to stop 19 seconds before the scheduled run, and did not restart until 20 seconds after it. 🤦 I'll continue monitoring and return here if an issue does seem to exist. |
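The check described in the comment above (did the scheduled instant fall inside a window where no worker was alive?) can be sketched as follows. This is a minimal, language-neutral illustration, not Symfony code: the function name `run_missed` and the concrete date and time are hypothetical, and only the 19-second and 20-second offsets come from the thread.

```python
from datetime import datetime, timedelta, timezone


def run_missed(scheduled_at, stopped_at, restarted_at):
    """Return True if the scheduled instant falls inside the worker's downtime."""
    return stopped_at <= scheduled_at < restarted_at


# Hypothetical scheduled instant; the thread does not state the actual time of day.
scheduled_at = datetime(2023, 1, 1, 6, 0, 0, tzinfo=timezone.utc)

# Offsets reported in the thread: stopped 19 s before, restarted 20 s after.
stopped_at = scheduled_at - timedelta(seconds=19)
restarted_at = scheduled_at + timedelta(seconds=20)

missed = run_missed(scheduled_at, stopped_at, restarted_at)
downtime = restarted_at - stopped_at

print(missed)    # True
print(downtime)  # 0:00:39
```

Even a 39-second gap is enough to swallow a once-a-day run if nothing replays the missed instant after the restart.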
Symfony version(s) affected
6.3.4
Description
When using the scheduler with a single daily recurring message, we often see days where the scheduler appears not to have run.
Here is the schedule config:
We have configured caching for the scheduler provider using the database, to allow it to catch up if the task were to crash. The servers run in UTC.
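The catch-up idea can be illustrated with a simplified model: if the last checkpoint is persisted, a restarted worker can enumerate every run that came due while it was down. This is only a sketch under stated assumptions and not Symfony's actual implementation. The function `due_runs`, its parameters, and the dates are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def due_runs(last_checkpoint, now, first_run, interval=timedelta(days=1)):
    """List every scheduled instant t with last_checkpoint < t <= now."""
    runs = []
    t = first_run
    while t <= now:
        if t > last_checkpoint:
            runs.append(t)
        t += interval
    return runs


# Hypothetical daily run; the worker is down around the scheduled instant.
scheduled_at = datetime(2023, 1, 1, 6, 0, 0, tzinfo=timezone.utc)
stopped_at = scheduled_at - timedelta(seconds=19)
restarted_at = scheduled_at + timedelta(seconds=20)

# Checkpoint persisted when the worker stopped: the missed run is replayed.
with_checkpoint = due_runs(stopped_at, restarted_at, first_run=scheduled_at)

# No persisted checkpoint: the worker starts counting from its restart time.
without_checkpoint = due_runs(restarted_at, restarted_at, first_run=scheduled_at)

print(len(with_checkpoint))     # 1
print(len(without_checkpoint))  # 0
```

In this model, the only difference between replaying and silently skipping the run is whether the checkpoint survives the restart.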
We have added some middleware to attempt to give some visibility to scheduler activity, which just logs when each message is dispatched on the schedule.
This is then run in production alongside our other Messenger transports (e.g. async another_transport) with an ECS task.
As you can see from the logs, it appears some days are missing (i.e. the 10th and 7th in this example).
I know this is a tricky one, as I don't have an exact reproduction and don't really know where to start debugging this. I'm hoping that, by raising an issue, either something obviously misconfigured will stand out or someone can point to an area to look at in order to debug or reproduce this.
How to reproduce
As indicated above, I do not have a reproduction. Running the scheduler locally on shorter timeframes seems to behave as intended, and debugging a daily job is hard, so I'm hoping for some pointers before beginning on that.
Possible Solution
Unsure if this is related: #51384
Additional Context