[Messenger] messages lost during graceful shutdown with SQS #45778

Closed
SanderHagen opened this issue Mar 18, 2022 · 0 comments

SanderHagen commented Mar 18, 2022

Symfony version(s) affected

6.0.6

Description

We are using Symfony Messenger in combination with AWS SQS. We deploy a new version of our app each day, so we also need to shut down the workers of the previous version. That is where things go wrong.

For some reason, after each deploy we see the Approximate Age Of Oldest Message metric in SQS increase until it reaches the visibility timeout of the message. Normally this metric is always 0. This indicates that a message is lost somewhere during shutdown and is only processed again once its visibility timeout expires. This could lead to messages being processed twice.

How to reproduce

We run messenger with the following stack: AWS ECS > Docker > Supervisor > Messenger

We have set up graceful shutdown properly (see the screenshot). The "Received SIGTERM signal" log line indicates that Messenger knows a shutdown is inbound and should stop the workers.
image

stopwaitsecs in supervisor has been set to 20 seconds.
The container timeout in AWS is the default 30 seconds.

Possible Solution

I noticed that amazon-sqs-messenger has this method: https://github.com/symfony/amazon-sqs-messenger/blob/5.4/Transport/Connection.php#L366 (the class also keeps a buffer). Yet that method never appears to be called by Messenger. Could that be what is happening?

Additional Context

The approximate age of oldest message graph, this happens every deploy:
image

fabpot closed this as completed Apr 1, 2022
fabpot added a commit that referenced this issue Apr 1, 2022
This PR was squashed before being merged into the 4.4 branch.

Discussion
----------

[Messenger] reset connection on worker shutdown

| Q             | A
| ------------- | ---
| Branch?       | 4.4
| Bug fix?      | yes
| New feature?  | no
| Deprecations? | no
| Tickets       | Fix #45778
| License       | MIT

As seen in the issue, the Amazon SQS transport uses a buffer. Messages can be lost when the buffer still contains them and the container executing the process is shut down. The connection contains a [reset](https://github.com/symfony/amazon-sqs-messenger/blob/5.4/Transport/Connection.php#L366) method and implements the `ResetInterface`. If this method is called on shutdown, the buffered messages are marked as visible again.

Commits
-------

c486305 [Messenger] reset connection on worker shutdown
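
To illustrate the failure mode described above, here is a minimal toy model in Python. It is not Symfony or AWS code: `ToyQueue`, `ToyBufferedConnection`, and `simulate_shutdown` are hypothetical names invented for this sketch, and it assumes that a reset simply makes the buffered messages visible again, as the commit message states. It shows how messages fetched into a local buffer stay invisible after shutdown unless the buffer is reset first.

```python
from dataclasses import dataclass, field


@dataclass
class ToyQueue:
    """In-memory stand-in for a queue with a visibility timeout (not the SQS API)."""
    visibility_timeout: int = 30
    # Maps message body -> time at which the message becomes visible again.
    invisible_until: dict = field(default_factory=dict)

    def send(self, body):
        self.invisible_until[body] = 0

    def receive(self, now, max_messages):
        # Hand out up to max_messages visible messages and hide them for a while.
        visible = [b for b, t in self.invisible_until.items() if t <= now]
        batch = visible[:max_messages]
        for body in batch:
            self.invisible_until[body] = now + self.visibility_timeout
        return batch

    def make_visible(self, body, now):
        self.invisible_until[body] = now

    def delete(self, body):
        del self.invisible_until[body]

    def visible_count(self, now):
        return sum(1 for t in self.invisible_until.values() if t <= now)


class ToyBufferedConnection:
    """Fetches messages in batches and keeps the unhandled ones in a local buffer."""

    def __init__(self, queue, batch_size=3):
        self.queue = queue
        self.batch_size = batch_size
        self.buffer = []

    def get(self, now):
        if not self.buffer:
            self.buffer = self.queue.receive(now, self.batch_size)
        return self.buffer.pop(0) if self.buffer else None

    def reset(self, now):
        # Assumed behaviour: give every buffered message back to the queue.
        for body in self.buffer:
            self.queue.make_visible(body, now)
        self.buffer = []


def simulate_shutdown(call_reset):
    """Handle one message, then shut down; return how many messages are visible."""
    queue = ToyQueue(visibility_timeout=30)
    for body in ("a", "b", "c"):
        queue.send(body)
    conn = ToyBufferedConnection(queue)
    first = conn.get(now=0)  # fetches a, b, c; b and c stay in the buffer
    queue.delete(first)      # "a" is handled and acknowledged
    if call_reset:
        conn.reset(now=1)
    return queue.visible_count(now=1)
```

In this model, `simulate_shutdown(call_reset=False)` leaves the two buffered messages hidden until the visibility timeout expires, which would match the rising age-of-oldest-message metric reported in the issue. With `call_reset=True` they are available to the next worker right away.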