[Messenger] error in receiver results in message staying in queue forever #32055
Comments
@Tobion I'm not sure if this fully explains/covers your situation, but a serializer is supposed to throw a MessageDecodingFailedException when it cannot decode a message.
Then, receivers/transports are supposed to catch this exception and reject the message.
This is something that didn't happen in 4.2, so it was fixed in 4.3. It does, however, rely on two "shoulds" that are mentioned on two interfaces (the serializer "should" throw that exception and the receiver "should" catch it and reject). Is this your issue? Do you have a custom serializer or transport that's not following these "shoulds"?

Update: And we decided to "reject" instead of retry, as a deserialization error is not one that seems "temporary".
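For illustration, a minimal sketch of a custom serializer that follows this contract. The class and message type are hypothetical, not code from this issue:

```php
<?php

use Symfony\Component\Messenger\Envelope;
use Symfony\Component\Messenger\Exception\MessageDecodingFailedException;
use Symfony\Component\Messenger\Transport\Serialization\SerializerInterface;

// Hypothetical message class, only here to make the example self-contained.
final class MyMessage
{
    private $payload;

    public function __construct(array $payload)
    {
        $this->payload = $payload;
    }

    public function getPayload(): array
    {
        return $this->payload;
    }
}

final class MyJsonSerializer implements SerializerInterface
{
    public function decode(array $encodedEnvelope): Envelope
    {
        $data = json_decode($encodedEnvelope['body'] ?? '', true);

        if (!\is_array($data)) {
            // Throwing this exception lets the transport reject the message
            // instead of leaving it unacknowledged in the queue. Rejecting
            // (not retrying) is intentional: an undecodable payload will not
            // become decodable on a later attempt.
            throw new MessageDecodingFailedException('Invalid JSON payload.');
        }

        return new Envelope(new MyMessage($data));
    }

    public function encode(Envelope $envelope): array
    {
        /** @var MyMessage $message */
        $message = $envelope->getMessage();

        // A real serializer would also encode stamps as headers; omitted here.
        return ['body' => json_encode($message->getPayload())];
    }
}
```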
@weaverryan you are right. Our custom serializer didn't throw a MessageDecodingFailedException.
But there is also the case where RabbitMQ redelivers messages that were never acknowledged, for example when the consumer dies or the connection is lost before the ack: https://www.rabbitmq.com/confirms.html
So we should also handle retry for those.
Ok, let's break this down so we can see what actionable things we can do.
That seems reasonable... we would basically "fail" in the same way as an exception from a handler. So, by default, it would fail 3 times and then go to the failure queue (see the config sketch after this comment). This would also solve item (3), I believe: if exceptions from deserialization are handled the same way as exceptions from handlers, then the worker would not exit in either situation.
I'd appreciate a separate issue or PR on this, as I'm still far from a RabbitMQ expert. I don't quite understand the flow/problem:
A) Handler takes longer than the RabbitMQ connection's heartbeat or timeout
I'm fuzzy about a few things:
Cheers!
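For reference, a minimal sketch of the retry/failure-transport configuration referred to above, in PHP config form. The transport names, DSNs, and file location are placeholders, not taken from this issue:

```php
<?php
// e.g. config/packages/messenger.php (placeholder path)

/** @var Symfony\Component\DependencyInjection\ContainerBuilder $container */
$container->loadFromExtension('framework', [
    'messenger' => [
        // After the retries are exhausted, the message is sent here.
        'failure_transport' => 'failed',
        'transports' => [
            'async' => [
                'dsn' => '%env(MESSENGER_TRANSPORT_DSN)%',
                'retry_strategy' => [
                    'max_retries' => 3, // "fail 3 times, then go to the failure queue"
                    'delay' => 1000,    // initial delay in milliseconds
                    'multiplier' => 2,  // exponential backoff between retries
                ],
            ],
            'failed' => ['dsn' => 'doctrine://default?queue_name=failed'],
        ],
    ],
]);
```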
Yes, the flow is described correctly. Trying to answer your questions:
To keep this bumping, I think I see two actionable things:
A) On deserialization errors (MessageDecodingFailedException), run the normal failure handling, i.e. retries and the failure transport, instead of leaving the message in the queue.
B) Handle redelivered messages so they don't loop forever and block the queue.
Correct?
We just had a different case where the message gets redelivered again and again, blocking the queue.
So it gets redelivered by RabbitMQ and fails again and again... Maybe we can put a try...finally around the worker event to make sure the message is rejected at the end.
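A rough sketch of that try...finally idea, assuming a receiver/bus pair like the worker uses. This is illustrative only, not the actual Worker code:

```php
<?php

use Symfony\Component\Messenger\Envelope;
use Symfony\Component\Messenger\MessageBusInterface;
use Symfony\Component\Messenger\Transport\Receiver\ReceiverInterface;

// Illustrates "always ack or reject, never leave the message hanging".
function handleOrReject(ReceiverInterface $receiver, MessageBusInterface $bus, Envelope $envelope): void
{
    $acked = false;

    try {
        $bus->dispatch($envelope);
        $receiver->ack($envelope);
        $acked = true;
    } finally {
        if (!$acked) {
            // Without this, an unexpected error (e.g. while publishing the retry
            // message) leaves the message unacknowledged, so RabbitMQ redelivers
            // it to the next consumer and the queue stays blocked.
            $receiver->reject($envelope);
        }
    }
}
```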
Revert "[Messenger] Fix exception message of failed message is dropped on retry" (Tobion)

This PR was merged into the 4.3 branch.

Discussion
----------

Revert "[Messenger] Fix exception message of failed message is dropped on retry"

| Q             | A
| ------------- | ---
| Branch?       | 4.3
| Bug fix?      | yes
| New feature?  | no
| Deprecations? | no
| Tickets       |
| License       | MIT
| Doc PR        |

This reverts #33600 because it makes the message grow on each retry until AMQP cannot handle it anymore. On each retry, the full exception trace is added to the message, so in our case the message became too big for the AMQP library to encode on the 5th retry. The AMQP extension then throws the exception

> Library error: table too large for buffer

(ref. alanxz/rabbitmq-c#224 and php-amqp/php-amqp#131) when trying to publish the message. To solve this, I suggest reverting #33600 (this PR) and merging #32341 instead, which does not re-add the exception on each failure.

Btw, the above problem causes other problematic side effects in Symfony Messenger. As the new retry message fails to be published with an exception, the old (currently processed) message also does not get removed (acknowledged) from the delay queue. So RabbitMQ redelivers the message and the same thing happens forever. This can block the consumers and take a huge toll on your service. That's just another case for #32055 (comment). I'll try to fix this in another PR.

Commits
-------

3dbe924 Revert "[Messenger] Fix exception message of failed message is dropped on retry"
I fixed problem B) in #34107. Feature A) (sending deserialization errors to the failure transport) is nice to have but also not straightforward, because sending to the failure transport relies on \Symfony\Component\Messenger\Event\WorkerMessageFailedEvent, which requires an envelope. But when deserialization fails, you obviously have no envelope to use. So let's keep that separate; it does not have much priority for me.
[Messenger] prevent infinite redelivery loops and blocked queues (Tobion)

This PR was merged into the 4.3 branch.

Discussion
----------

[Messenger] prevent infinite redelivery loops and blocked queues

| Q             | A
| ------------- | ---
| Branch?       | 4.3
| Bug fix?      | yes
| New feature?  | no
| Deprecations? | no
| Tickets       | Fix #32055
| License       | MIT
| Doc PR        |

This PR solves a very common pitfall of AMQP redeliveries, explained for example in https://blog.forma-pro.com/rabbitmq-redelivery-pitfalls-440e0347f4e0. Newer RabbitMQ versions provide a solution for this themselves, but only for quorum queues and not the classic ones, see rabbitmq/rabbitmq-server#1889.

This PR adds a middleware that throws a RejectRedeliveredMessageException when a message is detected that has been redelivered by AMQP. The middleware runs before the HandleMessageMiddleware and prevents redelivered messages from being handled directly. The thrown exception is caught by the worker and triggers the retry logic according to the retry strategy.

AMQP redelivers messages when they do not get acknowledged or rejected. This can happen when the connection times out or an exception is thrown before acknowledging or rejecting. When such errors happen again while handling the redelivered message, the message would get redelivered again and again. The purpose of this middleware is to prevent infinite redelivery loops and to unblock the queue by republishing the redelivered messages as retries, with a retry limit and a potential delay.

Commits
-------

d211904 [Messenger] prevent infinite redelivery loops and blocked queues
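For illustration, a simplified sketch of that middleware idea (not the merged code): it looks for the AMQP stamp, checks the redelivered flag, and throws so the worker's retry logic takes over. The AmqpReceivedStamp class is assumed from the 4.3 AmqpExt transport, and a generic RuntimeException stands in for the RejectRedeliveredMessageException the PR adds:

```php
<?php

use Symfony\Component\Messenger\Envelope;
use Symfony\Component\Messenger\Middleware\MiddlewareInterface;
use Symfony\Component\Messenger\Middleware\StackInterface;
use Symfony\Component\Messenger\Transport\AmqpExt\AmqpReceivedStamp;

// Simplified sketch of the idea behind #34107, not the merged middleware.
final class RejectRedeliveredSketchMiddleware implements MiddlewareInterface
{
    public function handle(Envelope $envelope, StackInterface $stack): Envelope
    {
        $stamp = $envelope->last(AmqpReceivedStamp::class);

        // AMQPEnvelope::isRedelivery() is true when RabbitMQ redelivers a
        // message that was never acknowledged (connection loss, timeout,
        // fatal error before ack/reject, ...).
        if ($stamp instanceof AmqpReceivedStamp && $stamp->getAmqpEnvelope()->isRedelivery()) {
            // The real PR throws a dedicated RejectRedeliveredMessageException;
            // a generic exception is used here as a stand-in. The worker catches
            // it and re-dispatches the message via the retry strategy instead of
            // handling the redelivered copy directly.
            throw new \RuntimeException('Redelivered message: rejecting so it is retried with a limit and delay.');
        }

        return $stack->next()->handle($envelope, $stack);
    }
}
```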
When an error in the worker happens before the message is dispatched on the bus (for example a deserialization error), the message is neither acked nor nacked, the exception is not caught, and the worker quits.
The result is that the message stays in the queue (at the front). So the next worker will try to receive the same message again and will likely fail again. This continues forever and you can't consume any good messages anymore.
A similar situation can happen when handling a message takes longer than the RabbitMQ connection heartbeat or timeout. Ref. #31707 and php-enqueue/enqueue-dev#658 (comment)
RabbitMQ puts the messages back into the queue with a Redelivered header. The way we solved this in our apps using https://github.com/M6Web/AmqpBundle is to ack the message directly when it has the Redelivered header and then trigger a retry.
I think it's important that Symfony Messenger can handle these cases as well, using its retry logic and failed-message transport.