Celery not publishing via configured default exchange when task_default_exchange configured - documentation or actual bug #9940

@armurox


Checklist

  • I have checked the issues list
    for similar or identical bug reports.
  • I have checked the pull requests list
    for existing proposed fixes.
  • I have checked the commit log
    to find out if the bug was already fixed in the main branch.
  • I have included all related issues and possible duplicate issues in this issue
    (If there are none, check this box anyway).

Related Issues and Possible Duplicates

Related Issues

Possible Duplicates

  • None

Description

Hello! I originally asked this as a question but did not receive a response, so I am filing it as a documentation bug. It appears to be either an actual bug or behaviour that is missing from the documentation; I am not sure whether it is intentional, but it leads to what I think is quite unintuitive behaviour in Celery. (If it is a bug, I can open a PR to fix it; if it is intentional, I would like to understand its purpose.)

celery version: 5.5.3
broker: RabbitMQ 4.1.4 (docker: rabbitmq:4-management)

Minimum reproducible example:

Created a tasks.py file with the below code:

from celery import Celery
import logging

from kombu import Queue
from kombu import Exchange

app = Celery('tasks', broker='pyamqp://guest@localhost//')

default_exchange = Exchange('celery', type='direct')
default_queue = Queue('my-celery-queue', exchange=default_exchange, routing_key='celery', durable=True)
app.conf.task_queues = (default_queue,)

app.conf.task_default_queue = 'my-celery-queue'
app.conf.task_default_exchange = 'celery'
app.conf.task_default_routing_key = 'celery'


logger = logging.getLogger(__name__)

@app.task
def add(x, y):
    logger.info(f'Adding {x} + {y}')
    return x + y

And started up a celery worker. I then ran the following in the ipython shell:

from tasks import add
add.delay(1, 2)

My assumption was that the message would be produced and published via the default exchange I had set up (i.e. celery) and routed to my-celery-queue. However, from checking the RabbitMQ dashboard, I noticed that the message was published to the queue via RabbitMQ's default exchange instead, even though I had explicitly configured task_default_queue for the app with default_exchange specified, as per the docs. I also read through the routing section of the docs, but I did not see this behaviour mentioned (apologies if I missed something; please let me know if I did).

Root cause:

Eventually I took a look at the actual Celery source and, after some tracing, found this line specifically in celery/app/amqp.py:

if (not exchange or not routing_key) and exchange_type == 'direct':
    exchange, routing_key = '', qname

....

# And finally the message is produced with the below:

producer.publish(
    body,
    exchange=exchange,
    routing_key=routing_key,
    serializer=serializer or default_serializer,
    compression=compression or default_compressor,
    retry=retry, retry_policy=_rp,
    delivery_mode=delivery_mode, declare=declare,
    headers=headers2,
    timeout=timeout, confirm_timeout=confirm_timeout,
    **properties
)
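To make the effect of that branch concrete, here is a minimal stand-alone sketch of my reading of it (not Celery's actual code): when the router only yields a queue, exchange and routing_key arrive empty, so a direct exchange gets collapsed to the AMQP default exchange (the empty string) with the queue name as the routing key.

```python
# Stand-alone sketch of the fallback as I read it; not Celery's code.
def resolve_publish_target(qname, exchange, routing_key, exchange_type):
    if (not exchange or not routing_key) and exchange_type == 'direct':
        # AMQP's default exchange is the empty string; it routes a
        # message straight to the queue whose name equals the routing key.
        exchange, routing_key = '', qname
    return exchange, routing_key

# Router only gave us a queue -> the configured exchange is bypassed:
print(resolve_publish_target('my-celery-queue', '', '', 'direct'))
# => ('', 'my-celery-queue')

# An explicitly supplied exchange and routing key survive untouched:
print(resolve_publish_target('my-celery-queue', 'celery', 'celery', 'direct'))
# => ('celery', 'celery')
```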

If I've understood it correctly, the router used for the publish in amqp.send_task_message only yields the queue, which I checked with:

from tasks import add
task_name = add.name
app.amqp.router.route({}, task_name, args=(), kwargs={})
# => {'queue': <unbound Queue my-celery-queue -> <unbound Exchange celery(direct)> -> celery>}

i.e., the direct-exchange queue lives inside the router. However, because of the code in amqp.py quoted above, since the queue is routed to via a direct exchange, the message is published to the default AMQP exchange. I think this causes a real issue: there is now no way for me to take advantage of a direct exchange's routing keys and bind another queue to the same routing key, say, for logging purposes. I could change the exchange type to topic, sure, but I mainly want to understand why this is the default behaviour for direct queues. (And if the idea is that we don't want multiple queues listening to the same message, why not apply the same restriction to topic queues?)
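To illustrate the difference I mean: a direct exchange matches the routing key exactly, while a topic exchange does word-based pattern matching. A simplified sketch (the matching is done by the broker, not Celery; this just mimics the * and # wildcard semantics):

```python
import re

# Simplified sketch of broker-side matching semantics: for a topic
# exchange, '*' matches exactly one dot-separated word and '#' matches
# zero or more words. Real AMQP matching lives in the broker.
def direct_match(binding_key, routing_key):
    return binding_key == routing_key

def topic_match(pattern, routing_key):
    regex = re.escape(pattern).replace(r'\#', '.*').replace(r'\*', '[^.]+')
    return re.fullmatch(regex, routing_key) is not None

print(direct_match('celery', 'celery'))        # True
print(topic_match('celery.*', 'celery.logs'))  # True
print(topic_match('celery.*', 'celery.a.b'))   # False
```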

I can currently bypass this behaviour by specifying a custom task_routes config in which I set the exchange and the routing key explicitly, but I mainly wanted to understand why the original behaviour exists in the first place.
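For reference, this is the shape of the task_routes override I mean, as a plain dict (task name and values are from my tasks.py example above; including exchange_type in a route is my assumption from the routing docs rather than something I have re-verified):

```python
# Plain dict mirroring what I set on app.conf.task_routes to force
# publishing through the configured direct exchange instead of the
# AMQP default exchange. Values match the tasks.py example above.
task_routes = {
    'tasks.add': {
        'queue': 'my-celery-queue',
        'exchange': 'celery',
        'exchange_type': 'direct',
        'routing_key': 'celery',
    },
}
```

With this in place the publish goes through the celery exchange with routing key celery, so a second queue can be bound to the same key.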

I have checked the commit history for the code here and here, but it wasn't fully clear to me why the code was added to behave the way it currently does. (Also, a minor nit: the comment on the elif condition seems incorrect; from my understanding we hit that code block if exchange is undefined and the exchange type is topic, while the comment reads as the negation of that.)

If it is not intentional, I guess the fix would simply be to remove the == 'direct' part of the condition.

Suggestions

TL;DR: Why do we publish via the default AMQP exchange for direct default exchanges instead of emitting the messages to the configured direct exchange? (And if this is intentional, why allow the bypass only for topic exchanges? Their main difference is pattern matching of routing keys, as per my understanding.) If the behaviour is expected, could I add it to the documentation, and could we update the comment I mentioned?
