

Comparing changes

base repository: rabbitmq/rabbitmq-server
base: main
head repository: rabbitmq/rabbitmq-server
compare: qq-low-memory
  • 1 commit
  • 1 file changed
  • 1 contributor

Commits on Jun 15, 2025

  1. PoC: Decrease per message memory usage in QQs

    The idea is to use a binary instead of a list to hold messages
    in a quorum queue's incoming queue.
    
    Enqueue = appending to a binary
    Dequeue = matching the front of a binary
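    
    A minimal sketch of the two operations (function names are assumptions;
    the field widths match the encoding below: 7 bytes = 56 bits for the
    Raft index, 4 bytes = 32 bits for the message size):
    ```
    %% Append one entry to the incoming-queue binary.
    enqueue(Bin, RaIdx, Size) ->
        <<Bin/binary, RaIdx:56, Size:32>>.

    %% Match one entry off the front of the binary.
    dequeue(<<RaIdx:56, Size:32, Rest/binary>>) ->
        {RaIdx, Size, Rest}.
    ```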
    
    Per message memory usage prior to this commit:
    ```
    erts_debug:size([[1|2]]).
    4
    erts_debug:size([[1|2], [3|4]]).
    8
    ```
    4 words * 8 bytes (on a 64-bit machine) = 32 bytes
    
    Per message memory usage after this commit:
    7 bytes to encode the RaIdx + 4 bytes to encode the message size = 11 bytes
    
    The message size encoding can be further optimised to take
    below 3 bytes most of the time.
    Hence, let's say 10 bytes per message.
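    
    One possible variable-length scheme (a hypothetical sketch, not
    necessarily the optimisation alluded to above: one length byte followed
    by the minimal number of bytes holding the size, so any size below
    64 KiB takes 3 bytes in total):
    ```
    %% Hypothetical variable-length size encoding.
    encode_size(Size) ->
        Bytes = byte_size(binary:encode_unsigned(Size)),
        <<Bytes:8, Size:(Bytes*8)>>.

    decode_size(<<Bytes:8, Rest/binary>>) ->
        Bits = Bytes * 8,
        <<Size:Bits, Rest2/binary>> = Rest,
        {Size, Rest2}.
    ```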
    
    Let's assume we have a 3 node cluster with 500 quorum queues and
    200,000 messages each per queue. That's 100 million messages across
    all queues.
    
    Prior to this commit this would require a total of 9.6 GB of memory:
    100,000,000 msgs * 32 bytes * 3 nodes = 9.6 GB
    
    After this commit this requires a total of only 3 GB of memory:
    100,000,000 msgs * 10 bytes * 3 nodes = 3 GB of memory
    
    If there is a message TTL policy set for these queues the savings will
    be even more significant because prior to this commit the per message
    memory overhead is 6 words * 8 bytes = 48 bytes:
    ```
    erts_debug:size([[1|[2|5000]]]).
    6
    erts_debug:size([[1|[2|5000]], [3|[4|5000]]]).
    12
    ```
    
    Prior to this commit this would require a total of 14.4 GB of memory:
    100,000,000 msgs * 48 bytes * 3 nodes = 14.4 GB
    
    If we assume an additional 6 bytes to encode the expiration timestamp in milliseconds,
    with a binary encoding this would require a total of only 4.8 GB of memory:
    100,000,000 msgs * 16 bytes * 3 nodes = 4.8 GB
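    
    The per-entry layout with an expiry could look like this (a sketch;
    function names, the 48-bit millisecond timestamp, and the field order
    are assumptions; 7 + 4 + 6 = 17 bytes per entry with the fixed-width
    size field, 16 bytes with the optimised size encoding):
    ```
    %% Append one entry carrying a 6-byte (48-bit) expiry timestamp in ms.
    enqueue_ttl(Bin, RaIdx, Size, ExpiryMs) ->
        <<Bin/binary, RaIdx:56, Size:32, ExpiryMs:48>>.

    %% Match one entry off the front, including its expiry.
    dequeue_ttl(<<RaIdx:56, Size:32, ExpiryMs:48, Rest/binary>>) ->
        {RaIdx, Size, ExpiryMs, Rest}.
    ```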
    
    The problem with the binary approach is that appending to a binary at a
    rate of 100,000 msgs/s is catastrophically slow:
    ```
    java -jar target/perf-test.jar -qq -u qq1 -x 1 -y 0 -C 1000000
    ```
    * sends at 90766 msg/s prior to this commit
    * sends at 3435 msg/s after this commit
    After this commit, >80% of CPU time is spent in function
    `__memmove_evex_unaligned_erms` copying binary data around.
    Appending frequently to the binary becomes even slower as the binary
    gets longer.
    
    The only practical solution would be a hybrid approach:
    * lists are used by default
    * lists will always be used for enqueueing messages (i.e. prepending to the list)
    * a binary will be created whenever many messages, e.g. 100,000 new messages, have
      accumulated. Creating such a binary will be relatively fast, as
      explained in https://www.erlang.org/doc/system/binaryhandling.html :
      "Appending data to a binary as in the example is efficient because it
      is specially optimized by the runtime system to avoid copying the Acc
      binary every time."
    * Once the binary is created, pattern matching at the front, and therefore
      dequeueing messages, will be fast.
    * The binaries themselves will be stored in lists (a list with 10
      binaries means ~1 million messages in total in our example).
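    
    The flush step of that hybrid could be sketched like this (names and
    threshold handling are assumptions; new messages are prepended to the
    list, hence the reverse before encoding, and the fold appends to the
    accumulator binary, which is the pattern the Erlang efficiency guide
    above describes as optimised):
    ```
    %% Hypothetical hybrid: messages accumulate in a list (newest first);
    %% once Threshold entries accumulated, flush them into one binary
    %% appended to the list of binaries.
    maybe_flush(Msgs, Bins, Threshold) when length(Msgs) >= Threshold ->
        Bin = lists:foldl(fun({RaIdx, Size}, Acc) ->
                                  <<Acc/binary, RaIdx:56, Size:32>>
                          end, <<>>, lists:reverse(Msgs)),
        {[], Bins ++ [Bin]};
    maybe_flush(Msgs, Bins, _Threshold) ->
        {Msgs, Bins}.
    ```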
    
    It's questionable though whether such an approach is worth the added complexity
    given that queues are meant to be kept short anyway.
    ansd committed Jun 15, 2025
    commit a934c1d