RQ scheduler can sometimes duplicate jobs #2338

@JimNero009

Description

This is essentially a continuation of #1354, in which the issue was mitigated but not truly solved.

Specifically, when you have multiple instances working on multiple queues, the problem described in the previous issue can still happen if you are simply unlucky with the timing.

Instead of simply bumping the amount of time a lock is held for, would a better solution here be to give the lock some actual 'ownership'? Right now, the scheduler simply writes a pid into the cache value, but it could instead write e.g. a random, unique value that is generated at startup/construction time. Then, when it comes to heartbeating the locks, it can sanity check that it actually does hold the lock and degrade gracefully if this assumption breaks down.
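To make the idea concrete, here is a minimal sketch of token-based ownership. It is not RQ's code: `InMemoryStore`, `OwnedLock`, `set_if_absent` and `extend_if_owner` are hypothetical names, and the in-memory dict only stands in for the real lock store so the example runs on its own. Against a real store, the "check owner, then extend" step would need to be a single atomic operation.

```python
import time
import uuid


class InMemoryStore:
    """Hypothetical stand-in for the lock store (not RQ's or Redis's API)."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at)

    def _live(self, key, now):
        # Return the stored value, or None if the key is missing or expired.
        entry = self._data.get(key)
        if entry is None or entry[1] <= now:
            self._data.pop(key, None)
            return None
        return entry[0]

    def set_if_absent(self, key, value, ttl, now=None):
        now = time.monotonic() if now is None else now
        if self._live(key, now) is not None:
            return False
        self._data[key] = (value, now + ttl)
        return True

    def extend_if_owner(self, key, value, ttl, now=None):
        # Assumption: a real store must do this compare-and-extend atomically.
        now = time.monotonic() if now is None else now
        if self._live(key, now) != value:
            return False
        self._data[key] = (value, now + ttl)
        return True


class OwnedLock:
    """Lock whose value is a random token created at construction time."""

    def __init__(self, store, key, ttl):
        self.store = store
        self.key = key
        self.ttl = ttl
        self.token = uuid.uuid4().hex  # unique per scheduler instance, unlike a pid
        self.held = False

    def acquire(self, now=None):
        self.held = self.store.set_if_absent(self.key, self.token, self.ttl, now)
        return self.held

    def heartbeat(self, now=None):
        # Extend the lock only if we still own it; otherwise give it up.
        if not self.held:
            return False
        self.held = self.store.extend_if_owner(self.key, self.token, self.ttl, now)
        return self.held
```

With this shape, a scheduler whose heartbeat returns `False` knows that another instance has taken the lock and can stop enqueueing for that queue instead of silently extending a lock it no longer owns.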

I've not thought it through in all that much detail, but I do think the current implementation is not solid enough for high-scale production workflows and I'd like to work through something with you to tighten it up.
