Description
This is essentially a continuation of #1354, in which the issue was mitigated but not truly solved.
Specifically, when you have multiple instances working on multiple queues, the problem described in the previous issue can still happen if you are simply unlucky with the timing.
Instead of simply bumping the amount of time a lock is held for, would a better solution be to introduce actual 'ownership' of the lock? Right now the scheduler simply writes a pid into the cache value, but it could instead write e.g. a random, unique token generated at startup/construction time. Then, when heartbeating the locks, it could sanity-check that it actually still owns the lock and degrade gracefully if that assumption breaks down (rough sketch below).
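To make the idea concrete, here is a minimal sketch of what token-based ownership could look like, assuming a Redis-style cache. The names (`OWNER_TOKEN`, `acquire_lock`, `heartbeat_lock`, the `lock:<queue>` key format) are hypothetical and not taken from the existing scheduler code:

```python
# Sketch only: token-based lock ownership with an atomic "renew if still owned"
# heartbeat, assuming a Redis-backed cache. Names/keys here are illustrative.
import uuid
import redis

cache = redis.Redis()

# Generated once at scheduler startup/construction; uniquely identifies this
# instance, unlike a pid, which can collide across hosts or restarts.
OWNER_TOKEN = uuid.uuid4().hex

LOCK_TTL = 60  # seconds

# Extend the lock's TTL only if the stored value is still our token
# (compare-and-expire, executed atomically server-side).
RENEW_SCRIPT = cache.register_script("""
if redis.call('get', KEYS[1]) == ARGV[1] then
    return redis.call('expire', KEYS[1], ARGV[2])
else
    return 0
end
""")


def acquire_lock(queue_name: str) -> bool:
    """Try to take the lock for a queue, writing our token instead of a pid."""
    return bool(cache.set(f"lock:{queue_name}", OWNER_TOKEN, nx=True, ex=LOCK_TTL))


def heartbeat_lock(queue_name: str) -> bool:
    """Renew the lock only if this instance still owns it.

    Returns False when ownership has been lost, so the caller can degrade
    gracefully (e.g. stop claiming tasks from this queue) instead of silently
    overwriting another instance's lock.
    """
    return bool(RENEW_SCRIPT(keys=[f"lock:{queue_name}"], args=[OWNER_TOKEN, LOCK_TTL]))
```

The key point is that the heartbeat becomes a compare-and-extend rather than an unconditional refresh, so an instance that has lost the lock finds out immediately rather than clobbering whichever instance now holds it.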
I've not thought it through in all that much detail, but I do think the current implementation is not solid enough for high-scale production workflows and I'd like to work through something with you to tighten it up.