Deadlock due to outdated lock_info #37

@xinhaoyuan

I was testing locks with my own test case.
I believe there is a bug in the lock_info handling of locks_server and locks_agent that can cause a deadlock.

My testcase has 3 concurrent clients/agents, namely C1, C2, and C3, and 3 locks, [1], [2], and [3].

  • C1 requests locks in the order of [[1], [2], [3]]
  • C2 requests locks in the order of [[2], [3], [1]]
  • C3 requests locks in the order of [[3], [1], [2]]

Here is how the bug happened (in sketch):

  1. C1, C2, and C3 competed for the locks.
    Thanks to the deadlock resolution algorithm, C1 and C2 eventually acquired all locks and finished.

  2. During the resolution process, C3 received the lock_info of [2] (via locks_agent:send_indirects/1)
    even though C3 had not yet reached the point of requesting it, which means C3 was not in [2]'s queue.

  3. The locks_server then removed its local lock_info entry for [2] because the queue was now empty.
    This effectively reset the vsn of the lock_info.

  4. C3 then started requesting [2], but the locks_server responded with lock_info that
    had a lower vsn than the one C3 had already been given. As a result, C3 got stuck.
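
To make steps 2 to 4 easier to follow, here is a toy model in Python. It is not taken from the locks source and is not Erlang. Everything in it is an assumption for illustration: the names `ToyLockServer`, `ToyAgent`, and `replay` are hypothetical, the vsn is assumed to be a counter that increases on every queue change, and the agent is assumed to discard any lock_info whose vsn is not higher than the one it last saw. The real locks_server and locks_agent are more involved.

```python
class ToyLockServer:
    """Hypothetical stand-in for a lock server; not the real locks_server."""

    def __init__(self):
        # lock id -> {"vsn": int, "queue": [client, ...]}
        self.entries = {}

    def _info(self, lock):
        entry = self.entries[lock]
        return {"lock": lock, "vsn": entry["vsn"], "queue": list(entry["queue"])}

    def request(self, lock, client):
        # A missing entry is created from scratch, so its vsn restarts at 0.
        entry = self.entries.setdefault(lock, {"vsn": 0, "queue": []})
        entry["queue"].append(client)
        entry["vsn"] += 1
        return self._info(lock)

    def release(self, lock, client):
        entry = self.entries[lock]
        entry["queue"].remove(client)
        entry["vsn"] += 1
        if not entry["queue"]:
            # Step 3: dropping the entry also forgets the vsn.
            del self.entries[lock]
            return None
        return self._info(lock)


class ToyAgent:
    """Hypothetical stand-in for an agent; not the real locks_agent."""

    def __init__(self, name):
        self.name = name
        self.known_vsn = {}  # lock id -> highest vsn seen so far

    def receive(self, info):
        """Return True if the info is accepted, False if discarded as stale."""
        if info["vsn"] <= self.known_vsn.get(info["lock"], 0):
            return False
        self.known_vsn[info["lock"]] = info["vsn"]
        return True


def replay():
    server = ToyLockServer()
    c3 = ToyAgent("C3")

    server.request(2, "C2")
    indirect = server.request(2, "C1")
    c3.receive(indirect)            # step 2: C3 learns about [2] indirectly

    server.release(2, "C2")
    server.release(2, "C1")         # step 3: queue empty, entry removed

    fresh = server.request(2, "C3")  # step 4: entry recreated, vsn restarts
    accepted = c3.receive(fresh)
    return indirect["vsn"], fresh["vsn"], accepted


if __name__ == "__main__":
    print(replay())
```

In this model `replay()` returns `(2, 1, False)`: C3 was told about vsn 2 indirectly, the recreated entry answers with vsn 1, and C3 discards it as stale, which corresponds to C3 getting stuck.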

I tried to fix this by not removing lock_info entries in locks_server, but with that change the test fails in other ways. Perhaps keeping the entries breaks the algorithm?
