Description
If use_stale and cache_lock are used together and the worker process doing the update crashes, the updating flag is never reset (and `c->node->count` never decreases), so the cache entry stays stuck in the UPDATING state forever.
To reproduce the issue:
- Set up an upstream with a long thread sleep in it (the sleep is disabled by default).
- Send a request to populate the cache.
- Enable the sleep.
- Send a request and, while it is in flight, crash the worker that is handling it.
- From now on, every request returns the UPDATING cache status, and it never changes.
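The stuck state produced by these steps can be illustrated with a toy model. This is a simplified sketch and not nginx's code: `model_node_t`, `model_lookup`, and `model_finish_update` are hypothetical names, and the real node lives in shared memory with more state than is shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a cache node (not nginx's struct). */
typedef struct {
    bool     expired;   /* cached response is stale */
    bool     updating;  /* some worker is refreshing the entry */
    unsigned count;     /* references held on the node */
} model_node_t;

typedef enum { MODEL_HIT, MODEL_EXPIRED, MODEL_UPDATING } model_status_t;

/* Decide what a new request sees for this entry. */
model_status_t model_lookup(model_node_t *n)
{
    if (n->updating) {
        return MODEL_UPDATING;      /* serve stale while "someone" updates */
    }
    if (n->expired) {
        n->updating = true;         /* this request takes over the update */
        n->count++;
        return MODEL_EXPIRED;
    }
    return MODEL_HIT;
}

/* Runs only if the updating worker survives until the upstream replies. */
void model_finish_update(model_node_t *n)
{
    assert(n->updating);
    n->updating = false;
    n->expired = false;
    n->count--;
}
```

If the worker dies between `model_lookup` returning `MODEL_EXPIRED` and the call to `model_finish_update`, nothing ever clears `updating`, so every later lookup returns `MODEL_UPDATING` and `count` stays elevated.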
I've come up with two solutions that may fix the issue:
- Cache: fixed cache stuck UPDATING if worker crashes #869
  This fixes the crash scenario, but the problem remains when nginx is reloaded: we can't distinguish between a reload and a crash, and in both scenarios we decrement `c->node->count`.
- Keeping track of which cache entries each worker is updating
  Each worker stores a doubly linked list of the cache entries it is currently updating. We then set up a signal handler, and when the worker receives SIGSEGV, the handler iterates through this list and resets the updating flag and count.

The second solution works for both a reload and a crash, but is a bit more complex.
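The second solution could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the actual patch: `cache_node_t`, `update_entry_t`, `track_update`, `untrack_update`, `release_tracked_updates`, and `install_crash_handler` are hypothetical names, and the node is modeled as a plain struct rather than nginx's shared-memory node. Any locking the real node needs is left out; taking locks inside a signal handler is generally unsafe and would need separate care.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical stand-in for the shared cache node. */
typedef struct {
    volatile sig_atomic_t updating;  /* 1 while a worker refreshes the entry */
    volatile sig_atomic_t count;     /* reference count */
} cache_node_t;

/* One record per in-flight update owned by this worker. */
typedef struct update_entry_s update_entry_t;
struct update_entry_s {
    update_entry_t *prev;
    update_entry_t *next;
    cache_node_t   *node;
};

/* Per-worker circular doubly linked list with a sentinel head. */
static update_entry_t updates_head = { &updates_head, &updates_head, NULL };

/* Called when this worker starts updating a cache entry. */
void track_update(update_entry_t *e, cache_node_t *node)
{
    assert(e != NULL && node != NULL);
    node->updating = 1;
    node->count++;
    e->node = node;
    e->next = updates_head.next;
    e->prev = &updates_head;
    updates_head.next->prev = e;
    updates_head.next = e;
}

/* Called when the update completes normally. */
void untrack_update(update_entry_t *e)
{
    assert(e != NULL && e->node != NULL);
    e->prev->next = e->next;
    e->next->prev = e->prev;
    e->prev = e->next = e;
    e->node->updating = 0;
    e->node->count--;
}

/* Reset every entry this worker still owns; returns how many were reset. */
int release_tracked_updates(void)
{
    int released = 0;
    update_entry_t *e = updates_head.next;

    while (e != &updates_head) {
        update_entry_t *next = e->next;
        e->node->updating = 0;
        if (e->node->count > 0) {
            e->node->count--;
        }
        e->prev = e->next = e;
        released++;
        e = next;
    }
    updates_head.next = updates_head.prev = &updates_head;
    return released;
}

/* On a crash signal: clean up, then re-raise with the default action. */
void crash_handler(int signo)
{
    release_tracked_updates();
    signal(signo, SIG_DFL);
    raise(signo);
}

void install_crash_handler(void)
{
    signal(SIGSEGV, crash_handler);
}
```

A worker would call `install_crash_handler()` once at startup, `track_update()` when it begins refreshing an entry, and `untrack_update()` when the refresh finishes. Because the list is per worker, the same `release_tracked_updates()` call could also run on a graceful exit, which is what would let this approach cover the reload case as well.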