Tags: openacid/openraft
fix: a restarted follower should not wait too long to elect. Otherwise the entire cluster hangs
refactor: add defensive check: a PurgedMarker log should never be applied to state machine
fix: RaftCore.entries_cache is inconsistent with storage; remove it.
- When the leader changes, `entries_cache` is cleared. Thus there may be cached entries that won't be applied to the state machine.
- When applying finishes, the applied entries are not removed from the cache. Thus entries could be applied more than once.
fix: install snapshot req with offset GE 0 should not start a new session.
An install-snapshot always ends with a req whose data length is 0 and whose offset is GE 0. If such a req is re-sent, e.g., after a timeout, a receiver that has just finished the previous install-snapshot req (`snapshot_state` is None) and does not reject an install-snapshot req with offset GE 0 will try to install a snapshot with empty data. This results in a `fatal storage error`, since the storage tries to decode empty snapshot data.
- feature: add config `install_snapshot_timeout`.
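The failure mode described in the install-snapshot fix can be illustrated with a minimal sketch. It assumes the intended rule is that only a req at offset 0 may open a session, and that any other offset is rejected when `snapshot_state` is None. The types `InstallSnapshotRequest`, `InstallError`, `Outcome`, and `Receiver` are hypothetical stand-ins for illustration, not the crate's actual API.

```rust
/// Hypothetical, simplified request: only the fields discussed above.
#[derive(Debug, Clone)]
pub struct InstallSnapshotRequest {
    pub offset: u64,
    pub data: Vec<u8>,
    pub done: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub enum InstallError {
    /// The request's offset does not match what the receiver expects.
    SnapshotMismatch { expected_offset: u64, got_offset: u64 },
}

#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    /// Chunk accepted, more chunks expected.
    Buffered,
    /// Final chunk accepted; carries the assembled snapshot bytes.
    Installed(Vec<u8>),
}

#[derive(Default)]
pub struct Receiver {
    /// `None` means no install-snapshot session is in progress.
    snapshot_state: Option<Vec<u8>>,
}

impl Receiver {
    pub fn handle(&mut self, req: InstallSnapshotRequest) -> Result<Outcome, InstallError> {
        if req.offset == 0 {
            // Only offset 0 may open (or restart) a session.
            self.snapshot_state = Some(Vec::new());
        }
        // Without a session, the only acceptable offset is 0.
        let expected = match &self.snapshot_state {
            Some(buf) => buf.len() as u64,
            None => 0,
        };
        let buf = match self.snapshot_state.as_mut() {
            Some(buf) if req.offset == expected => buf,
            // A re-sent final req lands here instead of installing empty data.
            _ => {
                return Err(InstallError::SnapshotMismatch {
                    expected_offset: expected,
                    got_offset: req.offset,
                })
            }
        };
        buf.extend_from_slice(&req.data);
        if req.done {
            // Close the session and hand the assembled bytes to the caller.
            let data = self.snapshot_state.take().unwrap_or_default();
            return Ok(Outcome::Installed(data));
        }
        Ok(Outcome::Buffered)
    }
}
```

In this sketch a duplicated final req (empty data, non-zero offset) is answered with an error, so the storage layer is never asked to decode an empty snapshot.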
refactor: upgrade logging level for important state-changing events
change: change-membership should be log driven but not channel driven
A membership change involves two steps: the joint config phase and the final config phase. Each phase has a corresponding log involved.
Previously raft set up several channels to organize this workflow, which made the logic hard to understand and introduced complexity when restarting or when leadership was transferred: it needed to re-establish the channels and tasks.
According to the gist of raft, all workflow should be log driven. Thus the new approach:
- Write two logs (the joint and the final) at once when it receives a change-membership request.
- All following work is done according to which log is committed.
This simplifies the workflow and makes it more reliable and intuitive to understand.
Related changes:
- When `change_membership` is called, append 2 logs at once.
- Introduce a universal response channel type to send back a message when some internal task is done: `ResponseTx`, and a universal response error type: `ResponseError`.
- The internal response channel is now an `Option<ResponseTx>`, since the first step of membership change does not need to respond to the caller.
- When a new leader is established, if the **last** log is a joint config log, append a final config log so that the partial change-membership can complete. A test is added.
- Removed membership related channels.
- Refactor: convert several funcs from async to sync.
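The log-driven approach above can be sketched roughly as follows. This is an illustrative model only: `Payload`, `Log`, `change_membership` as a method on `Log`, and `complete_partial_change` are hypothetical names, and replication and commit handling are left out.

```rust
use std::collections::BTreeSet;

type NodeId = u64;

/// Hypothetical log payloads; only the membership-related ones matter here.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Payload {
    Normal,
    /// Joint config: both the old and the new member sets take part.
    JointConfig { old: BTreeSet<NodeId>, new: BTreeSet<NodeId> },
    /// Final config: only the new member set remains.
    FinalConfig { members: BTreeSet<NodeId> },
}

#[derive(Default)]
pub struct Log {
    pub entries: Vec<Payload>,
}

impl Log {
    /// Append the joint and the final config entries in one step, so the
    /// rest of the workflow depends only on which entry gets committed.
    pub fn change_membership(&mut self, old: BTreeSet<NodeId>, new: BTreeSet<NodeId>) {
        self.entries.push(Payload::JointConfig { old, new: new.clone() });
        self.entries.push(Payload::FinalConfig { members: new });
    }

    /// Called when a new leader is established: if the last entry is a
    /// joint config, append the matching final config so a partial change
    /// can complete. Returns true if an entry was appended.
    pub fn complete_partial_change(&mut self) -> bool {
        let members = match self.entries.last() {
            Some(Payload::JointConfig { new, .. }) => new.clone(),
            _ => return false,
        };
        self.entries.push(Payload::FinalConfig { members });
        true
    }
}
```

Because the pending state lives entirely in the log, a restarted node or a new leader can recover it by reading the last entry, with no channels or tasks to re-establish.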
change: MembershipConfig.member type is changed from HashSet to BTreeSet
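The commit title does not state the motivation for the HashSet to BTreeSet change. The sketch below only shows two properties of the standard library's `BTreeSet` that `HashSet` lacks: iteration in ascending order, and support for deriving `Hash` and `Ord` on a containing struct. `Membership` and `member_list` are hypothetical names used for illustration.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a membership config. Deriving `Hash` and `Ord`
/// compiles because `BTreeSet<u64>` implements both; `HashSet` does not.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct Membership {
    pub members: BTreeSet<u64>,
}

/// Returns the member ids in ascending order, regardless of insertion order.
pub fn member_list(cfg: &Membership) -> Vec<u64> {
    cfg.members.iter().copied().collect()
}
```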