fix: make cluster produce cluster_state:ok on bootstrap #339
Merged
three bugs caused `make cluster` to show `cluster_state:fail`, `cluster_slots_assigned:5461`, and `cluster_known_nodes:5`.

fix a: after creating the bootstrap ClusterState, populate gossip.local_slots with all 16384 owned slots. previously it was left empty, so every Welcome reply advertised zero slots to joining nodes, which then gossipped back a SlotsChanged(node1, []) event that wiped the canonical slot assignment.

fix b: insert the placeholder node into state.nodes *before* sending the UDP join packet. the gossip receive task can process the Welcome reply before cluster_meet resumes, leaving both a real entry and a stale placeholder in state.nodes (and therefore cluster_known_nodes:5 instead of 3).

fix c: set local_node.config_epoch = 1 inside single_node() so that cluster_my_epoch matches cluster_current_epoch on the bootstrap node.

defensive: skip external SlotsChanged events for the local node. the node is authoritative for its own slots and should never accept gossip overrides of its own ownership.

script: remove the redundant addslotsrange 0 5460 on node-1 (bootstrap already owns those slots), and bump the convergence sleep from 0.5s to 1.5s so gossip has time to propagate before cluster info is printed.
summary

`make cluster` was showing `cluster_state:fail`, `cluster_slots_assigned:5461`, and `cluster_known_nodes:5` instead of the expected healthy state. three bugs were responsible.

bug a: bootstrap node never populates `gossip.local_slots`

`GossipEngine` initializes `local_slots = []`. When the bootstrap flag is set, `ClusterState::single_node()` assigns all 16384 slots into `state.slot_map`, but this was never reflected in `gossip.local_slots`. Every `Welcome` reply sent `slots: []` to joining nodes, which then queued a `SlotsChanged(node1, [])` update. When that looped back to node 1, the handler cleared all 16384 slots from `slot_map`. The subsequent `addslotsrange 0 5460` in the script then "succeeded" (slots were free) and left only 5461 assigned.

fix: populate `gossip.local_slots` from the bootstrap `ClusterState` immediately after construction, before the engine is placed behind a `Mutex`.

bug b: race condition in `cluster_meet` leaves stale placeholders

`cluster_meet` previously ran in this order: (1) build gossip message, (2) send UDP, (3) insert placeholder. The gossip receive task can process the `Welcome` reply between steps 2 and 3, inserting the real node. Then step 3 adds the placeholder under a fake ID, leaving both in `state.nodes`. Two `CLUSTER MEET` calls → 5 entries (the local node, two real entries, and two stale placeholders) → `cluster_known_nodes:5`.

fix: insert the placeholder before sending the UDP packet. The gossip lock is already released at that point so there is no deadlock.
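The ordering problem in bug b can be modeled without any real networking. The sketch below is a minimal, single-threaded simulation, not the project's code: `State`, `on_welcome`, `insert_placeholder`, the `placeholder-` ID prefix, and the addresses are all hypothetical names chosen for illustration. It also assumes that the `Welcome` handler removes a placeholder for the same address when it records the real node. The "receive task wins the race" case is simulated by calling the handler at the point where the UDP send would happen.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for the cluster's node table
/// (node id -> address).
#[derive(Default)]
pub struct State {
    pub nodes: HashMap<String, String>,
}

impl State {
    /// Assumed Welcome handling: drop any placeholder registered for this
    /// address, then record the real node under its real id.
    pub fn on_welcome(&mut self, real_id: &str, addr: &str) {
        self.nodes
            .retain(|id, a| !(id.starts_with("placeholder-") && a.as_str() == addr));
        self.nodes.insert(real_id.to_string(), addr.to_string());
    }

    /// Register a placeholder under a fake id until the real id is known.
    pub fn insert_placeholder(&mut self, addr: &str) {
        self.nodes
            .insert(format!("placeholder-{addr}"), addr.to_string());
    }
}

/// Old order: send first. The Welcome reply is handled before the
/// placeholder exists, so the placeholder is never cleaned up.
pub fn meet_send_first(state: &mut State, addr: &str, real_id: &str) {
    state.on_welcome(real_id, addr); // reply processed right after the send
    state.insert_placeholder(addr); // inserted too late, stays behind
}

/// Fixed order: placeholder first, so the Welcome reply can replace it.
pub fn meet_placeholder_first(state: &mut State, addr: &str, real_id: &str) {
    state.insert_placeholder(addr);
    state.on_welcome(real_id, addr);
}

/// Node count on the bootstrap node after two simulated CLUSTER MEET calls.
pub fn known_nodes_after_two_meets(placeholder_first: bool) -> usize {
    let mut state = State::default();
    state
        .nodes
        .insert("node-1".to_string(), "127.0.0.1:6379".to_string());
    for (id, addr) in [("node-2", "127.0.0.1:6380"), ("node-3", "127.0.0.1:6381")] {
        if placeholder_first {
            meet_placeholder_first(&mut state, addr, id);
        } else {
            meet_send_first(&mut state, addr, id);
        }
    }
    state.nodes.len()
}
```

Under these assumptions, `known_nodes_after_two_meets(false)` yields 5 and `known_nodes_after_two_meets(true)` yields 3, matching the `cluster_known_nodes` values described above.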
bug c: bootstrap node's `config_epoch` mismatch

`ClusterState::single_node()` set `config_epoch: 1` on the state but the `ClusterNode` itself defaulted to `config_epoch: 0`, causing `cluster_my_epoch:0` vs `cluster_current_epoch:1`.

fix: set `local_node.config_epoch = 1` inside `single_node()`.

defensive guard: skip any external `SlotsChanged` event that refers to the local node; the node is authoritative for its own slot ownership.

script: removed the redundant `addslotsrange 0 5460` on node-1 (bootstrap already owns those slots) and bumped the convergence sleep from 0.5s to 1.5s.

what was tested

- `cargo check -p ember-server -p ember-cluster` passes cleanly
- `make cluster` should now show `cluster_state:ok`, `cluster_slots_assigned:16384`, `cluster_known_nodes:3`

design considerations
fixes a and b are the load-bearing changes. fix a eliminates the gossip feedback loop that was destroying the canonical slot assignment. fix b closes a genuine TOCTOU race that's hard to reproduce deterministically but occurs consistently under normal async scheduling. the defensive guard is belt-and-suspenders but has no cost on the non-bootstrap path.
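The feedback loop from bug a and the effect of the defensive guard can be illustrated with a small model. This is a sketch under stated assumptions, not the actual handler: `Node`, `bootstrap`, `on_slots_changed`, and the `guard` flag are hypothetical, and the handler is assumed to replace the named node's slot set with the advertised list. The real code presumably applies the guard unconditionally; the flag exists here only so both behaviours can be compared.

```rust
use std::collections::HashMap;

pub const SLOT_COUNT: u16 = 16384;

/// Hypothetical, simplified view of one node's slot table (slot -> owner id).
pub struct Node {
    pub local_id: String,
    pub slot_map: HashMap<u16, String>,
}

impl Node {
    /// Bootstrap node: owns every slot from the start.
    pub fn bootstrap(local_id: &str) -> Self {
        let slot_map = (0..SLOT_COUNT)
            .map(|slot| (slot, local_id.to_string()))
            .collect();
        Node {
            local_id: local_id.to_string(),
            slot_map,
        }
    }

    /// Assumed SlotsChanged handling: the named node's slot set is replaced
    /// by `slots`. With `guard` set, events about the local node are skipped.
    /// Returns true if the event was applied, false if it was ignored.
    pub fn on_slots_changed(&mut self, node_id: &str, slots: &[u16], guard: bool) -> bool {
        if guard && node_id == self.local_id {
            return false; // the local node is authoritative for its own slots
        }
        // Drop everything the node owned before, then apply the new list.
        self.slot_map.retain(|_, owner| owner.as_str() != node_id);
        for &slot in slots {
            self.slot_map.insert(slot, node_id.to_string());
        }
        true
    }

    /// Number of slots that currently have an owner.
    pub fn assigned(&self) -> usize {
        self.slot_map.len()
    }
}
```

In this model, an echoed `SlotsChanged("node-1", [])` with the guard disabled empties the table, and re-adding slots 0 through 5460 afterwards leaves 5461 assigned, which is the symptom reported in the summary. With the guard enabled the same event is ignored and all 16384 slots stay assigned.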