feat(block-sync): add support for follower mode#5556

Merged
swift1337 merged 6 commits into krakatoa from feat/rpc-mode on Jan 8, 2026

Conversation

@swift1337 swift1337 (Member) commented Jan 7, 2026

This PR introduces follower mode for non-validating nodes that do not participate in block production. It addresses block discrepancies between validator and non-validator nodes when running over libp2p and/or when block times are short.

config.toml

[blocksync]
version = "v0"

# Experimental follower mode (bool):
#
# If enabled, the node will perpetually rely on block-sync to catch up.
# This is useful for RPC-only nodes that don't need to participate in consensus.
#
# This will be ignored if the node is a validator.
follower_mode = true

Follower nodes request statuses (min-max available height) every second (regular nodes do this every 10 seconds), keeping pace with the validators.

If the node is a validator, this mode is ignored.

Note that since the consensus reactor is perpetually "waiting", follower nodes:

  • Don't relay consensus messages (this can be improved in follow-up PRs)
  • Don't track cometbft_consensus_* metrics (use cometbft_consensus_blocksync_* instead)

Changes

  • config.toml: add blocksync.follower_mode param
  • implement follower mode in blocksync; adopt it in the mempool & consensus reactors
  • make most of blocksyncReactor.Receive() logic non-blocking
  • minor blocksync reactor code formatting
  • other minor changes to wire this feature

Closes STACK-2047

@swift1337 swift1337 changed the base branch from main to krakatoa January 7, 2026 21:11
@swift1337 swift1337 self-assigned this Jan 7, 2026
@swift1337 swift1337 changed the title feat(block-sync): follower mode feat(block-sync): add support for follower mode Jan 7, 2026
@swift1337 swift1337 marked this pull request as ready for review January 7, 2026 22:45

Comment on lines +324 to 336
 	go r.respondToPeer(msg, e.Src)
 case *bcproto.BlockResponse:
-	go bcR.handlePeerResponse(msg, e.Src)
 	// adds block to the pool
+	go r.handlePeerResponse(msg, e.Src)
 case *bcproto.StatusRequest:
 	// Send peer our state.
-	e.Src.TrySend(p2p.Envelope{
+	go e.Src.TrySend(p2p.Envelope{
 		ChannelID: BlocksyncChannel,
 		Message: &bcproto.StatusResponse{
-			Height: bcR.store.Height(),
-			Base:   bcR.store.Base(),
+			Height: r.store.Height(),
+			Base:   r.store.Base(),
 		},
 	})
Collaborator commented:
I guess in libp2p these are already running in goroutines, since we have parallel reactor message processing, right? Is there a pro/con or reason to run these in goroutines instead of just synchronously?

swift1337 (Member, Author) replied:
In libp2p, yes, but not in comet-p2p.

On the other hand, even though lp2p is concurrent, should Receive() wait for another p2p request to be sent (TrySend)? I see a potential downside in spawning more goroutines and creating congestion on the Go scheduler, but here we have a reasonable number of goroutines, imo (one-to-one with messages).

Collaborator replied:
I don't feel super strongly here, but I do think that handling all of these in their own goroutines is a bit of a premature optimization that may have an impact on other places where goroutine scheduling is precious, as we have seen with RPC requests.

should Receive() wait for another p2p request to be sent (TrySend)

I totally agree that no, Receive() shouldn't have to wait for TrySend, or for loading a block from storage, etc., when these are already synchronized internally with locks, so there's no reason to block the shared receive func. But I'm also not sure spawning a goroutine for each request is the best way to avoid that (or that we even should avoid or optimize that, since we haven't really seen it having any impact yet; maybe you have seen it, or have some data on this from testing?).

I think pushing these onto an internal queue and having a set number of workers pulling messages off and processing them would make more sense, so we don't have an unbounded number of goroutines here (I do see it's 1-1 with p2p messages, but if a bunch of nodes are far behind and we are getting spammed with BlockRequest messages, that could potentially be a lot of goroutines to spin up).
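The queue-plus-workers alternative suggested here could be sketched as follows. This is a hedged illustration with invented names (envelope, processWithWorkers), not the reactor's actual API: a bounded channel provides backpressure, and a fixed worker pool caps concurrency regardless of how many BlockRequest messages arrive.

```go
package main

import (
	"fmt"
	"sync"
)

type envelope struct{ id int } // stand-in for p2p.Envelope

// processWithWorkers drains msgs with a fixed pool of workers instead of
// spawning one goroutine per message, and returns the number handled.
func processWithWorkers(msgs []envelope, numWorkers int) int {
	queue := make(chan envelope, 64) // bounded: backpressure, not unbounded goroutines

	var wg sync.WaitGroup
	var mu sync.Mutex
	processed := 0

	for w := 0; w < numWorkers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for msg := range queue {
				_ = msg // a real handler would dispatch on message type here
				mu.Lock()
				processed++
				mu.Unlock()
			}
		}()
	}

	// Receive() side: enqueue without blocking on per-message work.
	for _, m := range msgs {
		queue <- m
	}
	close(queue)
	wg.Wait()
	return processed
}

func main() {
	fmt.Println(processWithWorkers(make([]envelope, 100), 4)) // prints 100
}
```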

swift1337 (Member, Author) replied:
For the sake of the PoC, I'd keep things as is and revisit later.

so we don't have an unbounded number of goroutines here

We can configure the lp2p consumer queue per reactor, or configure an overall lp2p rate limiter for these streams.

Based on IB runs, there are not many BLOCKSYNC messages:

[image attachment]

@cometbft cometbft deleted a comment from linear bot Jan 8, 2026
@mattac21 mattac21 (Collaborator) left a review:
Approved, with a comment on Eric's comment.


@swift1337 swift1337 merged commit 40753e1 into krakatoa Jan 8, 2026
31 of 32 checks passed
@swift1337 swift1337 deleted the feat/rpc-mode branch January 8, 2026 16:57