Block stream pipelining #2581

@leoyvens

The subgraph processing loop is currently entirely sequential. Pipelining is a way to parallelize it. The first target for this is the block stream, so that while one block is being processed, the next blocks are already being scanned for triggers. This could reduce indexing time by up to 50%, though for most subgraphs that number is more like 20%. A few points to consider:

The block stream must be made independent from the subgraph store

Currently the block stream depends on a subgraph_store:

subgraph_store: Arc<dyn WritableStore>,

This is for two reasons. First, to keep track of the current block pointer for the subgraph:
let subgraph_ptr = ctx.subgraph_store.block_ptr()?;

which can be solved simply by having the block stream keep track of the last block pointer it emitted. Second, to update the synced status of the subgraph:
fn update_subgraph_synced_status(&self) -> Result<(), Error> {

This responsibility should be moved to the instance manager. Maybe BlockStreamEvent should gain a Done variant to signal that the synced status should be checked.
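To make the two changes above concrete, here is a minimal sketch, not the actual graph-node code. `BlockPtr`, `BlockStream`, `chain_head` and `next_event` are hypothetical simplifications for illustration, and the real `BlockStreamEvent` carries more than is shown. The stream remembers the last pointer it emitted instead of asking the store, and it emits `Done` when it has caught up so that the instance manager can decide whether to mark the subgraph as synced.

```rust
/// Hypothetical, simplified stand-in for the real block pointer type.
#[derive(Clone, Debug, PartialEq)]
struct BlockPtr {
    number: u64,
    hash: String,
}

/// Simplified event type; `Done` is the variant proposed in this issue.
#[derive(Debug, PartialEq)]
enum BlockStreamEvent {
    Block(BlockPtr),
    /// The stream has caught up; the consumer should check the synced status.
    Done,
}

/// Hypothetical block stream that no longer needs a subgraph store.
struct BlockStream {
    /// Last pointer emitted, replacing `ctx.subgraph_store.block_ptr()`.
    last_emitted: Option<BlockPtr>,
    /// Assumed chain head number, fixed here to keep the sketch small.
    chain_head: u64,
}

impl BlockStream {
    /// `start` is the subgraph's pointer at startup, read once by the caller.
    fn new(start: Option<BlockPtr>, chain_head: u64) -> Self {
        BlockStream { last_emitted: start, chain_head }
    }

    fn next_event(&mut self) -> BlockStreamEvent {
        // The next block follows the last one emitted by this stream.
        let next = self.last_emitted.as_ref().map_or(0, |p| p.number + 1);
        if next > self.chain_head {
            return BlockStreamEvent::Done;
        }
        let ptr = BlockPtr { number: next, hash: format!("hash-{}", next) };
        self.last_emitted = Some(ptr.clone());
        BlockStreamEvent::Block(ptr)
    }
}
```

In this sketch the instance manager would react to `Done` by running the logic that currently lives in `update_subgraph_synced_status`.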

The pipelining mechanism

An attractive solution is to implement this as an adapter to the existing block stream. This would be a tokio task that pulls blocks from the block stream and puts them in a channel. Buffering a single range will likely be enough for the optimization to be effective. However, the fact that the block stream scans blocks in batches (ranges) but emits them as individual blocks will require some async cleverness in this adapter. Possibly the block stream itself should emit batches (a stream of Vec<BlockWithTriggers>) and the adapter would expose it as a stream of individual blocks.
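The shape of such an adapter could look roughly like the sketch below. To stay dependency-free it uses a standard library thread and `std::sync::mpsc::sync_channel` as stand-ins for a tokio task and a bounded tokio channel, so it illustrates the idea rather than the async implementation. `Block` and `pipeline` are hypothetical names, and a plain iterator of batches stands in for a block stream that emits `Vec<BlockWithTriggers>`.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Hypothetical stand-in for `BlockWithTriggers`.
type Block = u64;

/// Wraps a source of batches and exposes it as a sequence of single blocks.
/// A background thread keeps scanning ahead while the consumer processes
/// blocks; the bounded channel limits how many batches are buffered.
fn pipeline<I>(batches: I, buffered_batches: usize) -> impl Iterator<Item = Block>
where
    I: Iterator<Item = Vec<Block>> + Send + 'static,
{
    let (tx, rx) = sync_channel::<Vec<Block>>(buffered_batches);

    thread::spawn(move || {
        for batch in batches {
            // `send` blocks while the buffer is full, which provides
            // backpressure. It fails once the consumer is dropped.
            if tx.send(batch).is_err() {
                break;
            }
        }
        // Dropping `tx` here closes the channel and ends the iterator.
    });

    // Flatten the received batches into individual blocks.
    rx.into_iter().flatten()
}
```

With `buffered_batches` set to 1, one scanned range waits in the channel while the producer may already be scanning the next one, which matches the "single range" buffering suggested above.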

Stopwatch metrics

Right now the block stream uses only the "scan_blocks" stopwatch section. We could keep the same name for backwards compatibility with the dashboards, and just make "scan_blocks" mean time spent waiting for the block stream to emit a block.
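As a rough illustration of that meaning of "scan_blocks", the sketch below times only the wait for the next block, not the processing of it. The `Stopwatch` type and its `time` method are hypothetical and are not the graph-node stopwatch API; a plain iterator stands in for the pipelined block stream.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical stopwatch that accumulates time per named section.
#[derive(Default)]
struct Stopwatch {
    sections: HashMap<&'static str, Duration>,
}

impl Stopwatch {
    /// Runs `f` and adds its elapsed time to `section`.
    fn time<T>(&mut self, section: &'static str, f: impl FnOnce() -> T) -> T {
        let start = Instant::now();
        let out = f();
        *self.sections.entry(section).or_default() += start.elapsed();
        out
    }
}

/// Consumes blocks, attributing only the wait for each block to
/// "scan_blocks". Block processing happens outside the timed closure.
fn drain<I: Iterator<Item = u64>>(mut blocks: I, stopwatch: &mut Stopwatch) -> Vec<u64> {
    let mut processed = Vec::new();
    while let Some(block) = stopwatch.time("scan_blocks", || blocks.next()) {
        processed.push(block);
    }
    processed
}
```

When the pipeline keeps up, the measured wait should be close to zero, so the section would show how often processing is stalled on scanning.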
