Fix binary I/O for graphs with deleted nodes #1385

Schwarf · 2026-01-10T20:43:46Z

Background

The NetworKit binary graph format (.nkb) is positional and node-ID based:
the reader decodes one entry per node ID u ∈ [0, header.nodes).
However, the writer previously serialized data only for existing nodes, which misaligned the stream whenever a graph contained deleted nodes (non-contiguous IDs). Reading such files could trigger assertions and crashes (see #1278).
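
To make the misalignment concrete, here is a minimal sketch of a positional per-node stream. It is a toy model and not the actual .nkb layout: `write_records`, `read_records`, the list-based stream, and the "degree followed by neighbours" record shape are all assumptions introduced for illustration.

```python
# Toy positional stream (NOT the real .nkb layout): for every node ID,
# one degree entry followed by that many neighbour IDs.

def write_records(adj, upper_bound, skip_deleted):
    stream = []
    for u in range(upper_bound):
        if u not in adj:              # u is a deleted node
            if not skip_deleted:
                stream.append(0)      # explicit zero-degree record
            continue                  # skip_deleted=True writes nothing at all
        stream.append(len(adj[u]))
        stream.extend(adj[u])
    return stream

def read_records(stream, upper_bound):
    """Decode exactly one record per node ID in [0, upper_bound)."""
    decoded, pos = {}, 0
    for u in range(upper_bound):
        deg = stream[pos]
        decoded[u] = stream[pos + 1 : pos + 1 + deg]
        pos += 1 + deg
    return decoded

adj = {0: [2], 2: [0, 3], 3: [2]}     # node 1 has been deleted
upper_bound = 4

aligned = read_records(write_records(adj, upper_bound, skip_deleted=False), upper_bound)

try:
    read_records(write_records(adj, upper_bound, skip_deleted=True), upper_bound)
    misaligned_failed = False
except IndexError:
    # Node 2's record was decoded as node 1's, and the reader ran off the end.
    misaligned_failed = True
```

In this sketch the skipping writer produces one record fewer than the reader expects, so every record after the hole is attributed to the wrong node ID before the read finally fails.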

Summary of changes

Use upperNodeIdBound() instead of numberOfNodes()
numberOfNodes() counts only existing nodes, but the binary format must cover the entire node-ID space, including holes. Using upperNodeIdBound() ensures writer and reader agree on the node range.
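
The difference between the two counters can be sketched with a small stand-in class. `ToyGraph` and its snake_case methods are hypothetical; they only mimic the behaviour of `numberOfNodes()` and `upperNodeIdBound()` as described above.

```python
class ToyGraph:
    """Hypothetical stand-in that mirrors the two counters discussed above."""

    def __init__(self, n):
        self.exists = [True] * n      # exists[u] becomes False once u is deleted

    def remove_node(self, u):
        self.exists[u] = False

    def number_of_nodes(self):
        return sum(self.exists)       # counts existing nodes only

    def upper_node_id_bound(self):
        return len(self.exists)       # covers the whole ID space, holes included

g = ToyGraph(5)
g.remove_node(1)
g.remove_node(3)

# IDs that a loop bounded by each counter would visit.
ids_by_count = list(range(g.number_of_nodes()))
ids_by_bound = list(range(g.upper_node_id_bound()))
surviving = [u for u in ids_by_bound if g.exists[u]]
```

Here a loop bounded by the node count stops at ID 2 and never reaches the surviving node 4, while a loop bounded by the upper ID bound visits every ID, deleted or not.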
Iterate over node IDs instead of G.forNodes(...)
G.forNodes(...) skips deleted node IDs.
All writer loops that emit per-node positional data must therefore iterate over node IDs and explicitly encode deleted nodes as zero-degree entries. This applies to:
- base node flags
- adjacency lists
- transpose lists
- weights
- edge IDs
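
A rough sketch of such a writer loop follows. The section names, the string flags, and `write_sections` are assumptions chosen for readability; the real writer emits binary data and more sections than shown here.

```python
def write_sections(exists, out_adj, weights):
    """Emit one entry per node ID in every section, including deleted IDs."""
    flags, adjacency, weight_section = [], [], []
    for u in range(len(exists)):          # iterate over IDs, not over existing nodes
        if not exists[u]:
            flags.append("deleted")
            adjacency.append([])          # zero-degree placeholder
            weight_section.append([])     # no edges, hence no weights
            continue
        flags.append("present")
        adjacency.append(list(out_adj[u]))
        weight_section.append([weights[(u, v)] for v in out_adj[u]])
    return flags, adjacency, weight_section

exists = [True, False, True]              # node 1 has been deleted
out_adj = {0: [2], 2: [0]}
weights = {(0, 2): 1.5, (2, 0): 1.5}

flags, adjacency, weight_section = write_sections(exists, out_adj, weights)
```

Every section ends up with exactly one entry per node ID, which is the invariant a positional reader relies on.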
Explicitly handle deleted nodes during chunk size computation
Even deleted nodes must contribute a degree entry (0) so that offset tables match the actual byte layout.
Without this, adjacency data is decoded at incorrect positions.
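
The effect on offsets can be illustrated with a simplified size model. This sketch assumes every record occupies `1 + degree` units and treats each node as its own chunk; the actual .nkb encoding and chunking differ, and `record_size` and `offsets_for` are hypothetical helpers.

```python
def record_size(degree):
    # Assumed unit-size encoding: one degree entry plus one entry per neighbour.
    return 1 + degree

def offsets_for(degrees):
    """Return the start offset of every record and the total size."""
    offsets, pos = [], 0
    for d in degrees:
        offsets.append(pos)
        pos += record_size(d)
    return offsets, pos

degrees_by_id = [1, 0, 2, 1]          # node 1 deleted, counted with degree 0
degrees_existing_only = [1, 2, 1]     # deleted node left out of the computation

offsets_ok, total_ok = offsets_for(degrees_by_id)
offsets_bad, total_bad = offsets_for(degrees_existing_only)
```

With the hole counted, node 2 starts at offset 3; without it, the table places the third entry at offset 5 and the total size is one unit short of what a writer that emits the zero-degree record actually produces.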
Fix inverted DELETED_BIT semantics
The previous implementation had inverted semantics for DELETED_BIT in both writer and reader:
- the writer set DELETED_BIT for existing nodes
- the reader removed nodes when DELETED_BIT was not set
These two inversions accidentally canceled out for dense graphs (no deleted nodes), masking the bug. This PR fixes the semantics to match the name and intended meaning:
- the writer sets DELETED_BIT only for deleted nodes
- the reader removes nodes when DELETED_BIT is set
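
Why the two inversions masked each other can be sketched as follows. The value of `DELETED_BIT` and the four helper functions are hypothetical; the toy only models the flag itself, not the rest of the stream.

```python
DELETED_BIT = 0x1  # hypothetical bit value, chosen for illustration only

# Previous (inverted) behaviour as described above.
def write_flag_old(exists):
    return DELETED_BIT if exists else 0

def read_exists_old(flag):
    return bool(flag & DELETED_BIT)       # node removed when the bit was NOT set

# Fixed behaviour: the bit means what its name says.
def write_flag_new(exists):
    return 0 if exists else DELETED_BIT

def read_exists_new(flag):
    return not (flag & DELETED_BIT)       # node removed when the bit IS set

old_roundtrip = [read_exists_old(write_flag_old(e)) for e in (True, False)]
new_roundtrip = [read_exists_new(write_flag_new(e)) for e in (True, False)]
mixed = read_exists_new(write_flag_old(True))  # old file flag, fixed reader
```

Both matched pairs round-trip correctly, which is why the inversion went unnoticed. Combining an old-style flag with the fixed reader flips the meaning in this toy, which suggests that files written before the fix may be interpreted differently afterwards.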

Result

Existing tests pass, including testWriteReadNonContinuous and testWriteReadNonContinuousDirected, which are mentioned in #1278 (NetworkitBinaryReader/Writer + deleted nodes + deleted edges).

Fixes #1278

coveralls · 2026-01-10T21:32:24Z

Pull Request Test Coverage Report for Build 21291359119

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

For more information on this, see Tracking coverage changes with pull request builds.
To avoid this issue with future PRs, see these Recommended CI Configurations.
For a quick fix, rebase this PR at GitHub. Your next report should be accurate.

Details

39 of 80 (48.75%) changed or added relevant lines in 2 files are covered.
2 unchanged lines in 1 file lost coverage.
Overall coverage decreased (-0.04%) to 79.373%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
networkit/cpp/io/NetworkitBinaryReader.cpp	12	53	22.64%

Files with Coverage Reduction	New Missed Lines	%
networkit/flow.pyx	2	95.12%

Totals
Change from base Build 20781110983:	-0.04%
Covered Lines:	29538
Relevant Lines:	37214

💛 - Coveralls

Schwarf · 2026-01-11T11:47:20Z

Hi @fabratu,

I’m a bit puzzled by the CPython 3.13 failures. They don’t seem related to this PR (CPython 3.10 builds pass).

The failure is in testEigenvectorsReverse, which checks a specific eigenvector entry. Since eigenvectors are only defined up to a scalar, this seems brittle and likely sensitive to SciPy / BLAS / compiler differences.

I suspect this is an environment issue rather than a regression from this PR. Happy to adjust the test (e.g. residual check) or follow your preferred approach.
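
For reference, a residual check of the kind mentioned above could look roughly like this. It is a standalone sketch with a made-up 2x2 matrix, not the actual testEigenvectorsReverse code; `matvec` and `residual_norm` are hypothetical helpers.

```python
import math

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def residual_norm(A, lam, v):
    """Euclidean norm of A v - lam v; zero for any exact eigenpair."""
    Av = matvec(A, v)
    return math.sqrt(sum((av - lam * x) ** 2 for av, x in zip(Av, v)))

A = [[2.0, 1.0], [1.0, 2.0]]          # illustrative matrix with eigenvalue 3
lam = 3.0
v = [1 / math.sqrt(2), 1 / math.sqrt(2)]
v_flipped = [-x for x in v]           # same eigenvector, opposite sign
v_scaled = [5 * x for x in v]         # same eigenvector, different scale

residuals = [residual_norm(A, lam, w) for w in (v, v_flipped, v_scaled)]
```

All three residuals are numerically zero even though the individual entries differ, so a check like this does not depend on the sign or scaling a particular SciPy / BLAS build happens to return.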

fabratu · 2026-01-13T13:35:17Z

You are right, this is an environmental regression (also happening on master). It appears that the newly released SciPy 1.17.0 computes the eigenvalue / eigenvector as zero for our test matrix, so the result deviates too much from the expected one.

I have not yet looked into the respective code; for 1.16.2 (and below), we get the correct answer. I have filed a bug report and will open a PR that temporarily pins SciPy to <1.17.0.

Will also add a review shortly.

Schwarf added 4 commits January 10, 2026 17:27

Add failing test cases.

ecd74e3

Apply fixes in reader and writer.

90b092a

Fix deleted-node flag handling in binary reader

11e0f5a

Fix format.

f106fa7

Schwarf changed the title ~~Fix binary I/O for graphs with deleted nodes (sparse node IDs)~~ Fix binary I/O for graphs with deleted nodes Jan 10, 2026

Schwarf force-pushed the fix/binary_reader_writer_deleted_nodes branch 2 times, most recently from 73439b2 to b06f7da Compare January 11, 2026 10:12

Extend tests.

e1c5cde

Schwarf force-pushed the fix/binary_reader_writer_deleted_nodes branch from b06f7da to e1c5cde Compare January 11, 2026 10:46

fabratu added the bug label Jan 13, 2026

fabratu self-assigned this Jan 13, 2026

Retrigger CI.

7539f16
