
Conversation

@BubbleCal
Contributor

create_hnsw_sq(100000x128)
                        time:   [7.1499 s 7.1644 s 7.1840 s]
                        change: [-1.1794% -0.9172% -0.6107%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 2 outliers among 10 measurements (20.00%)
  1 (10.00%) high mild
  1 (10.00%) high severe

search_hnsw_sq100000x128
                        time:   [253.49 µs 253.87 µs 254.24 µs]
                        change: [-3.6161% -3.4660% -3.3038%] (p = 0.00 < 0.05)
                        Performance has improved.

@github-actions
Contributor

Code Review

P1: Inconsistent optimization in distance_all() method

The PR optimizes distance() by pre-computing the scale factor and using direct multiplication (dist * self.scale), but the distance_all() method at lines 446-468 (in the new file) still uses inverse_scalar_dist() which allocates a Vec and recomputes the scale on every call.

For consistency and to get the full performance benefit, distance_all() should also use the pre-computed self.scale field:

fn distance_all(&self, _k_hint: usize) -> Vec<f32> {
    match self.storage.distance_type {
        DistanceType::L2 | DistanceType::Cosine => self
            .storage
            .chunks
            .iter()
            .flat_map(|c| {
                c.sq_codes
                    .values()
                    .chunks_exact(c.dim())
                    .map(|sq_codes| l2_distance_uint_scalar(sq_codes, &self.query_sq_code) * self.scale)
            })
            .collect(),
        DistanceType::Dot => self
            .storage
            .chunks
            .iter()
            .flat_map(|c| {
                c.sq_codes
                    .values()
                    .chunks_exact(c.dim())
                    .map(|sq_codes| dot_distance(sq_codes, &self.query_sq_code) * self.scale)
            })
            .collect(),
        _ => panic!("We should not reach here: sq distance can only be L2 or Dot"),
    }
}

This would:

  1. Eliminate the per-call scale computation in inverse_scalar_dist
  2. Avoid the intermediate Vec allocation from collect_vec() inside inverse_scalar_dist
  3. Maintain consistency between the two distance calculation paths
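
To illustrate the three points above, here is a minimal Rust sketch of the L2 path only. It is not the Lance implementation: `l2_u8`, `l2_scale`, `distances_two_pass`, and `distances_one_pass` are hypothetical names, and the scale formula assumes plain 8-bit uniform quantization over a `[min, max]` range.

```rust
/// Squared L2 distance between two u8 code vectors, accumulated in u32.
fn l2_u8(a: &[u8], b: &[u8]) -> u32 {
    a.iter()
        .zip(b.iter())
        .map(|(&x, &y)| {
            let d = x as i32 - y as i32;
            (d * d) as u32
        })
        .sum()
}

/// Hypothetical scale factor: with codes spread uniformly over [min, max],
/// one code step is (max - min) / 255, and squared distances scale by step^2.
fn l2_scale(min: f32, max: f32) -> f32 {
    let step = (max - min) / 255.0;
    step * step
}

/// Two-pass style: collects raw distances into an intermediate Vec,
/// then recomputes the scale on every call before rescaling.
fn distances_two_pass(codes: &[u8], dim: usize, query: &[u8], min: f32, max: f32) -> Vec<f32> {
    let raw: Vec<u32> = codes.chunks_exact(dim).map(|c| l2_u8(c, query)).collect();
    let scale = l2_scale(min, max);
    raw.into_iter().map(|d| d as f32 * scale).collect()
}

/// One-pass style: the caller computes the scale once (for example when the
/// distance calculator is built) and it is applied inside the iterator chain.
fn distances_one_pass(codes: &[u8], dim: usize, query: &[u8], scale: f32) -> Vec<f32> {
    codes
        .chunks_exact(dim)
        .map(|c| l2_u8(c, query) as f32 * scale)
        .collect()
}
```

Both functions return the same values. The one-pass version allocates only the output Vec and leaves the scale computation to whoever constructs the calculator, which is the shape the suggested `distance_all()` has.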

Otherwise the changes look good: the dot_u8_scalar optimization with unrolling and the pre-computed scale factor are solid performance improvements.
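
For readers unfamiliar with the unrolling technique, the sketch below shows one common way to unroll a u8 dot product by hand. It is not the PR's `dot_u8_scalar` code: `dot_u8_unrolled` is a hypothetical name, and the choice of four independent accumulators is an assumption made for illustration.

```rust
/// Dot product of two u8 slices, unrolled four elements at a time.
/// Four separate accumulators break the dependency chain between
/// consecutive additions, which tends to help the compiler vectorize.
fn dot_u8_unrolled(a: &[u8], b: &[u8]) -> u32 {
    assert_eq!(a.len(), b.len(), "inputs must have the same length");

    // Split each slice into a part whose length is a multiple of 4 and a tail.
    let main_len = a.len() / 4 * 4;
    let (a_main, a_tail) = a.split_at(main_len);
    let (b_main, b_tail) = b.split_at(main_len);

    let mut acc = [0u32; 4];
    for (ca, cb) in a_main.chunks_exact(4).zip(b_main.chunks_exact(4)) {
        acc[0] += ca[0] as u32 * cb[0] as u32;
        acc[1] += ca[1] as u32 * cb[1] as u32;
        acc[2] += ca[2] as u32 * cb[2] as u32;
        acc[3] += ca[3] as u32 * cb[3] as u32;
    }

    // Handle the 0 to 3 leftover elements with a plain loop.
    let tail: u32 = a_tail
        .iter()
        .zip(b_tail.iter())
        .map(|(&x, &y)| x as u32 * y as u32)
        .sum();

    acc.iter().sum::<u32>() + tail
}
```

Widening each u8 to u32 before multiplying avoids overflow of the individual products. The u32 accumulators can still overflow for very long inputs, so a real implementation would need to bound the vector length or widen further.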

@codecov

codecov bot commented Dec 30, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Signed-off-by: BubbleCal <[email protected]>