charliermarsh · 2026-06-20T19:16:45Z

Summary

Avoid retaining reverse-lookup hash tables for completed place tables when linear lookup is cheaper. Builders continue to use the existing hash tables to deduplicate entries; when construction finishes, symbol tables with at most 16 entries and member tables with at most 8 entries discard the index and use allocation-free linear lookup. Larger tables retain the existing `hashbrown::HashTable`.

This supersedes #26156. The thresholds are independently benchmarked because comparing member expressions is more expensive than comparing symbol names. A cutoff of eight avoids that PR's CodSpeed regression for members, while 16 recovers more symbol-table memory without a measurable CPU regression.
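As a rough illustration of the build-then-freeze pattern described above (not the code in this PR), the sketch below keeps an index while building and drops it on `finish` for small tables. The type names, the `SYMBOL_LINEAR_LOOKUP_MAX` constant, and the use of `std::collections::HashMap` in place of `hashbrown::HashTable` are all assumptions made to keep the example dependency-free.

```rust
use std::collections::HashMap;

/// Hypothetical constant mirroring the symbol cutoff described above.
const SYMBOL_LINEAR_LOOKUP_MAX: usize = 16;

/// Builder: always keeps an index so that insertions can be deduplicated.
#[derive(Default)]
struct SymbolTableBuilder {
    names: Vec<String>,
    // Stand-in for `hashbrown::HashTable`. A `HashMap` stores each name a
    // second time as its key, which the real table is designed to avoid.
    index: HashMap<String, usize>,
}

impl SymbolTableBuilder {
    /// Returns the existing ID for `name`, or appends it and returns the new ID.
    fn add(&mut self, name: &str) -> usize {
        if let Some(&id) = self.index.get(name) {
            return id;
        }
        let id = self.names.len();
        self.names.push(name.to_string());
        self.index.insert(name.to_string(), id);
        id
    }

    /// Freezes the table, discarding the index when it is small enough.
    fn finish(self) -> SymbolTable {
        let index = if self.names.len() <= SYMBOL_LINEAR_LOOKUP_MAX {
            None
        } else {
            Some(self.index)
        };
        SymbolTable { names: self.names, index }
    }
}

/// Completed table: the index is only present above the cutoff.
struct SymbolTable {
    names: Vec<String>,
    index: Option<HashMap<String, usize>>,
}

impl SymbolTable {
    fn has_index(&self) -> bool {
        self.index.is_some()
    }

    /// Hashed lookup when an index was retained, linear scan otherwise.
    fn id_of(&self, name: &str) -> Option<usize> {
        match &self.index {
            Some(index) => index.get(name).copied(),
            None => self.names.iter().position(|n| n.as_str() == name),
        }
    }
}
```

A member table would presumably follow the same shape with a cutoff of 8 and a more expensive equality check, which is the motivation given above for the lower threshold.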

Performance

At the final 16/8 cutoffs, the latest CI memory report shows:

| Project | Total retained memory | `semantic_index` |
| --- | --- | --- |
| flake8 | -1.73% | -6.28% |
| trio | -1.32% | -5.46% |
| sphinx | -0.95% | -4.70% |
| prefect | -1.10% | -4.76% |

The initial 8/8 cutoff eliminated #26156's CodSpeed regression: 15 of 23 ty microbenchmarks improved, with the full range between -0.34% and +0.34%, while the project benchmarks ranged from -0.08% to +0.06% (microbenchmarks, projects).

Raising only the symbol cutoff from 8 to 16 was classified as No Change across every benchmark: the microbenchmarks ranged from -0.7% to +0.4% and the project benchmarks from -0.2% to 0.0% (microbenchmark comparison, project comparison). That change saves an additional 0.12–0.18% of total retained memory and 0.64–0.91% of `semantic_index` memory.

Ecosystem distribution

I temporarily instrumented final table construction and post-build lookups across all 162 projects in the ecosystem-analyzer's pinned mypy-primer corpus. The final cutoffs give the two structures similar lookup exposure while avoiding indexes for most tables:

| Structure | Cutoff | Nonempty tables without an index | Nonempty linear lookup share | Comparisons per linear lookup |
| --- | --- | --- | --- | --- |
| Symbols | 16 | 93.55% | 38.56% | 5.56 |
| Members | 8 | 86.20% | 46.50% | 2.58 |

For symbols, increasing the cutoff from 8 to 16 removes another 217,586 indexes across the corpus runs and avoids 29.7 MiB of aggregate map storage while moving 16.64% of lookups to linear search. Applying the same increase to members would avoid only 9.8 MiB while moving another 24.96% of lookups to linear search, raising their nonempty linear lookup share to 71.74%. This supports the asymmetric 16/8 cutoffs.

The aggregate byte totals repeat common dependencies across independent project runs and should be read comparatively, not as whole-process memory totals.

astral-sh-bot · 2026-06-20T19:18:25Z

Typing conformance results

No changes detected ✅

Current numbers

The percentage of diagnostics emitted that were expected errors held steady at 94.37%. The percentage of expected errors that received a diagnostic held steady at 89.00%. The number of fully passing files held steady at 94/134.

astral-sh-bot · 2026-06-20T19:19:12Z

Memory usage report

Summary

| Project | Old | New | Diff | Outcome |
| --- | --- | --- | --- | --- |
| flake8 | 31.36MB | 30.83MB | -1.68% (539.99kB) | ⬇️ |
| trio | 78.60MB | 77.55MB | -1.33% (1.04MB) | ⬇️ |
| sphinx | 195.80MB | 193.94MB | -0.95% (1.86MB) | ⬇️ |
| prefect | 524.25MB | 518.49MB | -1.10% (5.75MB) | ⬇️ |

Significant changes

flake8

| Name | Old | New | Diff | Outcome |
| --- | --- | --- | --- | --- |
| `semantic_index` | 8.55MB | 8.02MB | -6.28% (550.53kB) | ⬇️ |
| `parsed_module` | 9.77MB | 9.78MB | +0.11% (10.54kB) | ⬇️ |

trio

| Name | Old | New | Diff | Outcome |
| --- | --- | --- | --- | --- |
| `semantic_index` | 18.99MB | 17.95MB | -5.46% (1.04MB) | ⬇️ |
| `parsed_module` | 15.04MB | 15.04MB | -0.05% (7.83kB) | ⬇️ |

sphinx

| Name | Old | New | Diff | Outcome |
| --- | --- | --- | --- | --- |
| `semantic_index` | 39.57MB | 37.70MB | -4.70% (1.86MB) | ⬇️ |
| `parsed_module` | 18.36MB | 18.36MB | +0.01% (2.04kB) | ⬇️ |

prefect

| Name | Old | New | Diff | Outcome |
| --- | --- | --- | --- | --- |
| `semantic_index` | 120.90MB | 115.15MB | -4.76% (5.76MB) | ⬇️ |
| `parsed_module` | 19.36MB | 19.36MB | +0.03% (5.09kB) | ⬇️ |

astral-sh-bot · 2026-06-20T19:20:54Z

`ecosystem-analyzer` results

No diagnostic changes detected ✅

Full report with detailed diff (timing results)

MichaReiser · 2026-06-22T06:24:42Z

-    /// Map from symbol name to its ID.
-    ///
-    /// Uses a hash table to avoid storing the name twice.


Can we retain this comment?

MichaReiser · 2026-06-22T06:29:20Z

+        self.map
+            .find(SymbolTable::hash_name(name), |id| {
+                self.table.symbols[*id].name == name
+            })
+            .copied()


Should we have our own `SymbolReverseTable` wrapper to avoid duplicating the `find` logic? Same for members.
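One possible shape for such a wrapper is sketched below; it is not the code that was merged. `ReverseTable`, `Table`, and `hash_of` are hypothetical names, and a `HashMap<u64, Vec<usize>>` of hash buckets stands in for `hashbrown::HashTable` so the example needs only the standard library. The point it illustrates is that the index stores IDs only, and the `find` logic (hash, then compare against the entry the ID points to) lives in one place for both symbols and members.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hashes any key with the standard library's default hasher.
fn hash_of<T: Hash + ?Sized>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical shared wrapper: maps a hash to candidate IDs, never to keys.
#[derive(Default)]
struct ReverseTable {
    buckets: HashMap<u64, Vec<usize>>,
}

impl ReverseTable {
    fn insert(&mut self, hash: u64, id: usize) {
        self.buckets.entry(hash).or_default().push(id);
    }

    /// Returns the first ID in the bucket for `hash` that `is_match` accepts.
    fn find(&self, hash: u64, mut is_match: impl FnMut(usize) -> bool) -> Option<usize> {
        self.buckets
            .get(&hash)?
            .iter()
            .copied()
            .find(|&id| is_match(id))
    }
}

/// Hypothetical table usable for both symbols and members.
struct Table<T> {
    entries: Vec<T>,
    reverse: ReverseTable,
}

impl<T: Hash + Eq> Table<T> {
    fn new() -> Self {
        Table { entries: Vec::new(), reverse: ReverseTable::default() }
    }

    /// The single place where the reverse-lookup logic is written.
    fn id_of(&self, key: &T) -> Option<usize> {
        self.reverse.find(hash_of(key), |id| &self.entries[id] == key)
    }

    /// Deduplicating insert: returns the existing ID when `key` is present.
    fn add(&mut self, key: T) -> usize {
        if let Some(id) = self.id_of(&key) {
            return id;
        }
        let id = self.entries.len();
        self.reverse.insert(hash_of(&key), id);
        self.entries.push(key);
        id
    }
}
```

Under these assumptions, a symbol table could be a `Table<String>` and a member table a `Table` over whatever type represents a member expression, with both sharing `id_of`.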

MichaReiser · 2026-06-22T06:30:34Z

I'm still not 100% sure this is worth it, but it seems mostly fine.

[ty] Avoid lookup maps for small place tables

c815b4f

astral-sh-bot Bot added the ty Multi-file analysis & type inference label Jun 20, 2026

Reduce linear lookup threshold

990313f

charliermarsh force-pushed the charlie/codex-compact-place-lookups branch from b42783d to 990313f Compare June 20, 2026 20:45

charliermarsh changed the title ~~[ty] Use compact lookups for small place tables~~ [ty] Avoid lookup maps for small place tables Jun 20, 2026

charliermarsh added 2 commits June 22, 2026 01:10

Use separate lookup thresholds

2954ad6

Document linear lookup thresholds

44b6e47

charliermarsh marked this pull request as ready for review June 22, 2026 01:54

charliermarsh requested a review from a team as a code owner June 22, 2026 01:54

charliermarsh requested a review from MichaReiser June 22, 2026 01:54

astral-sh-bot Bot requested a review from carljm June 22, 2026 01:54

charliermarsh added the performance Potential performance improvement label Jun 22, 2026

charliermarsh removed the request for review from carljm June 22, 2026 01:54

MichaReiser approved these changes Jun 22, 2026

Encapsulate place table reverse lookups

1b825e5

charliermarsh merged commit 045e2f6 into main Jun 22, 2026
60 checks passed

charliermarsh deleted the charlie/codex-compact-place-lookups branch June 22, 2026 12:32

[ty] Avoid lookup maps for small place tables #26177

charliermarsh merged 5 commits into main from charlie/codex-compact-place-lookups
