[ty] Avoid lookup maps for small place tables#26177

Merged
charliermarsh merged 5 commits into main from charlie/codex-compact-place-lookups
Jun 22, 2026

Conversation

@charliermarsh

@charliermarsh charliermarsh commented Jun 20, 2026

Member

Summary

Avoid retaining reverse-lookup hash tables for completed place tables when linear lookup is cheaper. Builders continue to use the existing hash tables to deduplicate entries; when construction finishes, symbol tables with at most 16 entries and member tables with at most eight entries discard the index and use allocation-free linear lookup. Larger tables retain the existing hashbrown::HashTable.
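
The build-then-compact flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the type names, the `SYMBOL_LINEAR_CUTOFF` constant, and the use of `std::collections::HashMap` in place of `hashbrown::HashTable` are all assumptions made to keep the example self-contained. Unlike the real tables, this sketch stores each name twice while building.

```rust
use std::collections::HashMap;

/// Hypothetical constant mirroring the 16-entry symbol cutoff from the summary.
const SYMBOL_LINEAR_CUTOFF: usize = 16;

/// Builder that keeps a reverse index so repeated names map to one ID.
#[derive(Default)]
pub struct SymbolTableBuilder {
    names: Vec<String>,
    index: HashMap<String, usize>,
}

impl SymbolTableBuilder {
    /// Returns the ID for `name`, inserting it if it has not been seen yet.
    pub fn add(&mut self, name: &str) -> usize {
        if let Some(&id) = self.index.get(name) {
            return id;
        }
        let id = self.names.len();
        self.names.push(name.to_owned());
        self.index.insert(name.to_owned(), id);
        id
    }

    /// Finishes construction. Tables at or below the cutoff drop the index.
    pub fn finish(self) -> SymbolTable {
        let index = if self.names.len() <= SYMBOL_LINEAR_CUTOFF {
            None
        } else {
            Some(self.index)
        };
        SymbolTable { names: self.names, index }
    }
}

/// Completed table: the index is present only for larger tables.
pub struct SymbolTable {
    names: Vec<String>,
    index: Option<HashMap<String, usize>>,
}

impl SymbolTable {
    /// Looks up a name, hashing when an index exists and scanning otherwise.
    pub fn id_of(&self, name: &str) -> Option<usize> {
        match &self.index {
            Some(index) => index.get(name).copied(),
            None => self.names.iter().position(|n| n.as_str() == name),
        }
    }

    /// Reports whether the reverse index was retained.
    pub fn has_index(&self) -> bool {
        self.index.is_some()
    }
}
```

The lookup result is the same on either path; only the retained memory and the per-lookup cost differ.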

This supersedes #26156. The thresholds are independently benchmarked because comparing member expressions is more expensive than comparing symbol names. A cutoff of eight avoids that PR's CodSpeed regression for members, while 16 recovers more symbol-table memory without a measurable CPU regression.

Performance

At the final 16/8 cutoffs, the latest CI memory report shows:

| Project | Total retained memory | semantic_index |
|---|---|---|
| flake8 | -1.73% | -6.28% |
| trio | -1.32% | -5.46% |
| sphinx | -0.95% | -4.70% |
| prefect | -1.10% | -4.76% |

The initial 8/8 cutoff eliminated #26156's CodSpeed regression: 15 of 23 ty microbenchmarks improved, with the full range between -0.34% and +0.34%, while the project benchmarks ranged from -0.08% to +0.06% (microbenchmarks, projects). Raising only the symbol cutoff from 8 to 16 was classified as No Change across every benchmark: the microbenchmarks ranged from -0.7% to +0.4% and the project benchmarks from -0.2% to 0.0% (microbenchmark comparison, project comparison). That change saves an additional 0.12–0.18% of total retained memory and 0.64–0.91% of semantic_index memory.

Ecosystem distribution

I temporarily instrumented final table construction and post-build lookups across all 162 projects in the ecosystem-analyzer's pinned mypy-primer corpus. The final cutoffs give the two structures similar lookup exposure while avoiding indexes for most tables:

| Structure | Cutoff | Nonempty tables without an index | Nonempty linear lookup share | Comparisons per linear lookup |
|---|---|---|---|---|
| Symbols | 16 | 93.55% | 38.56% | 5.56 |
| Members | 8 | 86.20% | 46.50% | 2.58 |
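
The instrumentation itself is not shown in this PR. As a rough sketch of how a "comparisons per linear lookup" figure could be collected, a linear search can count its own probes. `LookupStats` and `linear_find` are hypothetical names, and plain string slices stand in for the real symbol and member types.

```rust
use std::cell::Cell;

/// Hypothetical counters for post-build lookups.
#[derive(Default)]
pub struct LookupStats {
    lookups: Cell<u64>,
    comparisons: Cell<u64>,
}

impl LookupStats {
    /// Mean number of comparisons per lookup, or `None` before any lookup.
    pub fn mean_comparisons(&self) -> Option<f64> {
        let lookups = self.lookups.get();
        (lookups > 0).then(|| self.comparisons.get() as f64 / lookups as f64)
    }
}

/// Linear search that records one lookup and one comparison per probed entry.
pub fn linear_find(names: &[&str], target: &str, stats: &LookupStats) -> Option<usize> {
    stats.lookups.set(stats.lookups.get() + 1);
    for (id, name) in names.iter().enumerate() {
        stats.comparisons.set(stats.comparisons.get() + 1);
        if *name == target {
            return Some(id);
        }
    }
    None
}
```

A miss probes every entry, so tables with many failed lookups push the mean toward the table length.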

For symbols, increasing the cutoff from 8 to 16 removes another 217,586 indexes across the corpus runs and avoids 29.7 MiB of aggregate map storage while moving 16.64% of lookups to linear search. Applying the same increase to members would avoid only 9.8 MiB while moving another 24.96% of lookups to linear search, raising their nonempty linear lookup share to 71.74%. This supports the asymmetric 16/8 cutoffs.
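
One way to read these figures is as aggregate map storage avoided per percentage point of lookups moved to linear search. The helper below is a hypothetical illustration of that arithmetic using only the numbers quoted above; it is not a metric computed in this PR.

```rust
/// Aggregate MiB of map storage avoided per percentage point of lookups
/// that move from hashed to linear search (hypothetical helper).
pub fn mib_per_lookup_point(mib_avoided: f64, lookup_share_moved_pct: f64) -> f64 {
    mib_avoided / lookup_share_moved_pct
}

/// Symbols, cutoff 8 -> 16: 29.7 MiB avoided, 16.64% of lookups moved.
pub fn symbols_ratio() -> f64 {
    mib_per_lookup_point(29.7, 16.64)
}

/// Members, cutoff 8 -> 16: 9.8 MiB avoided, 24.96% of lookups moved.
pub fn members_ratio() -> f64 {
    mib_per_lookup_point(9.8, 24.96)
}
```

By this measure the symbol increase avoids roughly 1.78 MiB per point against roughly 0.39 MiB per point for members, and member comparisons are also the more expensive of the two.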

The aggregate byte totals repeat common dependencies across independent project runs and should be read comparatively, not as whole-process memory totals.

@astral-sh-bot astral-sh-bot Bot added the ty Multi-file analysis & type inference label Jun 20, 2026
@astral-sh-bot

astral-sh-bot Bot commented Jun 20, 2026


Typing conformance results

No changes detected ✅

Current numbers
The percentage of diagnostics emitted that were expected errors held steady at 94.37%. The percentage of expected errors that received a diagnostic held steady at 89.00%. The number of fully passing files held steady at 94/134.

@astral-sh-bot

astral-sh-bot Bot commented Jun 20, 2026


Memory usage report

Summary

| Project | Old | New | Diff | Outcome |
|---|---|---|---|---|
| flake8 | 31.36MB | 30.83MB | -1.68% (539.99kB) | ⬇️ |
| trio | 78.60MB | 77.55MB | -1.33% (1.04MB) | ⬇️ |
| sphinx | 195.80MB | 193.94MB | -0.95% (1.86MB) | ⬇️ |
| prefect | 524.25MB | 518.49MB | -1.10% (5.75MB) | ⬇️ |

Significant changes


flake8

| Name | Old | New | Diff | Outcome |
|---|---|---|---|---|
| semantic_index | 8.55MB | 8.02MB | -6.28% (550.53kB) | ⬇️ |
| parsed_module | 9.77MB | 9.78MB | +0.11% (10.54kB) | ⬇️ |

trio

| Name | Old | New | Diff | Outcome |
|---|---|---|---|---|
| semantic_index | 18.99MB | 17.95MB | -5.46% (1.04MB) | ⬇️ |
| parsed_module | 15.04MB | 15.04MB | -0.05% (7.83kB) | ⬇️ |

sphinx

| Name | Old | New | Diff | Outcome |
|---|---|---|---|---|
| semantic_index | 39.57MB | 37.70MB | -4.70% (1.86MB) | ⬇️ |
| parsed_module | 18.36MB | 18.36MB | +0.01% (2.04kB) | ⬇️ |

prefect

| Name | Old | New | Diff | Outcome |
|---|---|---|---|---|
| semantic_index | 120.90MB | 115.15MB | -4.76% (5.76MB) | ⬇️ |
| parsed_module | 19.36MB | 19.36MB | +0.03% (5.09kB) | ⬇️ |

@astral-sh-bot

astral-sh-bot Bot commented Jun 20, 2026


ecosystem-analyzer results

No diagnostic changes detected ✅

Full report with detailed diff (timing results)

@charliermarsh charliermarsh force-pushed the charlie/codex-compact-place-lookups branch from b42783d to 990313f June 20, 2026 20:45
@charliermarsh charliermarsh changed the title [ty] Use compact lookups for small place tables [ty] Avoid lookup maps for small place tables Jun 20, 2026
@charliermarsh charliermarsh marked this pull request as ready for review June 22, 2026 01:54
@charliermarsh charliermarsh requested a review from a team as a code owner June 22, 2026 01:54
@astral-sh-bot astral-sh-bot Bot requested a review from carljm June 22, 2026 01:54
@charliermarsh charliermarsh added the performance Potential performance improvement label Jun 22, 2026
@charliermarsh charliermarsh removed the request for review from carljm June 22, 2026 01:54
Comment on lines -164 to -166
/// Map from symbol name to its ID.
///
/// Uses a hash table to avoid storing the name twice.

Member

Can we retain this comment?

Comment thread crates/ty_python_core/src/symbol.rs Outdated
Comment on lines +243 to +247
self.map
    .find(SymbolTable::hash_name(name), |id| {
        self.table.symbols[*id].name == name
    })
    .copied()

Member

Should we have our own SymbolReverseTable wrapper to avoid duplicating the find logic? The same applies to members.
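
A wrapper along these lines might look like the sketch below. It is only an illustration of the suggestion, not the code that was merged: the bucketed `HashMap<u64, Vec<usize>>` stands in for `hashbrown::HashTable`, `DefaultHasher` stands in for whatever `SymbolTable::hash_name` uses, and the method signatures are assumptions. Like the existing map, it stores only IDs and resolves names through the symbol list.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical wrapper that owns the hash-and-compare lookup logic.
/// Names live only in the symbol list; the wrapper stores IDs by hash.
#[derive(Default)]
pub struct SymbolReverseTable {
    buckets: HashMap<u64, Vec<usize>>,
}

impl SymbolReverseTable {
    /// Hashes a name. The real table may use a different hasher.
    fn hash_name(name: &str) -> u64 {
        let mut hasher = DefaultHasher::new();
        name.hash(&mut hasher);
        hasher.finish()
    }

    /// Records that `name` is stored at index `id` in the symbol list.
    pub fn insert(&mut self, name: &str, id: usize) {
        self.buckets
            .entry(Self::hash_name(name))
            .or_default()
            .push(id);
    }

    /// Finds the ID whose stored name equals `name`, if any.
    pub fn find(&self, symbols: &[String], name: &str) -> Option<usize> {
        self.buckets
            .get(&Self::hash_name(name))?
            .iter()
            .copied()
            .find(|&id| symbols[id] == name)
    }
}
```

Callers would then write `reverse.find(&symbols, name)` instead of repeating the hash and equality closure at each site.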

@MichaReiser

Member

I'm still not 100% sure this is worth it, but it seems mostly fine.

@charliermarsh charliermarsh merged commit 045e2f6 into main Jun 22, 2026
60 checks passed
@charliermarsh charliermarsh deleted the charlie/codex-compact-place-lookups branch June 22, 2026 12:32

Labels

performance Potential performance improvement ty Multi-file analysis & type inference

2 participants