Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Optimize: restructure add_to_index and cache index frozensets#10157

Open
bartv wants to merge 6 commits into
masterfrom
compiler-perf-index-lookup
Open

Optimize: restructure add_to_index and cache index frozensets#10157
bartv wants to merge 6 commits into
masterfrom
compiler-perf-index-lookup

Conversation

@bartv

@bartv bartv commented Mar 19, 2026

Copy link
Copy Markdown
Contributor

Summary

Restructure add_to_index: inline index_value_gate closure, add early break on unready attributes, access slots directly. Cache frozenset of index attributes in add_index to avoid set() creation in lookup_index (245K calls/compile).

Split from #10100.

Benchmark results (10 runs avg, dedicated benchmark machine)

Benchmark master with opt Delta
athonet_mpn 14.32s 15.75s +10.0%
connect_demo 7.75s 7.88s +1.7%
connect_infra 15.74s 15.64s -0.6%
inmanta_infra 13.97s 13.98s +0.1%
systemtenant 10.95s 10.90s -0.5%
Total 62.73s 64.15s +2.3%

Impact is within noise when isolated. The athonet_mpn +10% is a consistent warm-up outlier seen across multiple branches (first benchmark after venv rebuild). This optimization primarily helps models with heavy index usage; its effect is visible in the synthetic compilerscaling benchmark.

Test plan

  • CI passes

🤖 Generated with Claude Code

…restructuring

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@bartv bartv added the compiler label Mar 19, 2026
@bartv bartv self-assigned this Mar 19, 2026
@bartv bartv requested a review from sanderr March 19, 2026 11:28
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Comment thread src/inmanta/ast/entity.py Outdated

if index_ok:
slot = slots[attribute]
if not slot.is_ready():

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You moved is_ready() inside the for index_attributes in self.get_indices() loop. That means that we now call it once for each index instead of once total. That seems like a bad change to me if this is indeed the hot path. Can you elaborate on your motivation?
As additional context: it is not uncommon for a single entity to have multiple indexes. I would estimate that between 1 and 3 indexes is common, and anything above 5 extremely rare.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could it be that it generates progress faster on models with a lot of speculation? Because now it will only check the attributes for the current index instead of for all of them? (Human here)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I see how it could be an improvement. I also think that you're right that it may be colored by the models this ran against. I think that perhaps we can get a best-of-both worlds by only looping once, but also only calling it on the attributes we care about.

I can make a concrete suggestion (I have two in mind, one that calculates the intersection up front, one that does it lazily, which may be better to bail out early if many index values are still unknown.

But first I need to dig a bit deeper into why / under which circumstances we bail out here. It didn't stand out in my first review, but I actually don't think we should ever bail out. It's not like we get a second chance to add the indexes that we skip here.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I have a strong suspicion that the break is unreachable, and that Claude now optimized for the break case. I think that if we ask it to drop the dead branch that it would pick something closer to the intersection approach I mentioned above. Except if its benchmark instances just have way more non-index attributes than they have index attributes (counted once per index).

Either way, I need to spend a bit more time on this. Even if it is indeed unreachable, it's not really good practice to rely on that assumption. On the other hand, looks like we've always sort of relied on it by skipping the index if the invariant were to be broken.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude here, on behalf of Bart. We traced the code path to verify whether the break (bail-out when is_ready() returns false) is reachable. It is not — the compiler guarantees all index attributes are set before add_to_index is called. Here's the full trace:

Normalization time

  1. generator.py:910-913 Constructor.normalize() — iterates all index attributes. Any not in all_attributes (direct attrs + defaults) is added to _required_dynamic_args.
  2. generator.py:917-925 — if there are required dynamic args, validates they CAN be provided via kwargs or lhs_attribute. If not → IndexAttributeMissingInConstructorException. Compile fails.

Execution time

  1. generator.py:1040 _collect_required_dynamic_arguments() — resolves kwargs and lhs into late_args. Lines 1093-1095: if any required dynamic index attr is missing → IndexAttributeMissingInConstructorException. Compile fails.
  2. generator.py:1113-1116 — builds direct_attributes from direct attrs + late_args. At this point direct_attributes contains ALL index attributes.
  3. generator.py:1168entity.py:324-327 get_instance() — creates Instance, sets all direct_attributes via set_attribute()slot.set_value(). All index slots now have values.
  4. entity.py:329entity.py:307 add_instance()entity.py:422 add_to_index() — when it reaches line 434, slot.is_ready() is always True because the slot was set in step 5.

Verification

We placed assert False at the break point and ran:

  • 701 compiler tests: assert never fired ✅
  • 3 real-world benchmarks (athonet_mpn, connect_infra, inmanta_infra): assert never fired ✅
  • systemtenant: inconclusive (unrelated module loading error)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, let's just drop the is_ready() check then.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude here, on behalf of Bart. Done — dropped the is_ready() check and the for/else construct. The loop now directly builds the key and updates the index. Added a docstring note explaining the guarantee (enforced by Constructor.normalize and _collect_required_dynamic_arguments).

@@ -0,0 +1,3 @@
description: Optimize index lookup with frozenset caching and inline add_to_index restructuring
change-type: patch
destination-branches: [master, iso9]

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be safe for iso8 I think.

Suggested change
destination-branches: [master, iso9]
destination-branches: [master, iso9, iso8]

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude here, on behalf of Bart. Applied — added iso8 to the changelog destination branches.

bartv and others added 2 commits March 27, 2026 12:00
Remove the is_ready() bail-out in add_to_index since all index attributes
are guaranteed to be set before the method is called. Also add iso8 to
changelog destination branches per reviewer suggestion.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@bartv bartv requested a review from sanderr March 30, 2026 09:29
@sanderr

sanderr commented Mar 30, 2026

Copy link
Copy Markdown
Contributor

Note: there is still a concern regarding performance for this branch. On one benchmark project, it seems cause a 10% regression. To be investigated. See Slack for more details.

Comment thread src/inmanta/ast/entity.py Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants