Optimize: restructure add_to_index and cache index frozensets #10157
bartv wants to merge 6 commits into
…restructuring Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
if index_ok:
    slot = slots[attribute]
    if not slot.is_ready():
You moved is_ready() inside the for index_attributes in self.get_indices() loop. That means that we now call it once for each index instead of once total. That seems like a bad change to me if this is indeed the hot path. Can you elaborate on your motivation?
As additional context: it is not uncommon for a single entity to have multiple indexes. I would estimate that between 1 and 3 indexes is common, and anything above 5 extremely rare.
Could it be that it generates progress faster on models with a lot of speculation? Because now it will only check the attributes for the current index instead of for all of them? (Human here)
I think I see how it could be an improvement. I also think that you're right that it may be colored by the models this ran against. I think that perhaps we can get the best of both worlds by only looping once, but also only calling it on the attributes we care about.
I can make a concrete suggestion (I have two in mind: one that calculates the intersection up front, and one that does it lazily, which may be better for bailing out early if many index values are still unknown).
But first I need to dig a bit deeper into why / under which circumstances we bail out here. It didn't stand out in my first review, but I actually don't think we should ever bail out. It's not like we get a second chance to add the indexes that we skip here.
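The two options mentioned above could look roughly like the sketch below. It is an illustration only: `Slot`, `ready_indexes_upfront` and `ready_indexes_lazy` are hypothetical names, not the project's code, and the "intersection" is approximated as the set of attributes that appear in at least one index. Both variants call `is_ready()` at most once per attribute and never touch non-index attributes.

```python
from typing import Dict, List, Sequence, Tuple

Index = Tuple[str, ...]


class Slot:
    """Hypothetical stand-in for an attribute slot (not the real class)."""

    def __init__(self, ready: bool) -> None:
        self._ready = ready
        self.calls = 0  # counts is_ready() calls so the variants can be compared

    def is_ready(self) -> bool:
        self.calls += 1
        return self._ready


def ready_indexes_upfront(slots: Dict[str, Slot], indices: Sequence[Index]) -> List[Index]:
    """Check every index attribute once, before looking at any index."""
    index_attributes = set().union(*indices)
    ready = {name for name in index_attributes if slots[name].is_ready()}
    return [index for index in indices if ready.issuperset(index)]


def ready_indexes_lazy(slots: Dict[str, Slot], indices: Sequence[Index]) -> List[Index]:
    """Check attributes on demand, remembering answers across indexes."""
    known: Dict[str, bool] = {}
    result: List[Index] = []
    for index in indices:
        for name in index:
            if name not in known:
                known[name] = slots[name].is_ready()
            if not known[name]:
                break  # bail out: the rest of this index is never checked
        else:
            result.append(index)
    return result
```

The lazy variant skips work when an early attribute of an index is not ready; the up-front variant is simpler but always checks every index attribute.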
Yeah, I have a strong suspicion that the break is unreachable, and that Claude now optimized for the break case. I think that if we ask it to drop the dead branch, it would pick something closer to the intersection approach I mentioned above. Unless its benchmark instances just have way more non-index attributes than they have index attributes (counted once per index).
Either way, I need to spend a bit more time on this. Even if it is indeed unreachable, it's not really good practice to rely on that assumption. On the other hand, looks like we've always sort of relied on it by skipping the index if the invariant were to be broken.
Claude here, on behalf of Bart. We traced the code path to verify whether the break (bail-out when is_ready() returns false) is reachable. It is not — the compiler guarantees all index attributes are set before add_to_index is called. Here's the full trace:
Normalization time
1. generator.py:910-913 Constructor.normalize(): iterates all index attributes. Any not in all_attributes (direct attrs + defaults) is added to _required_dynamic_args.
2. generator.py:917-925: if there are required dynamic args, validates they CAN be provided via kwargs or lhs_attribute. If not → IndexAttributeMissingInConstructorException. Compile fails.
Execution time
3. generator.py:1040 _collect_required_dynamic_arguments(): resolves kwargs and lhs into late_args. Lines 1093-1095: if any required dynamic index attr is missing → IndexAttributeMissingInConstructorException. Compile fails.
4. generator.py:1113-1116: builds direct_attributes from direct attrs + late_args. At this point direct_attributes contains ALL index attributes.
5. generator.py:1168 → entity.py:324-327 get_instance(): creates Instance, sets all direct_attributes via set_attribute() → slot.set_value(). All index slots now have values.
6. entity.py:329 → entity.py:307 add_instance() → entity.py:422 add_to_index(): when it reaches line 434, slot.is_ready() is always True because the slot was set in step 5.
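As a toy model of the guarantee traced above (not the compiler's actual code): if construction fails whenever an index attribute is provided neither directly nor late, then every index key can be built without a readiness check. `IndexAttributeMissing` and `build_index_keys` are hypothetical names used only for this sketch.

```python
from typing import Any, Dict, List, Sequence, Tuple


class IndexAttributeMissing(Exception):
    """Hypothetical stand-in for IndexAttributeMissingInConstructorException."""


def build_index_keys(
    indices: Sequence[Tuple[str, ...]],
    direct: Dict[str, Any],
    late_args: Dict[str, Any],
) -> List[Tuple[Any, ...]]:
    # Validation step: every attribute used by any index must be provided.
    required = {name for index in indices for name in index}
    missing = sorted(n for n in required if n not in direct and n not in late_args)
    if missing:
        raise IndexAttributeMissing(f"missing index attributes: {missing}")

    # From here on all index attributes have a value, so no is_ready() check.
    values = {**direct, **late_args}
    return [tuple(values[name] for name in index) for index in indices]
```

In this model the failure surfaces at construction time, which mirrors the claim that the bail-out inside add_to_index cannot be reached.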
Verification
We placed assert False at the break point and ran:
- 701 compiler tests: assert never fired ✅
- 3 real-world benchmarks (athonet_mpn, connect_infra, inmanta_infra): assert never fired ✅
- systemtenant: inconclusive (unrelated module loading error)
Ok, let's just drop the is_ready() check then.
Claude here, on behalf of Bart. Done — dropped the is_ready() check and the for/else construct. The loop now directly builds the key and updates the index. Added a docstring note explaining the guarantee (enforced by Constructor.normalize and _collect_required_dynamic_arguments).
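For readers without the diff at hand, the simplified loop might look roughly like this sketch. `ToyEntity`, its dict-based instances, and the key layout are assumptions made for illustration; the real method works on slots and its key format is not shown in this thread.

```python
from typing import Any, Dict, Iterable, List, Tuple


class ToyEntity:
    """Hypothetical, heavily simplified entity; not the real implementation."""

    def __init__(self, indices: Iterable[Iterable[str]]) -> None:
        self._indices: List[Tuple[str, ...]] = [tuple(index) for index in indices]
        self._index: Dict[Tuple[Tuple[str, ...], Tuple[Any, ...]], Dict[str, Any]] = {}

    def get_indices(self) -> List[Tuple[str, ...]]:
        return self._indices

    def add_to_index(self, instance: Dict[str, Any]) -> None:
        """Register the instance under every index.

        Assumes the caller guarantees that all index attributes are set,
        so there is no readiness check and no for/else bail-out.
        """
        for index_attributes in self.get_indices():
            # Assumed key layout: the index's attributes plus their values.
            key = (index_attributes, tuple(instance[a] for a in index_attributes))
            self._index[key] = instance
```

If the guarantee were ever broken, this sketch would fail loudly with a KeyError instead of silently skipping the index.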
@@ -0,0 +1,3 @@
description: Optimize index lookup with frozenset caching and inline add_to_index restructuring
change-type: patch
destination-branches: [master, iso9]
Should be safe for iso8 I think.
- destination-branches: [master, iso9]
+ destination-branches: [master, iso9, iso8]
Claude here, on behalf of Bart. Applied — added iso8 to the changelog destination branches.
Remove the is_ready() bail-out in add_to_index since all index attributes are guaranteed to be set before the method is called. Also add iso8 to changelog destination branches per reviewer suggestion. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Note: there is still a concern regarding performance for this branch. On one benchmark project, it seems to cause a 10% regression. To be investigated. See Slack for more details.
Summary
Restructure add_to_index: inline index_value_gate closure, add early break on unready attributes, access slots directly. Cache frozenset of index attributes in add_index to avoid set() creation in lookup_index (245K calls/compile).
Split from #10100.
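A minimal sketch of the caching idea, assuming a much simpler data model than the real one: the frozenset for each index is built once in add_index, so lookup_index compares against cached objects instead of rebuilding a set per index on every call. `ToyIndexStore`, its dict-based instances, and the key layout are hypothetical, and attribute values are assumed to be hashable.

```python
from typing import Any, Dict, FrozenSet, Iterable, List, Optional, Sequence, Tuple


class ToyIndexStore:
    """Hypothetical illustration of caching index attribute sets."""

    def __init__(self) -> None:
        self._index_sets: List[FrozenSet[str]] = []  # built once per index
        self._instances: Dict[FrozenSet[Tuple[str, Any]], Dict[str, Any]] = {}

    def add_index(self, attributes: Iterable[str]) -> None:
        # The frozenset is created here, once, rather than on every lookup.
        self._index_sets.append(frozenset(attributes))

    def add_instance(self, instance: Dict[str, Any]) -> None:
        for index_set in self._index_sets:
            key = frozenset((name, instance[name]) for name in index_set)
            self._instances[key] = instance

    def lookup_index(self, params: Sequence[Tuple[str, Any]]) -> Optional[Dict[str, Any]]:
        attributes = {name for name, _ in params}
        # Compare against the cached frozensets; no per-index set() is built.
        if not any(index_set == attributes for index_set in self._index_sets):
            raise KeyError(f"no index on attributes {sorted(attributes)}")
        return self._instances.get(frozenset(params))
```

The query's own attribute set is still built per call in this sketch; only the per-index sets are cached, which is the saving the summary describes.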
Benchmark results (10 runs avg, dedicated benchmark machine)
Impact is within noise when isolated. The athonet_mpn +10% is a consistent warm-up outlier seen across multiple branches (first benchmark after venv rebuild). This optimization primarily helps models with heavy index usage; its effect is visible in the synthetic compilerscaling benchmark.
Test plan
🤖 Generated with Claude Code