Speedup ChainMap #98766
Comments
There was previous discussion of OrderedSet on the mailing list. Given that, I would guess there would be a heavy preference for adding a method to dict that lets you reuse hashes without accessing the values directly or calling __getitem__. It seems like a fairly straightforward addition, but I know dict is a core class. Having that alone would make it trivial to add an OrderedSet to the collections module and remove the need for any other workarounds.
Could you please provide microbenchmarks which show the speedup? I would be surprised if
Whoops, I don't think I benchmarked that one properly when I was coding. It is faster when you have 2 items and slower every other time... I shall remove that from the PR. The speedup listed above was just for __iter__, not the
I'll look at this more later, but my first impression is that the analysis and approach are fundamentally unsound. The whole effort to be "lazy" seems misguided. The payoff for lazy evaluation comes from deferring work until a later time or from possibly not doing all of the work. Neither applies in this case.

ISTM that the benchmarks are only measuring the pure Python loop overhead. That really stands out with an 8-level ChainMap holding only 5 keys. Also, if you're using string keys, which already cache their hash values, the benchmark is not giving credit to the current code for reusing hashes rather than recomputing them (a Decimal object would not be so fortunate). The C code for

One other thought: the ChainMap class was designed to be mostly simple pure Python code. It wasn't intended to be a high-performance class. Users who care more about performance are almost always better off creating a single flat dictionary, and that would still be true even if we rewrote all of ChainMap in C. It's fine to make optimization tweaks if they are readable and not disruptive, but we don't want to "go gonzo" and introduce new dict methods, an ordered set type, and whatnot. It isn't worth it.
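The "single flat dictionary" advice above refers to collapsing the chain up front. A minimal illustration of that idea (not code from this thread):

```python
from collections import ChainMap

cm = ChainMap({"a": 1}, {"a": 0, "b": 2})

# dict(cm) flattens the chain once, respecting ChainMap's first-map-wins lookup,
# so later lookups and iteration run at plain-dict speed.
flat = dict(cm)
assert flat == {"a": 1, "b": 2}
```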
Try this to see what effect it has on your benchmarks:
Also, we should add tests for both
I am going to make a fuller test suite to cover all the bases for __iter__. I have some code right now for checking the hash counts and can make some tests ensuring that hash reuse happens. Are there any key classes you suggest besides Decimal and str? I was also going to add test cases for different dict-like objects, and I can add cases for large and small dictionaries.

The majority of the overhead, from what I have tested, seems to come from the creation of intermediate dicts that just get consumed right away; that heavily outweighs the extra hash calls. The second thing causing the performance regression from update to the current method is that fromkeys doesn't appear to reuse the hash in cases where update does. I am not sure why that is; the behavior is consistent from 3.9 to now.

In your opinion, are larger dicts (key-count-wise) more common, or smaller ones? I'm not sure which gives the most representative performance.
Tuples are commonly used as keys, and they do not cache the hash. You can also just test with a custom Python object with a defined __hash__. Test also the alternatives proposed in #76973 (comment).
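For the hash-count checks mentioned above, one way to do it is a key type that counts calls to __hash__; this is only an illustrative sketch (the CountingKey name and structure are mine, not from the thread):

```python
class CountingKey:
    """Illustrative key type that records how many times __hash__ is called."""
    hash_calls = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        # every hash computation is tallied, so reuse of stored hashes is visible
        CountingKey.hash_calls += 1
        return hash(self.value)

    def __eq__(self, other):
        return isinstance(other, CountingKey) and self.value == other.value
```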
So this is the most efficient version I was able to construct. Whenever our list of maps is just normal dictionary objects it falls back to using update; currently fromkeys does not reuse the hash for subclasses of dict. This produces exactly the same number of hash calls as the current solution. Whenever the maps are made up of types that cannot reuse the hash, it either bypasses the intermediate dicts by zipping the keys or simply replaces the result dict.

This has my original proposal, dict.fromkeys(<iterable of all keys>), as one of its paths (when none of the maps are dicts), and the original dict.update as the other (when all of them are). Whenever the maps are a mix of the two groups, the runtime is always less than with the current setup. There is one worst case: when the first map parsed is a dict and every map above it is not, zip becomes more expensive than never having built that first dict at all, but handling that correctly would make the code complex. In every other case it performs better than my original proposal.

The lambda is used here just for clarity; it is normally a regular function, because in my testing lambdas run slower than normal functions when combined with the itertools machinery.

```python
# assumes the usual collections-module aliases:
# from itertools import chain as _chain, groupby as _groupby, repeat as _repeat
def __iter__(self):
    d = {}
    for k, g in _groupby(reversed(self.maps), lambda x: type(x) is dict):
        if k:
            # without using update only dict reuses hash
            # if dict subclasses can safely call update then we change to isinstance
            for mapping in g:
                d.update(mapping)
        else:
            if d:
                d.update(zip(_chain(*g), _repeat(None)))  # faster than an intermediate dict
            else:
                d = dict.fromkeys(_chain(*g))  # fromkeys is faster when replacing the dict
    return iter(d)
```

Testing below was done on 3.10; I'm having trouble with my local build of the current version but will add those numbers tomorrow.

Testing using a dict of tuples of Decimals for hash speed effects

Testing using a dict with string keys
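For the two key layouts just mentioned, a rough harness along these lines can compare __iter__ variants (the map sizes and shapes here are my own guesses, not the numbers used in the thread):

```python
from collections import ChainMap
from decimal import Decimal
from timeit import timeit

# Keys that do not cache their hash (tuples of Decimals) vs. keys that do (str).
decimal_maps = [{(Decimal(i), Decimal(j)): None for j in range(50)} for i in range(8)]
string_maps = [{f"key-{i}-{j}": None for j in range(50)} for i in range(8)]

for label, maps in [("tuple-of-Decimal keys", decimal_maps), ("string keys", string_maps)]:
    cm = ChainMap(*maps)
    # list(cm) drives ChainMap.__iter__ over all eight overlapping maps
    print(label, timeit(lambda: list(cm), number=10_000))
```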
I appreciate the heroic effort, but I just want to do the simpler edit mentioned in the issue tracker. The code above looks like it is "killing a mosquito with a cannon" by breaking down the various subcases and using a different approach for each. It seems to try every trick in the book to squeeze out a few clock cycles. As a result, the code is much harder to understand and will be more difficult to maintain and test.

The timing results are dubious. Adding lambdas, groupby, argument unpacking, and multiple conditionals is likely to slow the common cases. Also, the interpreter performance changed quite a bit in 3.11, so the relative performance of the components will be different. The code above has the hallmarks of overfitting to a particular Python version, a particular build, and a particular set of benchmarks.

Thank you again, but I really don't want to go down this path. If the performance of
Feature or enhancement
Makes ChainMap's __iter__, copy, and parents methods lazier.
Pitch
Use itertools to reduce the number of objects created, trying to claw back the performance lost by switching to an order-preserving __iter__ method.
While this method is roughly 3 times faster than the current behavior for __iter__, it is still ~5 times slower than the original set-based version, based on testing high-collision ChainMaps. To get back to the original performance there would need to be either an order-preserving set object or a new method added to dict that can copy hashes without copying or accessing the underlying items. The latter seems much easier to do, but the former has more general use cases. I am unsure whether any of the other custom dict objects also use suboptimal structures that an ordered set would resolve; I believe most of the time the suggestion is just to use a dict in place of an ordered set.
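The "use a dict in place of an ordered set" suggestion refers to the standard trick of keeping only keys; a small example (not part of the proposal itself):

```python
# A dict whose values are all None behaves like an insertion-ordered set:
# duplicates are dropped and first-seen order is preserved.
seen = dict.fromkeys(["b", "a", "b", "c"])
assert list(seen) == ["b", "a", "c"]
```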
Previous discussion
https://bugs.python.org/issue32792
The solution I went with is based on the rejected solution from the discussion above.
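As described in the comments, that approach amounts to a single order-preserving dict.fromkeys pass over all keys. A hedged sketch of the idea, wrapped in a hypothetical subclass for illustration (not the exact code from that discussion):

```python
from collections import ChainMap
from itertools import chain

class FlatIterChainMap(ChainMap):
    """Illustrative variant: one dict.fromkeys pass over all keys."""
    def __iter__(self):
        # The first occurrence of each key fixes its position, matching the
        # order of the update-based code, but every key's hash is recomputed.
        return iter(dict.fromkeys(chain.from_iterable(reversed(self.maps))))
```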
#86653
The above change inadvertently added a major performance regression by creating multiple new dicts, causing hash to be called on every object. This removed any benefit of using update over a single iterable-based constructor.
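For reference, the order-preserving __iter__ that #86653 introduced is, from memory of the stdlib source and so only approximate, roughly:

```python
def __iter__(self):
    d = {}
    for mapping in reversed(self.maps):
        d.update(dict.fromkeys(mapping))    # builds a throwaway dict for each map
    return iter(d)
```

Each dict.fromkeys(mapping) call creates an intermediate dict that is consumed immediately, which is the per-map object-creation cost this pitch tries to avoid.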
I can't find any previous discussion of the list-slicing change; it was an inadvertent discovery while working on the __iter__ method.