Improvements to tree parsing speed #3508

Merged
merged 5 commits into from
Dec 1, 2015

carlosmn
Member

Here are a couple of simple changes to the way we parse trees which give us significant improvements.

The first one is just silly, really. We've already calculated how long the filename is, so we can just skip over it instead of looking for the terminator again. This gives us about half the gains.
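
As a rough illustration of the first change, here is a minimal sketch of parsing one tree record (`<octal mode> <name>\0<20-byte id>`) where the filename length found by the single `memchr()` scan is reused to step to the object id. This is not the libgit2 code; `entry_view`, `parse_entry` and `OID_RAWSZ` are hypothetical names made up for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OID_RAWSZ 20 /* raw SHA-1 size; hypothetical name for this sketch */

/* Hypothetical view of one parsed record; pointers refer to the caller's buffer. */
struct entry_view {
	uint16_t mode;
	const char *filename;
	size_t filename_len;
	const unsigned char *oid;
};

/*
 * Parse one "<octal mode> <name>\0<20-byte id>" record in [buf, end).
 * Returns a pointer just past the record, or NULL if it is malformed.
 */
static const char *parse_entry(struct entry_view *out, const char *buf, const char *end)
{
	const char *p = buf, *name, *nul;
	unsigned int mode = 0;

	/* Hand-rolled octal parse of the mode. */
	while (p < end && *p >= '0' && *p <= '7') {
		mode = (mode << 3) + (unsigned int)(*p++ - '0');
		if (mode > UINT16_MAX)
			return NULL;
	}

	/* Require at least one digit followed by the separating space. */
	if (p == buf || p >= end || *p++ != ' ')
		return NULL;

	/* One scan finds the terminator, which also gives us the length. */
	name = p;
	nul = memchr(name, '\0', (size_t)(end - name));
	if (nul == NULL)
		return NULL;

	out->mode = (uint16_t)mode;
	out->filename = name;
	out->filename_len = (size_t)(nul - name);

	/* Skip by the known length instead of looking for the NUL again. */
	p = name + out->filename_len + 1;
	if ((size_t)(end - p) < OID_RAWSZ)
		return NULL;

	out->oid = (const unsigned char *)p;
	return p + OID_RAWSZ;
}
```

Calling `parse_entry()` in a loop until it returns `end` walks the whole tree buffer while touching each filename byte only once.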

Then, we can also very easily avoid constantly asking the system for memory by allocating the entries in a pool owned by the tree. The lifetime of the entries is tied to the tree already, so this is a great place to use a pool. Those entries for which we give ownership to the user don't need to change, as we already perform an extra allocation to give them their own lifetime.
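
To make the pool idea concrete, here is a minimal bump-allocator sketch, assuming a pool that hands out small chunks from larger `malloc()`ed blocks and frees everything at once when its owner goes away. It is not libgit2's own pool implementation; `pool`, `pool_block`, `pool_init`, `pool_alloc` and `pool_clear` are hypothetical names, and alignment is only rounded to `sizeof(void *)`.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* One chunk of pool memory; chunks are chained so they can be freed together. */
struct pool_block {
	struct pool_block *next;
	size_t used, cap;
	unsigned char data[];
};

struct pool {
	struct pool_block *head;
	size_t block_size;
};

static void pool_init(struct pool *p, size_t block_size)
{
	p->head = NULL;
	p->block_size = block_size;
}

/* Hand out `size` bytes from the current block, growing the pool if needed. */
static void *pool_alloc(struct pool *p, size_t size)
{
	struct pool_block *b = p->head;
	const size_t align = sizeof(void *);
	void *ptr;

	/* Round up so consecutive allocations stay pointer-aligned. */
	if (size > SIZE_MAX - (align - 1))
		return NULL;
	size = (size + align - 1) & ~(align - 1);

	if (b == NULL || b->cap - b->used < size) {
		size_t cap = size > p->block_size ? size : p->block_size;

		if (cap > SIZE_MAX - sizeof(*b))
			return NULL;
		b = malloc(sizeof(*b) + cap);
		if (b == NULL)
			return NULL;
		b->next = p->head;
		b->used = 0;
		b->cap = cap;
		p->head = b;
	}

	ptr = b->data + b->used;
	b->used += size;
	return ptr;
}

/* Free every block at once; individual allocations are never freed. */
static void pool_clear(struct pool *p)
{
	struct pool_block *b = p->head;

	while (b != NULL) {
		struct pool_block *next = b->next;
		free(b);
		b = next;
	}
	p->head = NULL;
}
```

In this sketch the owner (here, the tree) would call `pool_init()` when it is created, `pool_alloc()` for each parsed entry, and `pool_clear()` when it is freed, so most entries cost a pointer bump rather than a trip to the system allocator.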

I checked the speedup by parsing the top-level tree for git.git for ~41k commits. The timing is a bit rough, but the speedup ends up being a bit under 1/3, which is not bad, considering. The most expensive thing right now is parsing the filemode number; and using libc's (presumably) optimised one doesn't really help.

| | master | this |
|---|---|---|
| Debug | 4.7s | 3.8s |
| Release | 2.7s | 2.0s |

I've also tested by grabbing the parent's tree for each commit we walk and diffing it with a pathspec of "README" (I initially tried with a full diff but that takes over two minutes, which is also pretty bad but a different story). The speedup isn't quite as drastic since we're still doing the diff, which I haven't touched here, but still noticeable, especially in release mode.

| | master | this |
|---|---|---|
| Debug | 11.5s | 9.5s |
| Release | 6.0s | 4.3s |

@carlosmn
Member Author

Unfortunately, the larger benchmark which also uses diff actually slows down, which I can't explain. This needs some investigation before we accept it.


struct git_tree_entry {
uint16_t attr;
git_oid oid;
bool pooled;
Member

This mis-aligns the struct packing, which is not neat for pool allocations. I think we could very easily change filename_len to uint16_t and get a couple extra bytes to play with.

Member Author

It does indeed bring sadness to alignment, I forgot to double-check after testing it out. It's probably enough to have a uint16_t as length for anything we would actually care to support, tbh (and lol if you actually need the 64 bits for your path length).

Member Author

I changed to 16-bit and avoided any padding (except for the final one due to the zero-length array). This brings the size down from 32 to 26 bytes. It doesn't seem to make a difference in speed one way or another.
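
The size change can be checked with a small layout sketch. The "before" layout below is an assumption reconstructed from the fragment quoted above (with `filename_len` taken to be a `size_t`), `sketch_oid` is a hypothetical stand-in for `git_oid`, and the byte counts in the comments assume a typical 64-bit ABI with a one-byte `bool`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for git_oid: 20 raw bytes, alignment 1. */
typedef struct {
	unsigned char id[20];
} sketch_oid;

/* Assumed "before" layout: the bool forces padding ahead of the size_t. */
struct entry_before {
	uint16_t attr;       /* offset 0 */
	sketch_oid oid;      /* offset 2 */
	bool pooled;         /* offset 22, then 1 byte of padding */
	size_t filename_len; /* offset 24 */
	char filename[];     /* offset 32; sizeof is 32 */
};

/* "After" layout: both 16-bit fields first, no interior padding. */
struct entry_after {
	uint16_t attr;         /* offset 0 */
	uint16_t filename_len; /* offset 2 */
	sketch_oid oid;        /* offset 4 */
	bool pooled;           /* offset 24 */
	char filename[];       /* offset 25; sizeof rounds up to 26 */
};
```

The single trailing byte in `entry_after` exists only because `sizeof` must be a multiple of the struct's 2-byte alignment.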

We've already looked at the filename with `memchr()` and then used
`strlen()` to allocate the entry. We already know how much we have to
advance to get to the object id, so add the filename length instead of
looking at each byte again.

These are rather small allocations, so we end up spending a non-trivial
amount of time asking the OS for memory. Since these entries are tied to
the lifetime of their tree, we can give the tree a pool so we speed up
the allocations.

We already know the size due to the `memchr()` so use that information
instead of calling `strlen()` on it.

This reduces the size of the struct from 32 to 26 bytes, and leaves a
single padding byte at the end of the struct (which comes from the
zero-length array).

@carlosmn carlosmn force-pushed the cmn/tree-parse-speed branch from 7a1ec86 to ee42bb0 Compare November 28, 2015 18:24
@carlosmn
Member Author

Both the tree-read microbench and the larger diff test are sped up, so I'm happy to merge this as-is.

@ethomson
Member

Out of curiosity, why did the larger benchmark slow down in the intermediate changes?

@carlosmn
Member Author

I'm not sure; and I seem to have misplaced the specific benchmark which showed a slowdown. I was probably doing something dumb, as it was still just loading a commit's tree and diffing with the first parent's tree, which I just saw (expectedly) speed up.

@vmg
Member

vmg commented Nov 30, 2015

I like the struct re-ordering. 👍

{
git_tree_entry *entry = NULL;
size_t tree_len;

Member

Can we add a check here to make sure that filename_len fits in a uint16_t? I don't see any obvious vulnerabilities if one were to somehow build a tree that had a longer filename in it, but I'm also not the most creative person in the world.

Member Author

Absolutely, I've pushed up a commit which does this and de-duplicates the size and overflow checks.

Return an error in case the length is too big. Also take this
opportunity to have a single allocating function for the size and
overflow logic.
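
A single allocating function along these lines might look like the sketch below. It is only an illustration of the length and overflow checks described in the commit message, not the code that was pushed: `sketch_entry` and `alloc_entry` are hypothetical names, and failure is reported by returning NULL rather than through libgit2's error machinery.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry layout with a 16-bit filename length. */
struct sketch_entry {
	uint16_t attr;
	uint16_t filename_len;
	unsigned char oid[20];
	bool pooled;
	char filename[];
};

/*
 * Allocate an entry holding a copy of `filename`.
 * Returns NULL if the name does not fit in 16 bits, if the size
 * computation would overflow, or if the allocation fails.
 */
static struct sketch_entry *alloc_entry(const char *filename, size_t filename_len)
{
	struct sketch_entry *e;
	size_t size;

	/* The length is stored in a uint16_t, so reject anything longer. */
	if (filename_len > UINT16_MAX)
		return NULL;

	/* Guard sizeof(*e) + filename_len + 1 against size_t overflow. */
	if (filename_len > SIZE_MAX - sizeof(*e) - 1)
		return NULL;
	size = sizeof(*e) + filename_len + 1;

	e = calloc(1, size);
	if (e == NULL)
		return NULL;

	memcpy(e->filename, filename, filename_len);
	e->filename[filename_len] = '\0';
	e->filename_len = (uint16_t)filename_len;
	return e;
}
```

Keeping both checks in one function means every caller that creates an entry gets the same limits, instead of each call site repeating the size arithmetic.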
ethomson added a commit that referenced this pull request Dec 1, 2015
Improvements to tree parsing speed
@ethomson ethomson merged commit 337b2b0 into master Dec 1, 2015
@pks-t
Member

pks-t commented Dec 1, 2015

I'd be interested as to how PR #3527 affects your benchmarks. These changes introduced quite a big memory leak and we did not call git__free for a lot of tree entries, so this might affect the outcome of your timings.

@carlosmn
Member Author

carlosmn commented Dec 1, 2015

I'm not sure if we dup tree entries during a diff, but I'll definitely check it out when I'm back on my desktop. When I've fixed other leaks in this area the benchmarks did improve, though.

@carlosmn
Member Author

carlosmn commented Dec 5, 2015

Without the leak, Debug goes down to 9.3s; in Release it doesn't seem to make a noticeable difference.

@carlosmn carlosmn deleted the cmn/tree-parse-speed branch December 9, 2015 02:18