Improvements to tree parsing speed #3508

carlosmn · 2015-11-14T23:46:26Z

Here are a couple of simple changes to the way we parse trees which give us significant improvements.

The first one is just silly, really. We've already calculated how long the filename is, so we can just skip over it instead of looking for the terminator again. This gives us about half the gains.
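To illustrate the first change, here is a minimal sketch of parsing a single tree record with only one scan over the filename. It is not the code from this PR: `parse_entry`, `struct entry_view` and `OID_RAWSZ` are hypothetical names, and the record layout assumed is the usual `<octal mode> <name>\0<20-byte id>`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OID_RAWSZ 20 /* assumed raw object id size (SHA-1) */

/* Hypothetical view of one parsed record; pointers borrow from the buffer. */
struct entry_view {
	unsigned int mode;
	const char *filename;
	size_t filename_len;
	const unsigned char *oid;
};

/*
 * Parse one record starting at `buf` and ending no later than `end`.
 * Returns a pointer to the next record, or NULL if the record is malformed.
 */
static const char *parse_entry(struct entry_view *out, const char *buf, const char *end)
{
	const char *nul;
	unsigned int mode = 0;

	/* Parse the octal mode up to the separating space. */
	while (buf < end && *buf >= '0' && *buf <= '7')
		mode = (mode << 3) | (unsigned int)(*buf++ - '0');
	if (buf >= end || *buf++ != ' ')
		return NULL;

	/* The only scan for the terminator; it also gives us the name length. */
	nul = memchr(buf, '\0', (size_t)(end - buf));
	if (nul == NULL)
		return NULL;

	out->mode = mode;
	out->filename = buf;
	out->filename_len = (size_t)(nul - buf);

	/* Skip name and NUL using the known length instead of rescanning. */
	buf += out->filename_len + 1;
	if ((size_t)(end - buf) < OID_RAWSZ)
		return NULL;

	out->oid = (const unsigned char *)buf;
	return buf + OID_RAWSZ;
}
```

A caller would invoke `parse_entry()` in a loop, feeding the returned pointer back in as `buf` until it reaches `end`.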

Then, we can also very easily avoid constantly asking the system for memory by allocating the entries in a pool owned by the tree. The lifetime of the entries is tied to the tree already, so this is a great place to use a pool. Those entries for which we give ownership to the user don't need to change, as we already perform an extra allocation to give them their own lifetime.
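The pooling idea can be sketched with a tiny bump allocator. This is an illustration under assumptions, not libgit2's own pool implementation: `struct pool`, `pool_init`, `pool_alloc` and `pool_clear` are hypothetical names. Entries are carved out of larger pages and everything is released in one go when the owner (here, the tree) goes away.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical page header; the payload follows in `data`. */
struct pool_page {
	struct pool_page *next;
	size_t used, size;
	unsigned char data[];
};

/* Hypothetical pool: a list of pages, newest first. */
struct pool {
	struct pool_page *pages;
	size_t page_size;
};

static void pool_init(struct pool *p, size_t page_size)
{
	p->pages = NULL;
	p->page_size = page_size;
}

static void *pool_alloc(struct pool *p, size_t len)
{
	struct pool_page *page = p->pages;
	void *ptr;

	if (len > SIZE_MAX / 2)
		return NULL;

	/* Round up so every returned pointer stays pointer-aligned. */
	len = (len + sizeof(void *) - 1) & ~(sizeof(void *) - 1);

	/* Ask the system for memory only when the current page is full. */
	if (page == NULL || page->size - page->used < len) {
		size_t size = len > p->page_size ? len : p->page_size;

		page = malloc(sizeof(*page) + size);
		if (page == NULL)
			return NULL;
		page->next = p->pages;
		page->used = 0;
		page->size = size;
		p->pages = page;
	}

	ptr = page->data + page->used;
	page->used += len;
	return ptr;
}

/* Free every page at once; individual entries are never freed. */
static void pool_clear(struct pool *p)
{
	while (p->pages != NULL) {
		struct pool_page *next = p->pages->next;
		free(p->pages);
		p->pages = next;
	}
}
```

With this shape, parsing a tree with many entries costs a handful of `malloc()` calls rather than one per entry, and freeing the tree is a single `pool_clear()`.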

I checked the speedup by parsing the top-level tree for git.git for ~41k commits. The timing is a bit rough, but the speedup ends up being a bit under 1/3, which is not bad, considering. The most expensive thing right now is parsing the filemode number, and using libc's (presumably) optimised parser instead doesn't really help.

	master	this
Debug	4.7s	3.8s
Release	2.7s	2.0s

I've also tested by grabbing the parent's tree for each commit we walk and diffing it with a pathspec of "README" (I initially tried with a full diff but that takes over two minutes, which is also pretty bad but a different story). The speedup isn't quite as drastic since we're still doing the diff, which I haven't touched here, but still noticeable, especially in release mode.

	master	this
Debug	11.5s	9.5s
Release	6.0s	4.3s

carlosmn · 2015-11-15T02:30:23Z

Unfortunately the larger benchmark which also uses diff actually slows down, which I can't explain. This needs some investigation before it can be accepted.

vmg · 2015-11-15T17:46:14Z

src/tree.h


 struct git_tree_entry {
 	uint16_t attr;
 	git_oid oid;
+	bool pooled;


This mis-aligns the struct packing, which is not neat for pool allocations. I think we could very easily change filename_len to uint16_t and get a couple extra bytes to play with.

carlosmn · 2015-11-15

It does indeed bring sadness to alignment; I forgot to double-check after testing it out. A uint16_t length is probably enough for anything we would actually care to support, tbh (and lol if you actually need the 64 bits for your path length).

carlosmn · 2015-11-28

I changed to 16-bit and avoided any padding (except for the final one due to the zero-length array). This brings the size down from 32 to 26 bytes. It doesn't seem to make a difference in speed one way or another.
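For reference, the size change can be reproduced with a small layout sketch. The struct and field names below are hypothetical stand-ins rather than the definitions in `src/tree.h`; the field order of the "wide" variant is an assumption based on the diff hunk above, and the 32-byte figure assumes a typical 64-bit ABI where `size_t` is 8 bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a 20-byte raw object id. */
struct raw_oid {
	unsigned char id[20];
};

/* Assumed earlier layout: the size_t forces a hole after `pooled`. */
struct entry_wide {
	uint16_t attr;
	struct raw_oid oid;
	bool pooled;
	size_t filename_len;
	char filename[];
};

/* 16-bit length placed next to `attr`, so no holes between fields. */
struct entry_packed {
	uint16_t attr;
	uint16_t filename_len;
	struct raw_oid oid;
	bool pooled;
	char filename[];
};

/* Bytes of padding left after the last fixed field. */
static size_t entry_tail_padding(void)
{
	return sizeof(struct entry_packed) - offsetof(struct entry_packed, filename);
}
```

The remaining tail byte exists only because the struct's size must stay a multiple of its 2-byte alignment.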

We've already looked at the filename with `memchr()` and then used `strlen()` to allocate the entry. We already know how much we have to advance to get to the object id, so add the filename length instead of looking at each byte again.

carlosmn · 2015-11-28T18:52:42Z

Both the tree-read microbench and the larger diff test are sped up, so I'm happy to merge this as-is.

ethomson · 2015-11-29T12:50:34Z

Out of curiosity, why did the larger benchmark slow down in the intermediate changes?

carlosmn · 2015-11-29T20:24:26Z

I'm not sure, and I seem to have misplaced the specific benchmark which showed a slowdown. I was probably doing something dumb, as it was still just loading a commit's tree and diffing it with the first parent's tree, which I just saw speed up, as expected.

vmg · 2015-11-30T10:47:29Z

I like the struct re-ordering. 👍

ethomson · 2015-11-30T15:55:43Z

src/tree.c

+{
+	git_tree_entry *entry = NULL;
+	size_t tree_len;
+


Can we add a check here to make sure that filename_len fits in a uint16_t? I don't see any obvious vulnerabilities if one were to somehow build a tree that had a longer filename in it, but I'm also not the most creative person in the world.

carlosmn · 2015-11-30

Absolutely, I've pushed up a commit which does this and de-duplicates the size and overflow checks.
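A minimal sketch of what a single allocating function with both checks could look like follows. It is not the commit itself: `entry_alloc` and `struct entry` are hypothetical names, plain `calloc()` stands in for whatever allocator the library uses, and returning NULL stands in for the library's error reporting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry with a 16-bit name length and a trailing name. */
struct entry {
	uint16_t attr;
	uint16_t filename_len;
	unsigned char oid[20];
	bool pooled;
	char filename[];
};

/*
 * Allocate an entry holding a copy of `name` (`name_len` bytes, no NUL
 * required). Returns NULL if the name is too long or allocation fails.
 */
static struct entry *entry_alloc(const char *name, size_t name_len)
{
	struct entry *e;
	size_t size;

	/* The length must be representable in the 16-bit field. */
	if (name_len > UINT16_MAX)
		return NULL;

	/*
	 * header + name + NUL must not overflow. This is redundant after
	 * the check above, but keeps the size logic in one place.
	 */
	if (name_len > SIZE_MAX - sizeof(*e) - 1)
		return NULL;
	size = sizeof(*e) + name_len + 1;

	e = calloc(1, size);
	if (e == NULL)
		return NULL;

	memcpy(e->filename, name, name_len);
	e->filename[name_len] = '\0';
	e->filename_len = (uint16_t)name_len;
	return e;
}
```

Every caller that creates an entry would go through this one function, so an over-long filename in a crafted tree is rejected before its length is ever truncated to 16 bits.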

pks-t · 2015-12-01T13:34:39Z

I'd be interested as to how PR #3527 affects your benchmarks. These changes introduced quite a big memory leak and we did not call git__free for a lot of tree entries, so this might affect the outcome of your timings.

carlosmn · 2015-12-01T13:56:28Z

I'm not sure if we dup tree entries during a diff, but I'll definitely check it out when I'm back on my desktop. When I've fixed other leaks in this area the benchmarks did improve, though.

carlosmn · 2015-12-05T13:34:45Z

Without the leak, Debug goes down to 9.3s; in Release it doesn't seem to make a noticeable difference.

vmg reviewed Nov 15, 2015

carlosmn added 4 commits November 28, 2015 19:21

tree: pool the entry memory allocations

ed97074

These are rather small allocations, so we end up spending a non-trivial amount of time asking the OS for memory. Since these entries are tied to the lifetime of their tree, we can give the tree a pool so we speed up the allocations.

tree: calculate the filename length once

2580077

We already know the size due to the `memchr()` so use that information instead of calling `strlen()` on it.

tree: make path len uint16_t and avoid holes

ee42bb0

This reduces the size of the struct from 32 to 26 bytes, and leaves a single padding byte at the end of the struct (which comes from the zero-length array).

carlosmn force-pushed the cmn/tree-parse-speed branch from 7a1ec86 to ee42bb0 on November 28, 2015 18:24

ethomson reviewed Nov 30, 2015

tree: ensure the entry filename fits in 16 bits

95ae352

Return an error in case the length is too big. Also take this opportunity to have a single allocating function for the size and overflow logic.

ethomson added a commit that referenced this pull request Dec 1, 2015

Merge pull request #3508 from libgit2/cmn/tree-parse-speed

337b2b0

Improvements to tree parsing speed

ethomson merged commit 337b2b0 into master Dec 1, 2015

carlosmn deleted the cmn/tree-parse-speed branch December 9, 2015 02:18
