merge: Use git_index__fill to populate the index #3549

Merged (1 commit) on Dec 16, 2015


@vmg vmg commented Dec 16, 2015

See the commit message for details. This fixes a performance issue when merging truly large trees (roughly 1.5 million entries).

cc @carlosmn @ethomson (HAH how's that for a sabbatical)
cc @piki @simonsj

Instead of calling `git_index_add` in a loop, use the new
`git_index__fill` internal API to fill the index with the initial staged
entries.

The new `fill` helper assumes that all the entries will be unique and
valid, so it can append them at the end of the entries vector and only
sort it once at the end. It performs no validation checks.

This prevents the quadratic behavior caused by having to sort the
entries list once after every insertion.
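
To illustrate the idea in the commit message, here is a minimal sketch of "append everything, then sort once". It is not the libgit2 implementation: `entry`, `entry_vec`, `vec_append` and `vec_fill` are hypothetical names invented for this example, and the entries carry only a path. As in the commit message, the sketch assumes the caller guarantees unique, valid entries and performs no checks.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an index entry; only the path matters here. */
typedef struct {
    const char *path;
} entry;

/* Hypothetical growable array standing in for the index's entries vector. */
typedef struct {
    entry *items;
    size_t len, cap;
} entry_vec;

static int entry_cmp(const void *a, const void *b)
{
    return strcmp(((const entry *)a)->path, ((const entry *)b)->path);
}

/* Append at the end without keeping the vector sorted: amortized O(1). */
static int vec_append(entry_vec *v, const char *path)
{
    if (v->len == v->cap) {
        size_t cap = v->cap ? v->cap * 2 : 16;
        entry *items = realloc(v->items, cap * sizeof(*items));
        if (!items)
            return -1;
        v->items = items;
        v->cap = cap;
    }
    v->items[v->len++].path = path;
    return 0;
}

/*
 * Bulk fill: append every path, then sort a single time at the end.
 * Assumption: the caller guarantees the paths are unique and valid,
 * so no per-entry lookup or validation happens here.
 */
static int vec_fill(entry_vec *v, const char **paths, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (vec_append(v, paths[i]) < 0)
            return -1;

    if (v->len > 1)
        qsort(v->items, v->len, sizeof(*v->items), entry_cmp);
    return 0;
}
```

The per-entry work in this sketch is a plain append, so the only ordering cost is the single `qsort` call after the loop, instead of ordering work repeated on every insertion.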
@ethomson
Member

👍

carlosmn added a commit that referenced this pull request Dec 16, 2015
merge: Use `git_index__fill` to populate the index
@carlosmn carlosmn merged commit f824259 into master Dec 16, 2015
@vmg
Member Author

vmg commented Dec 16, 2015

Damn, that was some quick code review hmmpf.

@vmg
Member Author

vmg commented Dec 16, 2015

By hmmpf I meant that I wasn't at all sure about the quality of this implementation (am I right in my assumptions about entries not colliding and being valid?).

Guess it's time to blindly trust @ethomson again :D

@carlosmn
Member

This is in a new index, so we won't collide with any old entries. And the staged/successfully-merged entries would only have a single name, so I figured it was fine.

@vmg
Member Author

vmg commented Dec 16, 2015

Ayup, I think I agree. I have however noticed that we're not adjusting the length mask on the flags of the index... It doesn't seem like the length mask is used anywhere, but we should probably adjust it nonetheless. And we should also perform filemode normalization, just in case the original filemode from the tree is not valid.
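
For readers unfamiliar with the two adjustments mentioned above, here is a rough sketch of what they usually look like. It is modeled on Git's documented index format (the low 12 bits of an entry's flags hold the path length, capped at 0xFFF) and on Git's usual mode canonicalization; it is not code from this pull request. `flags_with_namelen`, `normalize_mode` and the `MODE_*` constants are hypothetical names chosen for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Low 12 bits of the flags field hold the path length (Git index format). */
#define NAME_MASK 0x0fffu

/* File type bits, spelled out so the sketch does not need <sys/stat.h>. */
#define MODE_TYPE_MASK 0170000u
#define MODE_DIR       0040000u
#define MODE_REGULAR   0100000u
#define MODE_SYMLINK   0120000u
#define MODE_GITLINK   0160000u

/* Store the path length in the flags, saturating at 0xFFF for long paths. */
static uint16_t flags_with_namelen(uint16_t flags, size_t path_len)
{
    uint16_t len = path_len < NAME_MASK ? (uint16_t)path_len : (uint16_t)NAME_MASK;

    return (uint16_t)((flags & ~NAME_MASK) | len);
}

/*
 * Reduce an arbitrary mode to one of the canonical values:
 * symlink, gitlink, or a regular file that is either 0644 or 0755.
 */
static uint32_t normalize_mode(uint32_t mode)
{
    uint32_t type = mode & MODE_TYPE_MASK;

    if (type == MODE_SYMLINK)
        return MODE_SYMLINK;
    if (type == MODE_DIR || type == MODE_GITLINK)
        return MODE_GITLINK;

    /* Only the owner-executable bit decides between 0755 and 0644. */
    return MODE_REGULAR | ((mode & 0100u) ? 0755u : 0644u);
}
```

With this sketch, a tree entry recorded with mode `0100664` would be stored as `0100644`, and a 5000-byte path would store `0xFFF` in the length bits rather than overflowing into the other flag bits.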

@peff
Member

peff commented Dec 16, 2015

Nice catch. Seems like a classic "accidentally quadratic" case.

If you were really worried about collisions, you should be able to do an O(n) scan of the vector after the sort. I'm not sure how concerned to be, since I'm not sure where the entries are coming from. What if you had a tree with duplicate entries? That's not supposed to happen, and it's going to produce a slightly-off answer in the name of speed (at least, that's Git's attitude). As long as it wouldn't send us into an infinite loop or anything. :)
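
The O(n) scan suggested here could look roughly like the following. This is only a sketch over a plain array of already-sorted path strings; `has_adjacent_duplicates` is a hypothetical helper, not an existing libgit2 function. It relies on the fact that after sorting, equal paths must sit next to each other.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if any two neighbouring paths are equal, 0 otherwise.
 * Assumption: sorted_paths is already sorted, so duplicates are adjacent
 * and a single linear pass is enough to find them.
 */
static int has_adjacent_duplicates(const char *const *sorted_paths, size_t n)
{
    size_t i;

    for (i = 1; i < n; i++)
        if (strcmp(sorted_paths[i - 1], sorted_paths[i]) == 0)
            return 1;
    return 0;
}
```

Running such a check once after the final sort keeps the overall cost dominated by the sort itself, while still catching a tree that contains the same name twice.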

@ethomson ethomson deleted the vmg/index-fill branch January 9, 2019 10:25