Recursive Merge #3513
Well, handling conflicts was less bad than I expected. My bias against it was because I use I added some private flags here. I considered making this a public setting (the stage-conflicts file favor type). But then the labels for ancestor / ours / theirs would likely also need to be public (in the merge options). This has a really nice advantage that we can now get rid of the And the So although it seemed like quite the right approach initially, I decided it was not and did this quiet private flags approach that worked out quite cleanly in the end.
bb938ec to d78d738
@@ -152,7 +160,7 @@ typedef enum {
 	/** Take extra time to find minimal diff */
 	GIT_MERGE_FILE_DIFF_MINIMAL = (1 << 7),
-} git_merge_file_flags_t;
+} git_merge_file_flag_t;
This name change should also be reflected in the CHANGELOG.
Updated.
Add a simple recursive test - where multiple ancestors exist and creating a virtual merge base from them would prevent a conflict.
When the commits to merge have multiple common ancestors, build a "virtual" base tree by merging the common ancestors.
When there are more than two common ancestors, continue merging the virtual base with the additional common ancestors, effectively octopus merging a new virtual base.
Use annotated commits to act as our virtual bases, instead of regular commits, to avoid polluting the odb with virtual base commits and trees. Instead, build an annotated commit with an index and pointers to the commits that it was merged from.
When building a recursive merge base, allow conflicts to occur. Use the file (with conflict markers) as the common ancestor. The user has already seen and dealt with this conflict by virtue of having a criss-cross merge. If they resolved this conflict identically in both branches, then there will be no conflict in the result. This is the best case scenario. If they did not resolve the conflict identically in the two branches, then we will generate a new conflict. If the user is simply using standard conflict output then the results will be fairly sensible. But if the user is using a mergetool or using diff3 output, then the common ancestor will be a conflict file (itself with diff3 output, haha!). This is quite terrible, but it matches git's behavior.
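The conflicted-ancestor behavior described in the last commit message can be sketched with a toy model. This is not libgit2 code: file contents are modeled as a single `int`, `TOY_FILE_CONFLICT` stands in for "the file with conflict markers", and `toy_merge3` and `toy_crisscross` are hypothetical names invented for illustration.

```c
#include <stdio.h>

/* Assumption: a sentinel standing in for "file contents with conflict markers". */
#define TOY_FILE_CONFLICT (-1)

/* Toy three-way merge of one file whose contents are modeled as an int.
 * Sets *conflicted to 1 when both sides changed the file differently. */
static int toy_merge3(int base, int ours, int theirs, int *conflicted)
{
    *conflicted = 0;
    if (ours == theirs)
        return ours;    /* both sides agree */
    if (ours == base)
        return theirs;  /* only theirs changed */
    if (theirs == base)
        return ours;    /* only ours changed */
    *conflicted = 1;
    return TOY_FILE_CONFLICT;
}

/* Criss-cross scenario: two merge bases changed the same file differently,
 * so the virtual base is itself a conflicted file. Each branch then carries
 * its own resolution of that conflict. */
static int toy_crisscross(int resolution_a, int resolution_b, int *conflicted)
{
    int root = 1, base_a = 2, base_b = 3;
    int base_conflicted;

    /* Merging the two bases conflicts; the conflicted file is kept as the base. */
    int virtual_base = toy_merge3(root, base_a, base_b, &base_conflicted);

    /* The real merge uses the conflicted file as the common ancestor. */
    return toy_merge3(virtual_base, resolution_a, resolution_b, conflicted);
}
```

In this model, `toy_crisscross(4, 4, &c)` merges cleanly because both branches resolved the earlier conflict the same way, while `toy_crisscross(4, 5, &c)` reports a new conflict.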
d78d738 to 5b9c63c
-	/** see `git_merge_file_flags_t` above */
-	unsigned int flags;
+	/** see `git_merge_file_flag_t` above */
+	git_merge_file_flag_t flags;
Multiple options from git_merge_file_flag_t may be specified, that is, the diff style, whitespace options, etc.
Yes, and this doesn't preclude that. This is to match the style in the rest of the library where enum types are named in the singular, even when they will be combined. For example: git_repository_open_flag_t.
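To illustrate the point about singular enum names and combined values, here is a minimal sketch in plain C. The `demo_` names are hypothetical and are not libgit2 identifiers; only the convention (a field typed as a singular-named enum that holds several flags OR'ed together) is taken from the discussion above.

```c
/* Hypothetical flag enum, named in the singular like the libgit2 convention. */
typedef enum {
    DEMO_FLAG_DEFAULT = 0,
    DEMO_FLAG_STYLE_DIFF3 = (1 << 1),
    DEMO_FLAG_IGNORE_WHITESPACE = (1 << 3),
    DEMO_FLAG_DIFF_MINIMAL = (1 << 7)
} demo_flag_t;

/* Hypothetical options struct whose field is typed as the enum, not unsigned int. */
typedef struct {
    demo_flag_t flags;
} demo_options;

/* Returns nonzero when the given single flag bit is set in the options. */
static int demo_has_flag(const demo_options *opts, demo_flag_t flag)
{
    return (opts->flags & flag) != 0;
}

/* Builds options with two flags combined. The cast is not required in C,
 * but it keeps the code valid if it is ever compiled as C++. */
static demo_options demo_make_options(void)
{
    demo_options opts;
    opts.flags = (demo_flag_t)(DEMO_FLAG_STYLE_DIFF3 | DEMO_FLAG_DIFF_MINIMAL);
    return opts;
}
```

The enum-typed field still accepts several flags at once; the type name mainly documents which constants are meant to go there.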
When we built merge, we didn't do recursive base-building. We thought it would be super critical (after all, git-merge-recursive is the default for git itself) but here we are several years later and we're just now getting to it. We have a customer who does a lot of merging master into their feature branches and then merging back, and ends up with crazy criss-crosses. And despite our general disapproval of this workflow, it's something that git does better than libgit2, so here we are.
Conceptually this is quite straightforward: we still do a three-way merge, but instead of looking for a single common ancestor for two commits, now we look for all the merge bases. If there's more than one, we merge them and use the resultant tree as a merge base. (Of course, those merge bases could have multiple merge bases themselves, hence the name "recursive merge").
But that's not the only way you could end up recursing to build a base - you could also have three merge bases for a pair of commits. In this case, you need to merge the first two, creating a "virtual merge base", then merge that with the third to create a final virtual merge base that you can use as the actual common ancestor.
Because of this, you can't keep only the resultant merge data (the index), you actually need to build a "virtual commit", so that you can keep ancestry (so that you can find the merge base between the new virtual merge base and the third real commit).
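The base-folding idea can be sketched with a toy model. This is a simplified illustration, not the libgit2 implementation: a tree is an array of `int` file contents, every name (`toy_tree`, `toy_merge_trees`, `toy_virtual_base`, `toy_demo_conflicts`) is hypothetical, and the common ancestor of the merge bases is passed in explicitly rather than discovered recursively from ancestry, which is the part a real virtual commit exists to support.

```c
#include <stddef.h>

#define TOY_NFILES 2
#define TOY_CONFLICT (-1) /* stands in for a file left with conflict markers */

typedef struct {
    int files[TOY_NFILES]; /* each file's contents modeled as one int */
} toy_tree;

/* Three-way merge of two trees against a base; returns the conflict count. */
static int toy_merge_trees(toy_tree *out, const toy_tree *base,
                           const toy_tree *ours, const toy_tree *theirs)
{
    int conflicts = 0;
    for (size_t i = 0; i < TOY_NFILES; i++) {
        int b = base->files[i], o = ours->files[i], t = theirs->files[i];
        if (o == t)      out->files[i] = o;  /* both sides agree */
        else if (o == b) out->files[i] = t;  /* only theirs changed */
        else if (t == b) out->files[i] = o;  /* only ours changed */
        else { out->files[i] = TOY_CONFLICT; conflicts++; }
    }
    return conflicts;
}

/* Fold n >= 1 merge bases into one virtual base: merge the first two,
 * then merge that result with each additional base in turn.
 * Assumption: all bases share the single ancestor `root`. */
static toy_tree toy_virtual_base(const toy_tree *root,
                                 const toy_tree *bases, size_t n)
{
    toy_tree vbase = bases[0];
    for (size_t i = 1; i < n; i++) {
        toy_tree next;
        toy_merge_trees(&next, root, &vbase, &bases[i]);
        vbase = next;
    }
    return vbase;
}

/* Criss-cross demo: returns the number of conflicts in the final merge,
 * using either the virtual base or just one of the two real bases. */
static int toy_demo_conflicts(int use_virtual_base)
{
    toy_tree root = {{1, 1}};
    toy_tree bases[2] = {{{2, 1}}, {{1, 3}}}; /* each base changed one file */
    toy_tree ours = {{5, 3}}, theirs = {{2, 3}}, result;
    toy_tree base = use_virtual_base ? toy_virtual_base(&root, bases, 2)
                                     : bases[1];
    return toy_merge_trees(&result, &base, &ours, &theirs);
}
```

In this scenario, picking a single merge base (`toy_demo_conflicts(0)`) produces a conflict on the first file, while merging against the folded virtual base (`toy_demo_conflicts(1)`) does not, which is the kind of case the simple recursive test in the commit list describes.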
My initial thought here was to spin up an in-memory ODB backend (using vmg's mempack backend) and put these virtual commits into that ODB backend. However, this turns out to be pretty unfortunate when you end up trying to write the index into trees, which is super unnecessary work.
So instead, we make git_annotated_commit have a "virtual" mode that contains only parents and a git_index that represents the contents, and we merge those.
Remaining work:
This remaining work is pretty likely to be truly terrible because they're places where git itself assumes that it's going to crap its bad output right out into the working directory with terrible conflict markers and assume that everything will be okay.