Recursive Merge #3513

Merged
merged 17 commits into from
Nov 30, 2015

ethomson
Member

When we built merge, we didn't do recursive base-building. We thought it would be super critical (after all, git-merge-recursive is the default for git itself), but here we are several years later and we're just now getting to it. We have a customer who does a lot of merging master into their feature branches and then merging back, and they end up with crazy criss-crosses. And despite our general disapproval of this workflow, it's something that git does better than libgit2, so here we are.

Conceptually this is quite straightforward: we still do a three-way merge, but instead of looking for a single common ancestor for two commits, now we look for all the merge bases. If there's more than one, we merge them and use the resultant tree as a merge base. (Of course, those merge bases could have multiple merge bases themselves, hence the name "recursive merge").

But that's not the only way you could end up recursing to build a base - you could also have three merge bases for a pair of commits. In this case, you need to merge the first two, creating a "virtual merge base", then merge that with the third to create a final virtual merge base that you can use as the actual common ancestor.

Because of this, you can't keep only the resultant merge data (the index); you actually need to build a "virtual commit" so that you can keep ancestry (and can then find the merge base between the new virtual merge base and the third real commit).
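To make the base-building described above concrete, here is a minimal toy sketch in plain C. It does not use libgit2 at all: the `commit` struct, `add_commit`, `merge_bases`, `three_way` and `merge_recursive` are all hypothetical names invented for illustration, a "tree" is just one integer per file, and ancestry is a bitmask (so at most 32 commits). The point is only to show that virtual commits carry parents, which is what lets a virtual base be merged with a further real merge base.

```c
#include <assert.h>
#include <stdint.h>

#define NFILES   2
#define MAXC     32     /* toy limit: ancestry is stored in a 32-bit mask */
#define CONFLICT (-1)   /* toy marker for a conflicted file */

typedef struct {
    int parents[2];     /* indices into commits[], -1 if absent */
    int tree[NFILES];   /* toy "tree": one integer of content per file */
    uint32_t anc;       /* bitmask of this commit plus all of its ancestors */
} commit;

static commit commits[MAXC];
static int ncommits;

/* Record a commit (real or virtual) and return its index. */
static int add_commit(int p0, int p1, const int *tree)
{
    assert(ncommits < MAXC);
    commit *c = &commits[ncommits];
    c->parents[0] = p0;
    c->parents[1] = p1;
    c->anc = UINT32_C(1) << ncommits;
    if (p0 >= 0) c->anc |= commits[p0].anc;
    if (p1 >= 0) c->anc |= commits[p1].anc;
    for (int f = 0; f < NFILES; f++) c->tree[f] = tree[f];
    return ncommits++;
}

/* Collect the common ancestors of a and b that are not themselves an
 * ancestor of another common ancestor. Returns how many were found. */
static int merge_bases(int a, int b, int *out)
{
    uint32_t common = commits[a].anc & commits[b].anc;
    int n = 0;
    for (int c = 0; c < ncommits; c++) {
        if (!(common & (UINT32_C(1) << c))) continue;
        int best = 1;
        for (int d = 0; d < ncommits; d++)
            if (d != c && (common & (UINT32_C(1) << d)) &&
                (commits[d].anc & (UINT32_C(1) << c)))
                best = 0;
        if (best) out[n++] = c;
    }
    return n;
}

/* File-by-file three-way merge of toy trees. */
static void three_way(const int *base, const int *ours, const int *theirs, int *out)
{
    for (int f = 0; f < NFILES; f++) {
        if (ours[f] == theirs[f])      out[f] = ours[f];
        else if (ours[f] == base[f])   out[f] = theirs[f];
        else if (theirs[f] == base[f]) out[f] = ours[f];
        else                           out[f] = CONFLICT;
    }
}

/* Merge a and b. With several merge bases, fold them into one virtual
 * base first; each fold is itself a recursive merge. */
static int merge_recursive(int a, int b)
{
    int bases[MAXC], tree[NFILES];
    int empty[NFILES] = {0};
    const int *base_tree = empty;   /* no common ancestor: empty base */
    int n = merge_bases(a, b, bases);

    if (n > 0) {
        int base = bases[0];
        for (int i = 1; i < n; i++)
            base = merge_recursive(base, bases[i]);  /* virtual commit */
        base_tree = commits[base].tree;
    }
    three_way(base_tree, commits[a].tree, commits[b].tree, tree);
    return add_commit(a, b, tree);  /* keeps parents, hence ancestry */
}
```

In a criss-cross built with these helpers, merging against just the first merge base can report a conflict that disappears once the two bases are merged into a virtual base. Note that this toy sets a limit on nothing and stores virtual commits in the same array as real ones, purely to keep the sketch short.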

My initial thought here was to spin up an in-memory ODB backend (using vmg's mempack backend) and put these virtual commits into that ODB backend. However, this turns out to be pretty unfortunate when you end up trying to write the index into trees, which is super unnecessary work.

So instead, we make git_annotated_commit have a "virtual" mode, that contains only parents and a git_index that represents the contents, and we merge those.

Remaining work:

  • Recursion limit. At present, there is no limit on the recursion. On a very large tree, there should be a limit, and we should either fail or stop recursing and just use the first merge base.
  • Conflicts merging the bases to create a virtual merge base. This is just going to be truly godawful.

This remaining work is pretty likely to be truly terrible because these are places where git itself assumes that it's going to crap its bad output right out into the working directory with terrible conflict markers and assumes that everything will be okay.

@ethomson
Member Author

Well, handling conflicts was less bad than I expected. My bias against it was because I use diff3 style merges, and conflicts in the recursive base are horrible in that case, but otherwise, it does seem to work out quite neatly.

I added some private flags here. I considered making this a public setting (the stage-conflicts file favor type). But then the labels for ancestor / ours / theirs would likely also need to be public (in the merge options). This has a really nice advantage that we can now get rid of the git_merge_file_options type and just use git_merge_options everywhere. But it has the disadvantage that the git_merge function takes a git_merge_options that specifies these labels... and a git_checkout_options that also specifies these labels.

And the git_checkout_options one is likely the one that will win (since it's actually writing conflict files) - the git_merge_options would actually only be used when building the recursive merge base! Very confusing.

So although it seemed like quite the right approach initially, I decided it was not and did this quiet private flags approach that worked out quite cleanly in the end.

ethomson force-pushed the merge_recursive branch 2 times, most recently from bb938ec to d78d738 (November 21, 2015 16:22)
@@ -152,7 +160,7 @@ typedef enum {

 	/** Take extra time to find minimal diff */
 	GIT_MERGE_FILE_DIFF_MINIMAL = (1 << 7),
-} git_merge_file_flags_t;
+} git_merge_file_flag_t;
Member

This name change should also be reflected in the CHANGELOG.

Member Author

Updated.

ethomson and others added 17 commits November 25, 2015 15:37
Add a simple recursive test - where multiple ancestors exist and
creating a virtual merge base from them would prevent a conflict.

When the commits to merge have multiple common ancestors, build a
"virtual" base tree by merging the common ancestors.

When there are more than two common ancestors, continue merging the
virtual base with the additional common ancestors, effectively
octopus merging a new virtual base.

Use annotated commits to act as our virtual bases, instead of regular
commits, to avoid polluting the odb with virtual base commits and
trees.  Instead, build an annotated commit with an index and pointers
to the commits that it was merged from.

When building a recursive merge base, allow conflicts to occur.
Use the file (with conflict markers) as the common ancestor.

The user has already seen and dealt with this conflict by virtue
of having a criss-cross merge.  If they resolved this conflict
identically in both branches, then there will be no conflict in the
result.  This is the best case scenario.

If they did not resolve the conflict identically in the two branches,
then we will generate a new conflict.  If the user is simply using
standard conflict output then the results will be fairly sensible.
But if the user is using a mergetool or using diff3 output, then the
common ancestor will be a conflict file (itself with diff3 output,
haha!).  This is quite terrible, but it matches git's behavior.
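The behaviour in that last commit message can be illustrated with a small sketch. This is a toy under stated assumptions, not libgit2 code: `merge_file` is a hypothetical helper that merges whole files rather than hunks, and the `ours` / `theirs` marker labels are made up for the example. It only shows why a conflicted virtual ancestor is harmless when both branches resolved the conflict the same way.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical whole-file three-way merge (real merges work per hunk).
 * Writes the result into out and returns 1 on conflict, 0 when clean. */
static int merge_file(const char *base, const char *ours, const char *theirs,
                      char *out, size_t outlen)
{
    /* Both sides agree, or only our side changed: take ours. */
    if (strcmp(ours, theirs) == 0 || strcmp(theirs, base) == 0) {
        snprintf(out, outlen, "%s", ours);
        return 0;
    }
    /* Only their side changed: take theirs. */
    if (strcmp(ours, base) == 0) {
        snprintf(out, outlen, "%s", theirs);
        return 0;
    }
    /* Both sides changed differently: emit a file with conflict markers. */
    snprintf(out, outlen, "<<<<<<< ours\n%s=======\n%s>>>>>>> theirs\n",
             ours, theirs);
    return 1;
}
```

Merging the two merge bases with this helper may yield a marked-up file; that file is then passed as `base` for the real merge. If both branch tips contain the same resolution, the first test in the function succeeds and the result is clean. If they differ, neither matches the marked-up ancestor, so a fresh conflict is produced.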
-	/** see `git_merge_file_flags_t` above */
-	unsigned int flags;
+	/** see `git_merge_file_flag_t` above */
+	git_merge_file_flag_t flags;
Contributor

Multiple options from git_merge_file_flag_t may be specified, that is, the diff style, whitespace options, etc.

Member Author

Yes, and this doesn't preclude that. This is to match the style in the rest of the library where enum types are named in the singular, even when they will be combined. For example: git_repository_open_flag_t.
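As a small illustration of that point, here is a sketch with a made-up enum (the `demo_` / `DEMO_` names and bit values are hypothetical, not the library's definitions). The type is named in the singular, each enumerator is a single bit, and a field of that type can still hold several of them OR-ed together.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag type: singular name, one bit per enumerator. */
typedef enum {
    DEMO_MERGE_FILE_DEFAULT           = 0,
    DEMO_MERGE_FILE_STYLE_DIFF3       = (1 << 0),
    DEMO_MERGE_FILE_IGNORE_WHITESPACE = (1 << 1),
    DEMO_MERGE_FILE_DIFF_MINIMAL      = (1 << 2)
} demo_merge_file_flag_t;

/* Hypothetical options struct with a field typed as the enum. */
typedef struct {
    demo_merge_file_flag_t flags;
} demo_merge_file_options;

/* Returns nonzero when the given single-bit flag is set in opts. */
static int demo_has_flag(const demo_merge_file_options *opts,
                         demo_merge_file_flag_t flag)
{
    assert(opts != NULL);
    return ((unsigned)opts->flags & (unsigned)flag) != 0;
}
```

In C, `opts.flags = DEMO_MERGE_FILE_STYLE_DIFF3 | DEMO_MERGE_FILE_IGNORE_WHITESPACE;` compiles as is, because the OR-ed `int` converts implicitly to the enum type. C++ callers would need an explicit cast for the same assignment.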

carlosmn added a commit that referenced this pull request Nov 30, 2015
@carlosmn carlosmn merged commit a27f31d into libgit2:master Nov 30, 2015
@ethomson ethomson deleted the merge_recursive branch February 29, 2016 20:59