Add GIT_REPOSITORY_OPEN_FROM_ENV flag to respect $GIT_* environment vars #3711
cl_setenv("GIT_CEILING_DIRECTORIES", ceiling_dirs);
cl_git_pass(git_repository_discover_default(&found_path));
cl_setenv("GIT_DIR", NULL);
cl_setenv("GIT_CEILING_DIRECTORIES", NULL);
We should probably move these into the test cleanup code - that way they will get unset properly even if git_repository_discover_default fails.
Done.
This looks good to me. I like the careful error handling. :)
42b595b
to
31c429e
Compare
Looks like one of the Travis builds spuriously failed. Retrying...
31c429e
to
795309e
Compare
OK, I updated the test to add a cleanup that unsets the environment variables.
This is very different from how the rest of the library operates. This would be the first place where we ourselves act on git environment variables so we need to be careful in how we introduce this kind of thing. We do act on This does not act on the By defaulting to Why stop at these variables? Why not look at |
@carlosmn Building a command-line application similar to those provided by git itself seems like one of the primary use cases of libgit2. Such applications should, by default, act on the standard git environment variables the same way git does, so that they fit in. I absolutely do agree that libgit2 should never force the use of these environment variables, as that would break other use cases; however, it seems appropriate to me to provide a function that handles one of the most common use cases correctly, to avoid having to reimplement that functionality (potentially incorrectly) in each client. Right now, it's easier to ignore variables like $GIT_CEILING_DIRECTORIES than to implement support for them; I'd like to make it easier to use them than to ignore them, so that command-line tools built on libgit2 will likely respect them. Anyone who wants to ignore all of these variables can still call
Good catch; I can easily fix that. I can just pass GIT_REPOSITORY_OPEN_NO_SEARCH if $GIT_DIR was set. I'll update the patch and the tests to fix that.
This function is explicitly supposed to be "open the default repository that a git command-line tool would open". So, it should match git's behavior of searching from the current directory up to the ceiling dirs. I agree that other functions in the library should not care about the current directory (though it's still possible to pass them relative paths), but in this case...
I'd like to do so. $GIT_WORK_TREE and $GIT_INDEX_FILE seem trivial to handle, actually. I should change this function to
This is intended to be the "give me the repository a git command-line tool would open" function; it should be general across all tools that behave like those git itself ships, which seems like a sufficiently common use case for libgit2. To the extent that this isn't sufficiently general, I can and should fix that. :) |
I agree with @joshtriplett's rationale. As long as libgit2 doesn't take into account the environment variables by default, adding a helper API that mimics Git's convention seems useful. The alternative is making many users of the library re-write the exact same logic. That's shitty for the users. @ethomson care to weigh in?
I'm mostly in agreement. I think that this is a useful addition to the library, provided it has git-like semantics (and it seems like we're all in agreement that it should).
I don't like calling this the "default repository", it makes it sound like there is a default directory, when this is really the repository based on the current working directory. This may not be a significant distinction if you're writing a CLI, but the concept of a "default repository" doesn't make much sense to anybody who's not.
We're very flexible about object databases and this should be quite supportable with the existing code.
Sure, I do agree that's a useful functionality. But that's not what the function does. It only covers one small aspect. If you want to do that, you'd still need to reimplement the logic even with this function available. Which is why I said that it felt like it came out of a specific use-case. Name-wise, I'd rather go with something like I would still prefer it if it accepted the path to take as the basis instead of assuming you would always want the cwd (you can accept |
I'll update the function to take more of the git environment variables into account, as well as the correct |
795309e
to
7846932
Compare
Pushing a WIP version for review; I still need to write test cases. I've added support for all the standard git environment variables: |
Do you want libgit2 to respect any environment variables?
7846932
to
86ee304
Compare
Updated with extensive testcases, covering all the supported environment variables. In the course of developing these test cases, I found a bug in the existing ceiling_dirs handling in |
Any thoughts on this approach, either on the API or on this implementation of it?
if (odb)
	git_repository_set_odb(repo, odb);

if (git_buf_is_allocated(&alts_buf)) {
Not sure that I understand this construct. git_buf_size(&alts_buf) > 0 might make more sense? But I think the best option is to move the git__getenv down here to keep the same pattern as the above variables.
git_buf_is_allocated checks whether the buffer has been initialized.
In this function, I kept all the git__getenv calls and associated errors together and above the git_repository_open_ext call, so that any error obtaining the environment variable would fail early and not have to clean up the open repository. However, setting up alternates requires the repository.
I can move the call, though; the order of error handling doesn't really matter.
Sorry, my point about is_allocated is that it's not quite what you're interested in; it measures whether the library was the one to allocate a git_buf (as opposed to an external caller). It doesn't measure whether a git_buf has contents in it.
Are you preferring this because you want to support environment variables that are an empty string? If so this is effective but feels a bit fragile.
Supporting empty strings was part of it; I used is_allocated as a "has this been set by git__getenv" check. It returns false for GIT_BUF_INIT and true for the value produced by git__getenv. That doesn't seem fragile.
I think it'll be easiest to just move the git__getenv call.
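The distinction being debated in this thread, "never set" versus "set to an empty string", can be sketched outside libgit2 with plain `getenv`. This is only an illustration: `env_state`, `classify_env`, and `classify_env_var` are hypothetical names, not part of libgit2, and `git_buf` is not modeled at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Three states a size-only check (size > 0) would collapse into two. */
enum env_state { ENV_UNSET, ENV_EMPTY, ENV_VALUE };

/* Classify a pointer as returned by getenv(): NULL means the variable
 * is not set; an empty string means it is set but has no contents. */
enum env_state classify_env(const char *value)
{
    if (value == NULL)
        return ENV_UNSET;
    return value[0] == '\0' ? ENV_EMPTY : ENV_VALUE;
}

/* Convenience wrapper that looks the variable up in the environment. */
enum env_state classify_env_var(const char *name)
{
    assert(name != NULL);
    return classify_env(getenv(name));
}
```

A caller that only tests for a non-zero length cannot tell `ENV_UNSET` from `ENV_EMPTY`, which is the case the "has this been set" check was meant to preserve.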
I'm not in favor of the name Something like Why can't we simply make |
Something like this? git_repository_open_ext(&repo, NULL, GIT_REPOSITORY_OPEN_RESPECT_ENV, NULL); I could do that, but I think it'd complicate The intention was to make this new function the obvious call to make if you want to open the same repository that the I can certainly rename the function; |
No, I was suggesting adding This gives us flexibility in the future if we (for example) add some We could of course go ahead and take Just a thought. |
@ethomson I had that exact set of three flags in mind when I mentioned that none of them made sense together with this new one. But forward compatibility makes sense, sure. It does make the common case a bit less clean, but it still seems workable. I'll add the new flag, and document the (non-)interaction between it and the other flags.
Yeah, it's tough to know whether this is really worthwhile or not (how much are we really going to add to (I think a totally reasonable thing is just to change the existing function and the new function to be internal and make |
git only checks ceiling directories when its search ascends to a parent directory. A ceiling directory matching the starting directory will not prevent git from finding a repository in the starting directory or a parent directory. libgit2 handled the former case correctly, but differed from git in the latter case: given a ceiling directory matching the starting directory, but no repository at the starting directory, libgit2 would stop the search at that point rather than finding a repository in a parent directory.

Test case using git command-line tools:

    /tmp$ git init x
    Initialized empty Git repository in /tmp/x/.git/
    /tmp$ cd x/
    /tmp/x$ mkdir subdir
    /tmp/x$ cd subdir/
    /tmp/x/subdir$ GIT_CEILING_DIRECTORIES=/tmp/x git rev-parse --git-dir
    fatal: Not a git repository (or any of the parent directories): .git
    /tmp/x/subdir$ GIT_CEILING_DIRECTORIES=/tmp/x/subdir git rev-parse --git-dir
    /tmp/x/.git

Fix the testsuite to test this case (in one case fixing a test that depended on the current behavior), and then fix find_repo to handle this case correctly.

In the process, simplify and document the logic in find_repo():
- Separate the concepts of "currently checking a .git directory" and "number of iterations left before going further counts as a search" into two separate variables, in_dot_git and min_iterations.
- Move the logic to handle in_dot_git and append /.git to the top of the loop.
- Only search ceiling_dirs and find ceiling_offset after running out of min_iterations; since ceiling_offset only tracks the longest matching ceiling directory, if ceiling_dirs contained both the current directory and a parent directory, this change makes find_repo stop the search at the parent directory.
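The rule described in this commit message (the ceiling is consulted only when the search ascends) can be sketched as a small standalone model. This is not libgit2's find_repo: `find_repo_model` is a hypothetical helper that works on plain absolute path strings, takes a single ceiling directory instead of a list, and treats `repo_dir` as the one directory that contains a repository.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Walk upward from `start` looking for `repo_dir`. Paths are absolute,
 * without trailing slashes. `ceiling` may be NULL. On success, copies
 * the directory where the repository was found into `out`. */
bool find_repo_model(const char *start, const char *ceiling,
                     const char *repo_dir, char *out, size_t out_size)
{
    char path[1024];
    char *slash;

    assert(start != NULL && repo_dir != NULL && out != NULL);
    if (strlen(start) >= sizeof(path))
        return false;
    strcpy(path, start);

    for (;;) {
        /* The current directory is always checked first; on the first
         * pass this is the starting directory, ceiling or not. */
        if (strcmp(path, repo_dir) == 0) {
            if (strlen(path) >= out_size)
                return false;
            strcpy(out, path);
            return true;
        }

        /* Ascend one level. The filesystem root itself is not checked
         * in this simplified model. */
        slash = strrchr(path, '/');
        if (slash == NULL || slash == path)
            return false;
        *slash = '\0';

        /* The ceiling only matters once we have ascended: a parent
         * that equals the ceiling is never examined. */
        if (ceiling != NULL && strcmp(path, ceiling) == 0)
            return false;
    }
}
```

Under these assumptions the model reproduces the transcript above: starting in /tmp/x/subdir with ceiling /tmp/x fails, while ceiling /tmp/x/subdir still finds the repository in /tmp/x.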
GIT_REPOSITORY_OPEN_NO_SEARCH does not search up through parent directories, but still tries the specified path both directly and with /.git appended. GIT_REPOSITORY_OPEN_BARE avoids appending /.git, but opens the repository in bare mode even if it has a working directory. To support the semantics git uses when given $GIT_DIR in the environment, provide a new GIT_REPOSITORY_OPEN_NO_DOTGIT flag to not try appending /.git.
git_repository_open_ext provides parameters for the start path, whether to search across filesystems, and what ceiling directories to stop at. git commands have standard environment variables and defaults for each of those, as well as various other parameters of the repository. To avoid duplicate environment variable handling in users of libgit2, add a GIT_REPOSITORY_OPEN_FROM_ENV flag, which makes git_repository_open_ext automatically handle the appropriate environment variables. Commands that intend to act just like those built into git itself can use this flag to get the expected default behavior.

git_repository_open_ext with the GIT_REPOSITORY_OPEN_FROM_ENV flag respects $GIT_DIR, $GIT_DISCOVERY_ACROSS_FILESYSTEM, $GIT_CEILING_DIRECTORIES, $GIT_INDEX_FILE, $GIT_NAMESPACE, $GIT_OBJECT_DIRECTORY, and $GIT_ALTERNATE_OBJECT_DIRECTORIES. In the future, when libgit2 gets worktree support, git_repository_open_ext with this flag will also respect $GIT_WORK_TREE and $GIT_COMMON_DIR; until then, it will error out if either $GIT_WORK_TREE or $GIT_COMMON_DIR is set.
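As a rough sketch of the decision these two commits describe, the following model derives "how to open" from the values of $GIT_DIR and $GIT_CEILING_DIRECTORIES. Everything here is illustrative: `open_plan`, `plan_open`, and the `MODEL_OPEN_*` constants are hypothetical stand-ins, their values are not libgit2's, and the other variables listed above are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag values; these are NOT libgit2's actual constants. */
enum {
    MODEL_OPEN_NO_SEARCH = 1 << 0, /* do not walk up to parent directories */
    MODEL_OPEN_NO_DOTGIT = 1 << 1  /* do not try appending /.git */
};

struct open_plan {
    const char *start_path;   /* where the open or search begins */
    const char *ceiling_dirs; /* NULL when no ceiling applies */
    unsigned int flags;
};

/* `git_dir` and `ceiling_dirs` are the raw values of $GIT_DIR and
 * $GIT_CEILING_DIRECTORIES, or NULL when the variable is unset. */
void plan_open(const char *git_dir, const char *ceiling_dirs,
               struct open_plan *plan)
{
    assert(plan != NULL);

    if (git_dir != NULL) {
        /* An explicit $GIT_DIR is used exactly as given: no upward
         * search and no /.git suffix, so ceilings are irrelevant. */
        plan->start_path = git_dir;
        plan->ceiling_dirs = NULL;
        plan->flags = MODEL_OPEN_NO_SEARCH | MODEL_OPEN_NO_DOTGIT;
    } else {
        /* Otherwise search upward from the current directory,
         * bounded by the ceiling directories if any were given. */
        plan->start_path = ".";
        plan->ceiling_dirs = ceiling_dirs;
        plan->flags = 0;
    }
}
```

A caller would feed it `getenv("GIT_DIR")` and `getenv("GIT_CEILING_DIRECTORIES")`. With the flag added by this PR, none of this should be needed in client code; the request presumably reduces to one call of the shape suggested earlier in the thread, `git_repository_open_ext(&repo, NULL, GIT_REPOSITORY_OPEN_FROM_ENV, NULL)`.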
86ee304
to
0dd98b6
Compare
I think this makes sense - @carlosmn do you like the way this turned out?
Okay, one last request - would you mind documenting this in the |
@ethomson Will do.
Document GIT_REPOSITORY_OPEN_NO_DOTGIT and GIT_REPOSITORY_OPEN_FROM_ENV.
@ethomson Done.
…over_default Add GIT_REPOSITORY_OPEN_FROM_ENV flag to respect $GIT_* environment vars (cherry picked from commit ebeb56f)