pks-gitlab

This pull request introduces support for the "refStorage" extension. On the one hand, it adds support for reading repositories that have this extension, even though we naturally only understand the "files" format right now. On the other hand, it moves the logic to initialize the refdb into the refdb backends so that it can be customized for every ref format.

I've decided to pull out these changes into a separate pull request in preparation for the reftable format. It's already somewhat non-trivial, so I think that reviewing it separately might be easier.

@pks-t pks-t requested a review from ethomson July 11, 2025 15:40
@pks-gitlab pks-gitlab force-pushed the pks-refformat-extension branch from ab5f957 to 8d0ff81 Compare July 11, 2025 15:50
@pks-gitlab
Author

The Windows failures are all unrelated to my changes, as far as I can see.

pks-t added 5 commits August 4, 2025 16:34

When we read the repository format information we do so by using the
full configuration of that repository. This configuration not only
includes the repository-level configuration though, but it also includes
the global- and system-level configuration. These configurations should
in practice never contain information about which format a specific
repository uses.

Besides this obvious conceptual error there's also a more subtle issue:
reading the full configuration may require us to evaluate conditional
includes. Those conditional includes may themselves require that the
repository format is already populated though. This is for example the
case with the "onbranch" condition: we need to populate the refdb to
evaluate that condition, but to populate the refdb we need to first know
about the repository format.

Fix this by using only the repository-level configuration to determine
the repository's format.

To support multiple different reference backend implementations,
Git introduced a "refStorage" extension that stores the reference
storage format a Git client should try to use.

Wire up the logic to read this new extension when we open a repository
from disk. For now, only the "files" backend is supported by us. When
trying to open a repository that has a refstorage format that we don't
understand we now error out.
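The lookup described above can be pictured with a minimal sketch. The names `ref_format_t` and `ref_format_parse()` are hypothetical and not taken from libgit2, and the sketch assumes that a missing extension implies the "files" format and that values are matched exactly:

```c
#include <string.h>

/* Hypothetical names for illustration; not libgit2's actual API. */
typedef enum {
	REF_FORMAT_FILES = 1
} ref_format_t;

/*
 * Map the value of "extensions.refStorage" to a known format.
 * A NULL value stands for an absent extension, which is assumed
 * to imply the "files" format.
 * Returns 0 on success and -1 for a format we do not understand.
 */
static int ref_format_parse(ref_format_t *out, const char *value)
{
	if (value == NULL || strcmp(value, "files") == 0) {
		*out = REF_FORMAT_FILES;
		return 0;
	}

	/* Any other value, for example "reftable", is rejected for now. */
	return -1;
}
```

A caller opening a repository would propagate the `-1` as an error instead of silently falling back to "files", which is the behavior the commit message describes.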

There are two functions that create a new repository that doesn't really
have references. While those are mostly non-functional when it comes to
references, we do expect that you can access the refdb, even if it's not
yielding any refs. For now we mark those to use the "files" backend, so
that the status quo is retained. Eventually though it might not be the
worst idea to introduce an explicit "in-memory" reference database. But
that is outside the scope of this patch series.

While we only support initializing repositories with the "files"
reference backend right now, we are in the process of implementing a
second backend with the "reftable" format. And while we already have the
infrastructure to decide which format a repository should use when we
open it, we do not have infrastructure yet to create new repositories
with a different reference format.

Introduce a new field `git_repository_init_options::refdb_type`. If
unset, we'll default to the "files" backend. Otherwise though, if set to
a valid `git_refdb_t`, we will use that new format to initialize the
repository.
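The defaulting rule for the new field could look roughly like the sketch below. The enum values and `resolve_refdb_type()` are hypothetical stand-ins; the actual `git_refdb_t` constants and the surrounding init code may differ:

```c
/* Hypothetical stand-ins; the real git_refdb_t values may differ. */
typedef enum {
	REFDB_UNSET = 0,
	REFDB_FILES = 1
} refdb_type_t;

/*
 * Resolve the refdb type requested in the init options.
 * An unset value falls back to the "files" backend, a known value
 * is used as-is, and anything else is reported as an error (-1).
 */
static int resolve_refdb_type(refdb_type_t *out, int requested)
{
	switch (requested) {
	case REFDB_UNSET:
	case REFDB_FILES:
		*out = REFDB_FILES;
		return 0;
	default:
		return -1;
	}
}
```

Keeping the unset value at zero means that zero-initialized options keep the existing "files" behavior.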

Note that for now the only thing we do is to write the "refStorage"
extension accordingly. What we explicitly don't yet do is to also handle
the backend-specific logic to initialize the refdb on disk. This will be
implemented in subsequent commits.

In our tests for "onbranch" config conditionals we set HEAD to point to
various different branches via `git_repository_create_head()`. This
function circumvents the refdb though and directly writes to the "HEAD"
file. While this works now, it will create problems once we have
multiple refdb backends.

Furthermore, the function is about to go away in the next commit. So
let's prepare for that and use `git_reference_symbolic_create()`
instead.

The initialization of the on-disk state of refdbs is currently not
handled by the actual refdb backend, but it's implemented ad-hoc where
needed. This is problematic once we have multiple different refdbs as
the filesystem structure is of course not the same.

Introduce a new callback function `git_refdb_backend::init()`. If set,
this callback can be invoked via `git_refdb_init()` to initialize the
on-disk state of a refdb. Like this, each backend can decide for itself
how exactly to do this.
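The dispatch through an optional callback can be sketched as follows. This is a simplified model, not the real `git_refdb_backend` definition: the struct layout, the `initial_head` parameter, and the choice to treat a missing callback as a no-op are all assumptions made for illustration:

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a refdb backend. */
typedef struct refdb_backend {
	/* Optional: create the backend's on-disk state. */
	int (*init)(struct refdb_backend *backend, const char *initial_head);
	int init_calls; /* bookkeeping for this example only */
} refdb_backend;

/* Example callback; a real "files" backend would write to disk here. */
static int files_init(refdb_backend *backend, const char *initial_head)
{
	(void)initial_head;
	backend->init_calls++;
	return 0;
}

/*
 * Generic entry point: defer to the backend if it provides a callback.
 * Assumption: a backend without a callback has nothing to initialize.
 */
static int refdb_init(refdb_backend *backend, const char *initial_head)
{
	if (backend->init == NULL)
		return 0;
	return backend->init(backend, initial_head);
}
```

With this shape, repository creation only ever calls the generic entry point, and each backend owns its filesystem layout.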

Note that the initialization of the refdb is a bit intricate. A
repository is only recognized as such when it has a "HEAD" file as well
as a "refs/" directory. Consequently, regardless of which refdb format
we use, those files must always be present. This also proves to be
problematic for us, as we cannot access the repository, and thus the
refdb, if those files don't exist.

To work around the issue we thus handle the creation of those files
outside of the refdb-specific logic. We actually use the same strategy
as Git does, and write the invalid reference "ref: refs/heads/.invalid"
into "HEAD". This looks almost like a ref, but the name of that ref
is not valid and should thus trip up Git clients that try to read that
ref in a repository that really uses a different format.
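Why the placeholder name is invalid can be shown with a partial check. Git's ref name rules forbid any slash-separated component that begins with a dot; the helper below, whose name is hypothetical, tests only that single rule and is not a complete ref name validator:

```c
#include <stdbool.h>

/*
 * Partial check covering one rule only: no slash-separated
 * component of a ref name may begin with a dot.
 * Returns true if the name violates that rule.
 */
static bool has_dot_component(const char *refname)
{
	bool at_component_start = true;

	for (const char *p = refname; *p != '\0'; p++) {
		if (at_component_start && *p == '.')
			return true;
		/* The next character starts a component after a slash. */
		at_component_start = (*p == '/');
	}

	return false;
}
```

Here `has_dot_component("refs/heads/.invalid")` is true because of the `.invalid` component, while an ordinary name such as `refs/heads/main` passes this particular rule.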

So while that invalid "HEAD" reference will of course get rewritten by
the "files" backend, other backends should just retain it as-is.

@pks-gitlab pks-gitlab force-pushed the pks-refformat-extension branch from 8fcaec5 to 72e29b9 Compare August 4, 2025 14:34