@ezekielnewren ezekielnewren commented Oct 15, 2025

Changes in v2:

  • Added documentation about unambiguous types and FFI
  • Addressed comments on the mailing list

Original cover letter below:

Maintainer note: This patch series builds on top of en/xdiff-cleanup
and am/xdiff-hash-tweak (both of which are now in master).

The primary goal of this patch series is to
convert every field's type in xrecord_t and xdfile_t to be
unambiguous, in preparation for making them more Rust FFI friendly.
Additionally, the ha field in xrecord_t is split into
line_hash and minimal_perfect_hash.

The order of some of the fields has changed as called out by the commit
messages.

Before:

typedef struct s_xrecord {
	char const *ptr;
	long size;
	unsigned long ha;
} xrecord_t;

typedef struct s_xdfile {
	xrecord_t *recs;
	long nrec;
	long dstart, dend;
	bool *changed;
	long *rindex;
	long nreff;
} xdfile_t;

After part 2:

typedef struct s_xrecord {
	uint8_t const *ptr;
	size_t size;
	uint64_t line_hash;
	size_t minimal_perfect_hash;
} xrecord_t;

typedef struct s_xdfile {
	xrecord_t *recs;
	size_t nrec;
	bool *changed;
	size_t *reference_index;
	size_t nreff;
	ssize_t dstart, dend;
} xdfile_t;
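
To illustrate why the fixed-width types matter, here is a minimal sketch (not code from the series). struct record_sketch and the two helper functions are hypothetical names; only the field list is copied from the "after" xrecord_t above. unsigned long is 32 bits wide on LLP64 platforms such as 64-bit Windows and 64 bits wide on LP64 platforms, whereas uint64_t is 64 bits wide everywhere.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the "after" xrecord_t, for illustration only. */
struct record_sketch {
	uint8_t const *ptr;
	size_t size;
	uint64_t line_hash;
	size_t minimal_perfect_hash;
};

/* Width in bits of the hash field: fixed by the type, not by the ABI. */
static size_t line_hash_bits(void)
{
	return sizeof(((struct record_sketch *)0)->line_hash) * CHAR_BIT;
}

/* Width in bits of the type the old ha field used: depends on the ABI. */
static size_t unsigned_long_bits(void)
{
	return sizeof(unsigned long) * CHAR_BIT;
}
```

line_hash_bits() yields 64 on every platform that provides uint64_t, while unsigned_long_bits() may yield 32 or 64 depending on the data model.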

cc: "Kristoffer Haugsbakk" [email protected]
cc: Patrick Steinhardt [email protected]
cc: Phillip Wood [email protected]
cc: Chris Torek [email protected]

@gitgitgadget-git

There are issues in commit 02be002:
xdiff: use ssize_t for dstart/dend, make them last in xdfile_t
Commit not signed off

@gitgitgadget-git

There are issues in commit ad57e97:
xdiff: use size_t for xrecord_t.size
Commit not signed off

@gitgitgadget-git

There are issues in commit 08d66ad:
xdiff: use unambiguous types in xdl_hash_record()
Commit checks stopped - the message is too short
Commit not signed off

@gitgitgadget-git

There are issues in commit d27493d:
xdiff: change rindex from long to size_t in xdfile_t
Commit not signed off

@gitgitgadget-git

There are issues in commit 77cca9e:
xdiff: rename rindex -> reference_index
Commit not signed off

@ezekielnewren

/submit

@gitgitgadget-git

Submitted as [email protected]

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-git-2070/ezekielnewren/xdiff_cleanup_part2-v1

To fetch this version to local tag pr-git-2070/ezekielnewren/xdiff_cleanup_part2-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-git-2070/ezekielnewren/xdiff_cleanup_part2-v1

@gitgitgadget-git

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Ezekiel Newren via GitGitGadget" <[email protected]> writes:

> The primary goal of this patch series is to convert every field's type in
> xrecord_t and xdfile_t to be unambiguous, in preparation to make it more
> Rust FFI friendly. Additionally the ha field in xrecord_t is split into
> line_hash and minimal_perfect_hash.
>
> The order of some of the fields has changed as called out by the commit
> messages.
>
> Before:
>
> typedef struct s_xrecord {
> 	char const *ptr;
> 	long size;
> 	unsigned long ha;
> } xrecord_t;
>
> typedef struct s_xdfile {
> 	xrecord_t *recs;
> 	long nrec;
> 	long dstart, dend;
> 	bool *changed;
> 	long *rindex;
> 	long nreff;
> } xdfile_t;
>
>
> After part 2
>
> typedef struct s_xrecord {
> 	uint8_t const *ptr;
> 	size_t size;
> 	uint64_t line_hash;
> 	size_t minimal_perfect_hash;
> } xrecord_t;
>
> typedef struct s_xdfile {
> 	xrecord_t *recs;
> 	size_t nrec;
> 	bool *changed;
> 	size_t *reference_index;
> 	size_t nreff;
> 	ssize_t dstart, dend;
> } xdfile_t;

Excellent summary.

>
>
> Ezekiel Newren (9):
>   xdiff: use ssize_t for dstart/dend, make them last in xdfile_t
>   xdiff: make xrecord_t.ptr a uint8_t instead of char
>   xdiff: use size_t for xrecord_t.size
>   xdiff: use unambiguous types in xdl_hash_record()
>   xdiff: split xrecord_t.ha into line_hash and minimal_perfect_hash
>   xdiff: make xdfile_t.nrec a size_t instead of long
>   xdiff: make xdfile_t.nreff a size_t instead of long
>   xdiff: change rindex from long to size_t in xdfile_t
>   xdiff: rename rindex -> reference_index
>
>  xdiff-interface.c  |  2 +-
>  xdiff/xdiffi.c     | 29 +++++++++++------------
>  xdiff/xemit.c      | 28 +++++++++++-----------
>  xdiff/xhistogram.c |  4 ++--
>  xdiff/xmerge.c     | 30 ++++++++++++------------
>  xdiff/xpatience.c  | 14 +++++------
>  xdiff/xprepare.c   | 58 +++++++++++++++++++++++-----------------------
>  xdiff/xtypes.h     | 15 ++++++------
>  xdiff/xutils.c     | 32 ++++++++++++-------------
>  xdiff/xutils.h     |  6 ++---
>  10 files changed, 109 insertions(+), 109 deletions(-)
>
>
> base-commit: 143f58ef7535f8f8a80d810768a18bdf3807de26
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2070%2Fezekielnewren%2Fxdiff_cleanup_part2-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2070/ezekielnewren/xdiff_cleanup_part2-v1
> Pull-Request: https://github.com/git/git/pull/2070

@gitgitgadget-git

This patch series was integrated into seen via 6162146.

static int get_indent(xrecord_t *rec)
{
long i;
int ret = 0;

On the Git mailing list, "Kristoffer Haugsbakk" wrote (reply to this):

On Wed, Oct 15, 2025, at 23:18, Ezekiel Newren via GitGitGadget wrote:
> From: Ezekiel Newren <[email protected]>
>
> Rust uses u8 to refer to bytes in memory. Since xrecord_t.ptr is also
> referring to bytes in memory, rather than unicode code points, use

s/unicode/Unicode/

> uint8_t instead of char.
>
> Signed-off-by: Ezekiel Newren <[email protected]>
> ---
>[snip]

@gitgitgadget-git

User "Kristoffer Haugsbakk" <[email protected]> has been added to the cc: list.

@gitgitgadget-git

This branch is now known as en/xdiff-cleanup-2.

@gitgitgadget-git

This patch series was integrated into seen via 3aa8f9e.

@gitgitgadget-git

This patch series was integrated into seen via 112aa0a.

@gitgitgadget-git

There was a status update in the "New Topics" section about the branch en/xdiff-cleanup-2 on the Git mailing list:

Code clean-up.

Comments?
source: <[email protected]>

@gitgitgadget-git

This patch series was integrated into seen via 70b4e37.

* Davide Libenzi <[email protected]>
*
*/

On the Git mailing list, Ezekiel Newren wrote (reply to this):

On Wed, Oct 15, 2025 at 3:18 PM Ezekiel Newren via GitGitGadget
<[email protected]> wrote:
>
> From: Ezekiel Newren <[email protected]>
>
> The ha field is serving two different purposes, which makes the code
> harder to read. At first glance it looks like many places assume
> there could never be hash collisions between lines of the two input
> files. In reality, line_hash is used together with xdl_recmatch() to
> ensure correct comparisons of lines, even when collisions occur.
>
> To make this clearer, the old ha field has been split:
>   * line_hash: The straightforward hash of a line, requiring no
>     additional context.
>   * minimal_perfect_hash: Not a new concept, but now a separate
>     field. It comes from the classifier's general-purpose hash table,
>     which assigns each line a unique and minimal hash across the two
>     files.
>
> Signed-off-by: Ezekiel Newren <[email protected]>

I'm a bit surprised that nobody has commented on this patch. I thought
that someone would have criticized the length of the name
"minimal_perfect_hash" or asked me why I was splitting one field into
two.

I don't see any reason why this patch series shouldn't move forward.
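
The split described in the quoted commit message can be sketched as follows. This is an illustration under stated assumptions, not xdiff's classifier: struct line_sketch, struct classifier_sketch, classify(), hash_bytes() and SKETCH_BUCKETS are hypothetical names, FNV-1a is a stand-in hash, and the table uses simple linear probing. It shows line_hash depending only on a line's bytes, while minimal_perfect_hash is a dense id that is only handed out after a byte-for-byte comparison, so hash collisions cannot make two different lines look equal.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BUCKETS 64 /* power of two; sized for the demo only */

struct line_sketch {
	const uint8_t *ptr;
	size_t size;
	uint64_t line_hash;          /* depends only on the bytes of the line */
	size_t minimal_perfect_hash; /* dense id assigned by the classifier */
};

struct classifier_sketch {
	const struct line_sketch *first[SKETCH_BUCKETS]; /* one representative per slot */
	size_t count;                                    /* distinct lines seen so far */
};

/* FNV-1a, used here only as a stand-in; not the hash xdiff uses. */
static uint64_t hash_bytes(const uint8_t *p, size_t n)
{
	uint64_t h = 14695981039346656037ULL;
	for (size_t i = 0; i < n; i++) {
		h ^= p[i];
		h *= 1099511628211ULL;
	}
	return h;
}

/* Assumes fewer than SKETCH_BUCKETS distinct lines and that rec outlives cf. */
static void classify(struct classifier_sketch *cf, struct line_sketch *rec)
{
	size_t slot;

	rec->line_hash = hash_bytes(rec->ptr, rec->size);
	slot = (size_t)(rec->line_hash & (SKETCH_BUCKETS - 1));
	/* Linear probing: an equal hash is not trusted, the bytes are compared. */
	while (cf->first[slot]) {
		const struct line_sketch *seen = cf->first[slot];
		if (seen->line_hash == rec->line_hash &&
		    seen->size == rec->size &&
		    !memcmp(seen->ptr, rec->ptr, rec->size)) {
			rec->minimal_perfect_hash = seen->minimal_perfect_hash;
			return;
		}
		slot = (slot + 1) & (SKETCH_BUCKETS - 1);
	}
	rec->minimal_perfect_hash = cf->count++;
	cf->first[slot] = rec;
}
```

After classifying every line of both files, two lines are equal exactly when their minimal_perfect_hash values are equal, and the values run from 0 to count - 1 with no gaps, which is what makes them usable as array indices.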

On the Git mailing list, Junio C Hamano wrote (reply to this):

Ezekiel Newren <[email protected]> writes:

> I'm a bit surprised that nobody has commented on this patch. I thought
> that someone would have criticized the length of the name
> "minimal_perfect_hash" or asked me why I was splitting one field into
> two.

Sometimes there aren't enough round tuits to go around, and when
people have been too busy to review it, we see no comment, either
positive ones or negative ones.

> I don't see any reason why this patch series shouldn't move forward.

A patch series needs a positive reason to move forward;
unfortunately we cannot tell much from lack of negative comments.

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Mon, Oct 20, 2025 at 05:29:25PM -0600, Ezekiel Newren wrote:
> On Wed, Oct 15, 2025 at 3:18 PM Ezekiel Newren via GitGitGadget
> <[email protected]> wrote:
> >
> > From: Ezekiel Newren <[email protected]>
> >
> > The ha field is serving two different purposes, which makes the code
> > harder to read. At first glance it looks like many places assume
> > there could never be hash collisions between lines of the two input
> > files. In reality, line_hash is used together with xdl_recmatch() to
> > ensure correct comparisons of lines, even when collisions occur.
> >
> > To make this clearer, the old ha field has been split:
> >   * line_hash: The straightforward hash of a line, requiring no
> >     additional context.
> >   * minimal_perfect_hash: Not a new concept, but now a separate
> >     field. It comes from the classifier's general-purpose hash table,
> >     which assigns each line a unique and minimal hash across the two
> >     files.
> >
> > Signed-off-by: Ezekiel Newren <[email protected]>
> 
> I'm a bit surprised that nobody has commented on this patch. I thought
> that someone would have criticized the length of the name
> "minimal_perfect_hash" or asked me why I was splitting one field into
> two.

I actually appreciate the longer name. I'm not a fan of abbreviations
that are hard to understand myself. Sure, they are easier to type, but
in many cases they end up making the code way harder to understand if
you are not deeply familiar with it. There are of course exceptions to
this, but I don't really think that your patch falls into them.

Patrick

On the Git mailing list, Phillip Wood wrote (reply to this):

Hi Ezekiel

On 21/10/2025 00:29, Ezekiel Newren wrote:
> On Wed, Oct 15, 2025 at 3:18 PM Ezekiel Newren via GitGitGadget
> <[email protected]> wrote:
>>
>> From: Ezekiel Newren <[email protected]>
>>
>> The ha field is serving two different purposes, which makes the code
>> harder to read. At first glance it looks like many places assume
>> there could never be hash collisions between lines of the two input
>> files. In reality, line_hash is used together with xdl_recmatch() to
>> ensure correct comparisons of lines, even when collisions occur.
>>
>> To make this clearer, the old ha field has been split:
>>    * line_hash: The straightforward hash of a line, requiring no
>>      additional context.
>>    * minimal_perfect_hash: Not a new concept, but now a separate
>>      field. It comes from the classifier's general-purpose hash table,
>>      which assigns each line a unique and minimal hash across the two
>>      files.
>>
>> Signed-off-by: Ezekiel Newren <[email protected]>
>
> I'm a bit surprised that nobody has commented on this patch.

I've been off the list and I haven't caught up with this series yet.

> I thought
> that someone would have criticized the length of the name
> "minimal_perfect_hash" or asked me why I was splitting one field into
> two.

I think "perfect_hash" would be fine if we want a shorter name. More
importantly it would be helpful to explain why the two fields have
different types. I assume it is because the perfect_hash is used as an
array index and therefore size_t is a better match for Rust's usize than
uint64_t. How much more memory do we end up using by adding a second hash
member to the struct? If the aim is to show that only one of them is
used at a time then a union might be more appropriate but I doubt that
plays well with Rust.

I'll try and have a look at the other patches later this week. I think
the type changes are going to need careful review.

Thanks

Phillip

On the Git mailing list, Chris Torek wrote (reply to this):

On Tue, Oct 21, 2025 at 3:04 AM Phillip Wood <[email protected]> wrote:
...
> uint64_t. How much more memory do we end up using by adding second hash
> member to the struct?

As in any string-to-string algorithm of this sort, there's one per "symbol",
but in this case a "symbol" is a line in a file. So if files are M and N lines
long, there are M+N symbols. Take the difference of the size of the two
records and multiply by this.

Assuming "sane" input file sizes (under a million lines each) it's a few
megabytes maximum...

Chris
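
The arithmetic above can be written down as a one-line helper. This is only a sketch: extra_record_bytes() is a hypothetical name, and the per-record growth (delta) is whatever the two struct sizes differ by on a given platform, not a figure measured from this series.

```c
#include <stddef.h>

/*
 * Extra bytes needed when each per-line record grows by `delta` bytes:
 * one record per line, so (M + N) records for files of M and N lines.
 */
static size_t extra_record_bytes(size_t lines_a, size_t lines_b, size_t delta)
{
	return (lines_a + lines_b) * delta;
}
```

For example, assuming two files of 100,000 lines each and a growth of 8 bytes per record, the helper returns 1,600,000 bytes.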

On the Git mailing list, Ezekiel Newren wrote (reply to this):

On Tue, Oct 21, 2025 at 4:03 AM Phillip Wood <[email protected]> wrote:
>
> Hi Ezekiel
>
> On 21/10/2025 00:29, Ezekiel Newren wrote:
> > On Wed, Oct 15, 2025 at 3:18 PM Ezekiel Newren via GitGitGadget
> > <[email protected]> wrote:
> >>
> >> From: Ezekiel Newren <[email protected]>
> >>
> >> The ha field is serving two different purposes, which makes the code
> >> harder to read. At first glance it looks like many places assume
> >> there could never be hash collisions between lines of the two input
> >> files. In reality, line_hash is used together with xdl_recmatch() to
> >> ensure correct comparisons of lines, even when collisions occur.
> >>
> >> To make this clearer, the old ha field has been split:
> >>    * line_hash: The straightforward hash of a line, requiring no
> >>      additional context.
> >>    * minimal_perfect_hash: Not a new concept, but now a separate
> >>      field. It comes from the classifier's general-purpose hash table,
> >>      which assigns each line a unique and minimal hash across the two
> >>      files.
> >>
> >> Signed-off-by: Ezekiel Newren <[email protected]>
> >
> > I'm a bit surprised that nobody has commented on this patch.
>
> I've been off the list and I haven't caught up with this series yet.
>
> > I thought
> > that someone would have criticized the length of the name
> > "minimal_perfect_hash" or asked me why I was splitting one field into
> > two.
>
> I think "perfect_hash" would be fine if we want a shorter name. More
> importantly it would be helpful to explain why the two fields have
> different types. I assume it is because the perfect_hash is used as an
> array index and therefore size_t is a better match for rust's usize than
> uint64_t.

Your understanding is correct. line_hash is fixed width while
minimal_perfect_hash is meant to be used as an array index into
memory. I'll update my commit message to make this more clear.

> How much more memory do we end up using by adding second hash
> member to the struct? If the aim is to show that only one of them is
> used at a time then a union might be more appropriate but I doubt that
> plays well with rust.

xrecord_t used to carry a `next` pointer (see the v2.51.0 definition
below), so the struct ends up the same size. More importantly, I plan
on splitting minimal_perfect_hash out of xrecord_t into its own array.
I think the diff algorithms end up being a little bit faster with a
separate array because each element is only 8 bytes instead of 32.

In v2.51.0:
typedef struct s_xrecord {
       struct s_xrecord *next;
       char const *ptr;
       long size;
       unsigned long ha;
} xrecord_t;

This patch series:
typedef struct s_xrecord {
       uint8_t const *ptr;
       size_t size;
       uint64_t line_hash;
       size_t minimal_perfect_hash;
} xrecord_t;
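
The size comparison can be checked with a small sketch. The two structs below copy the field lists quoted above under hypothetical names (xrecord_v2_51 and xrecord_series) so that both can coexist in one file; the helper functions are also hypothetical. The exact numbers depend on the ABI: on a typical LP64 platform both structs would be expected to come to 32 bytes, but that is an assumption about the platform, not something the C standard guarantees.

```c
#include <stddef.h>
#include <stdint.h>

/* Layout quoted for v2.51.0, renamed for this sketch. */
struct xrecord_v2_51 {
	struct xrecord_v2_51 *next;
	char const *ptr;
	long size;
	unsigned long ha;
};

/* Layout quoted for this patch series, renamed for this sketch. */
struct xrecord_series {
	uint8_t const *ptr;
	size_t size;
	uint64_t line_hash;
	size_t minimal_perfect_hash;
};

static size_t old_record_size(void)
{
	return sizeof(struct xrecord_v2_51);
}

static size_t new_record_size(void)
{
	return sizeof(struct xrecord_series);
}
```

Printing both values on the platforms of interest is a quick way to confirm whether the "same size" observation holds there; a separate array of size_t for minimal_perfect_hash would cost sizeof(size_t) bytes per element instead.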

> I'll try and have a look at the other patches later this week. I think
> the type changes are going to need careful review.

I appreciate the careful review. I figured it would be best to limit
the scope of this patch series to type changes, so that it wasn't
bogged down by other stuff.

static int get_indent(xrecord_t *rec)
{
long i;
int ret = 0;

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Wed, Oct 15, 2025 at 09:18:14PM +0000, Ezekiel Newren via GitGitGadget wrote:
> diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
> index 6f3998ee54..411a8aa69f 100644
> --- a/xdiff/xdiffi.c
> +++ b/xdiff/xdiffi.c
> @@ -993,11 +993,11 @@ static void xdl_mark_ignorable_lines(xdchange_t *xscr, xdfenv_t *xe, long flags)
>  
>  		rec = &xe->xdf1.recs[xch->i1];
>  		for (i = 0; i < xch->chg1 && ignore; i++)
> -			ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
> +			ignore = xdl_blankline((const char *)rec[i].ptr, rec[i].size, flags);
>  
>  		rec = &xe->xdf2.recs[xch->i2];
>  		for (i = 0; i < xch->chg2 && ignore; i++)
> -			ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
> +			ignore = xdl_blankline((const char *)rec[i].ptr, rec[i].size, flags);
>  
>  		xch->ignore = ignore;
>  	}

Okay. Seemingly, we convert the structure itself, but we don't convert
any of the functions to accept a `uint8_t`. I guess you drew the line
here so that we don't have to also touch up dozens of function
signatures?

And how did you end up verifying that you added all casts? Does the
compiler flag those as warnings?

In any case, it might be nice to explain both of these details in the
commit message.

Patrick

On the Git mailing list, Ezekiel Newren wrote (reply to this):

On Tue, Oct 21, 2025 at 2:33 AM Patrick Steinhardt <[email protected]> wrote:
>
> On Wed, Oct 15, 2025 at 09:18:14PM +0000, Ezekiel Newren via GitGitGadget wrote:
> > diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
> > index 6f3998ee54..411a8aa69f 100644
> > --- a/xdiff/xdiffi.c
> > +++ b/xdiff/xdiffi.c
> > @@ -993,11 +993,11 @@ static void xdl_mark_ignorable_lines(xdchange_t *xscr, xdfenv_t *xe, long flags)
> >
> >               rec = &xe->xdf1.recs[xch->i1];
> >               for (i = 0; i < xch->chg1 && ignore; i++)
> > -                     ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
> > +                     ignore = xdl_blankline((const char *)rec[i].ptr, rec[i].size, flags);
> >
> >               rec = &xe->xdf2.recs[xch->i2];
> >               for (i = 0; i < xch->chg2 && ignore; i++)
> > -                     ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
> > +                     ignore = xdl_blankline((const char *)rec[i].ptr, rec[i].size, flags);
> >
> >               xch->ignore = ignore;
> >       }
>
> Okay. Seemingly, we convert the structure itself, but we don't convert
> any of the functions to accept a `uint8_t`. I guess you drew the line
> here so that we don't have to also touch up dozens of function
> signatures?

That is correct. I wanted to avoid _boiling the ocean_ just to change
the type of ptr.

> And how did you end up verifying that you added all casts? Does the
> compiler flag those as warnings?

I used CLion to search for all uses of that field and then added casts
where the types differ. Another way to do that is to run `make
DEVELOPER=1` and address all of the `uint8_t differs in signedness
from char` errors that are spat out.

> In any case, it might be nice to explain both of these details in the
> commit message.

I will update it.

Thanks.
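
The pattern discussed here, changing the struct field while leaving char-based function signatures alone, can be sketched as below. is_blank(), record_is_blank() and struct record_sketch are hypothetical stand-ins, not xdiff functions. Passing a uint8_t pointer where a char pointer is expected makes GCC and Clang report that the pointer targets differ in signedness (-Wpointer-sign), which a build that treats warnings as errors turns into a failure, so the cast marks every place where the two conventions meet.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper with a char-based signature, standing in for the
 * existing functions whose signatures were deliberately left unchanged.
 */
static bool is_blank(const char *line, size_t size)
{
	for (size_t i = 0; i < size; i++)
		if (line[i] != ' ' && line[i] != '\t' && line[i] != '\n')
			return false;
	return true;
}

struct record_sketch {
	uint8_t const *ptr;
	size_t size;
};

static bool record_is_blank(const struct record_sketch *rec)
{
	/* The explicit cast silences the signedness diagnostic at the call site. */
	return is_blank((const char *)rec->ptr, rec->size);
}
```

Removing the cast and rebuilding with warnings enabled is one way to find every call site that still needs attention.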

@gitgitgadget-git

User Patrick Steinhardt <[email protected]> has been added to the cc: list.

xecfg->find_func_priv = NULL;
}
}

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Wed, Oct 15, 2025 at 09:18:16PM +0000, Ezekiel Newren via GitGitGadget wrote:
> From: Ezekiel Newren <[email protected]>

This should have a commit message explaining what exactly you're doing
here.

Patrick

On the Git mailing list, Ezekiel Newren wrote (reply to this):

On Tue, Oct 21, 2025 at 2:33 AM Patrick Steinhardt <[email protected]> wrote:
>
> On Wed, Oct 15, 2025 at 09:18:16PM +0000, Ezekiel Newren via GitGitGadget wrote:
> > From: Ezekiel Newren <[email protected]>
>
> This should have a commit message explaining what exactly you're doing
> here.

I thought I did have a commit message justifying my changes. Maybe it
got deleted through a rebase. How about a message like:

Convert the function signature and body to use unambiguous types. char
is changed to uint8_t because this function processes bytes in memory.
unsigned long is changed to uint64_t so that the hash output is consistent across
platforms. `flags` was changed from long to uint64_t to ensure the
high order bits are not dropped on platforms that treat long as 32
bits.
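
The concern about high-order bits can be illustrated with a short sketch. SKETCH_FLAG_HIGH is a made-up flag above bit 31, not an actual xdiff flag, and a uint32_t is used to model a 32-bit long so that the example behaves the same on every platform.

```c
#include <stdint.h>

/* Hypothetical flag living above bit 31; not an actual xdiff flag. */
#define SKETCH_FLAG_HIGH (UINT64_C(1) << 40)

/* Models passing flags through a 32-bit parameter: the high bits are lost. */
static uint64_t through_32bit(uint64_t flags)
{
	uint32_t narrow = (uint32_t)flags;
	return narrow;
}

/* With a 64-bit parameter every bit survives the call. */
static uint64_t through_64bit(uint64_t flags)
{
	return flags;
}
```

Passing SKETCH_FLAG_HIGH | 1 through the 32-bit variant leaves only the low bit set, while the 64-bit variant returns the value unchanged.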

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Wed, Oct 22, 2025 at 03:20:32PM -0600, Ezekiel Newren wrote:
> On Tue, Oct 21, 2025 at 2:33 AM Patrick Steinhardt <[email protected]> wrote:
> >
> > On Wed, Oct 15, 2025 at 09:18:16PM +0000, Ezekiel Newren via GitGitGadget wrote:
> > > From: Ezekiel Newren <[email protected]>
> >
> > This should have a commit message explaining what exactly you're doing
> > here.
> 
> I thought I did have a commit message justifying my changes. Maybe it
> got deleted through a rebase. How about a message like:
> 
> Convert the function signature and body to use unambiguous types. char
> is changed to uint8_t because this function processes bytes in memory.
> unsigned long to uint64_t so that the hash output is consistent across
> platforms. `flags` was changed from long to uint64_t to ensure the
> high order bits are not dropped on platforms that treat long as 32
> bits.

Works for me, I guess. Thanks!

Patrick

} xrecord_t;

typedef struct s_xdfile {
xrecord_t *recs;

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Wed, Oct 15, 2025 at 09:18:20PM +0000, Ezekiel Newren via GitGitGadget wrote:
> From: Ezekiel Newren <[email protected]>
> 
> rindex describes an index offset which means it's an index into memory
> which should use size_t. dstart and dend will be deleted in a future
> patch series. Move them to the end to help avoid refactor conflicts.

In a patch like this I would appreciate some explanation why we can
change the type without adapting any of its users. So basically explain
why this refactoring is safe to do and won't cause any issues.

Patrick

On the Git mailing list, Ezekiel Newren wrote (reply to this):

On Tue, Oct 21, 2025 at 2:34 AM Patrick Steinhardt <[email protected]> wrote:
>
> On Wed, Oct 15, 2025 at 09:18:20PM +0000, Ezekiel Newren via GitGitGadget wrote:
> > From: Ezekiel Newren <[email protected]>
> >
> > rindex describes an index offset which means it's an index into memory
> > which should use size_t. dstart and dend will be deleted in a future
> > patch series. Move them to the end to help avoid refactor conflicts.
>
> In a patch like this I would appreciate some explanation why we can
> change the type without adapting any of its users. So basically explain
> why this refactoring is safe to do and won't cause any issues.

The values of rindex are only used in three places: get_hash(), which
was created in [1], and two places in xdl_recs_cmp(). All of them use
rindex as an index into another array directly, so there's no cascading
refactor impact. get_hash() was created precisely to reduce refactor
churn. How about a commit message like:

Changing the type of rindex from long to size_t has no cascading
refactor impact because it is only ever used to directly index other
arrays.

[1] create get_hash()
https://lore.kernel.org/git/637d1032abbd33b7673d3c101267816fbf1a343c.1758926520.git.gitgitgadget@gmail.com/
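
To make the "only ever used to directly index other arrays" argument concrete, here is a hedged sketch. hash_at(), struct file_sketch and struct record_sketch are hypothetical names written in the spirit of get_hash(); the real function body may differ. Because the value read from the index array is consumed immediately as an array subscript, switching its element type from long to size_t does not leak into any other declaration.

```c
#include <stddef.h>

struct record_sketch {
	size_t minimal_perfect_hash;
};

struct file_sketch {
	struct record_sketch *recs;
	size_t *reference_index; /* each entry is a position in recs (assumed) */
	size_t nreff;            /* number of valid entries in reference_index */
};

/* Hypothetical accessor: the index value never escapes this expression. */
static size_t hash_at(const struct file_sketch *xdf, size_t i)
{
	return xdf->recs[xdf->reference_index[i]].minimal_perfect_hash;
}
```

Callers only see the hash that comes back, so they are unaffected by the element type of the index array.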

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Wed, Oct 22, 2025 at 04:14:42PM -0600, Ezekiel Newren wrote:
> On Tue, Oct 21, 2025 at 2:34 AM Patrick Steinhardt <[email protected]> wrote:
> >
> > On Wed, Oct 15, 2025 at 09:18:20PM +0000, Ezekiel Newren via GitGitGadget wrote:
> > > From: Ezekiel Newren <[email protected]>
> > >
> > > rindex describes an index offset which means it's an index into memory
> > > which should use size_t. dstart and dend will be deleted in a future
> > > patch series. Move them to the end to help avoid refactor conflicts.
> >
> > In a patch like this I would appreciate some explanation why we can
> > change the type without adapting any of its users. So basically explain
> > why this refactoring is safe to do and won't cause any issues.
> 
> The values of rindex are only used in three places: get_hash(), which
> was created in [1], and two places in xdl_recs_cmp(). All of them use
> rindex as an index into another array directly, so there's no cascading
> refactor impact. get_hash() was created precisely to reduce refactor
> churn. How about a commit message like:
> 
> Changing the type of rindex from long to size_t has no cascading
> refactor impact because it is only ever used to directly index other
> arrays.

Sounds good to me, thanks!

Patrick

@gitgitgadget-git

User Phillip Wood <[email protected]> has been added to the cc: list.

@gitgitgadget-git

User Chris Torek <[email protected]> has been added to the cc: list.

unsigned long ha;
} xrecord_t;

typedef struct s_xdfile {

On the Git mailing list, Phillip Wood wrote (reply to this):

On 15/10/2025 22:18, Ezekiel Newren via GitGitGadget wrote:
> From: Ezekiel Newren <[email protected]>
>
> ssize_t is appropriate for dstart and dend because they both describe
> positive or negative offsets relative to a pointer.

Isn't ptrdiff_t the appropriate type for an offset to a pointer? ssize_t
is not guaranteed to be the same width as size_t (this has caused
problems in the past[1]) and is only defined by POSIX, not the C standard.

Thanks

Phillip

[1] https://lore.kernel.org/git/[email protected]/
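
As a small illustration of the point about ptrdiff_t: it is the type ISO C gives to the result of subtracting two pointers and it comes from <stddef.h>, whereas ssize_t is declared by POSIX in <sys/types.h>. The helper below is a hypothetical sketch, not code from xdiff.

```c
#include <stddef.h>

/*
 * Signed distance, in elements, from `base` to `p`.
 * Both pointers must point into (or one past the end of) the same array.
 */
static ptrdiff_t offset_from(const char *base, const char *p)
{
	return p - base;
}
```

The result is negative when p lies before base, which is the "positive or negative offset" case the commit message describes.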

> A future patch will move these fields to a different struct. Moving
> them to the end of xdfile_t now, means the field order of xdfile_t will
> be disturbed less.
>
> Signed-off-by: Ezekiel Newren <[email protected]>
> ---
>   xdiff/xtypes.h | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
> index f145abba3e..3514bb1684 100644
> --- a/xdiff/xtypes.h
> +++ b/xdiff/xtypes.h
> @@ -47,10 +47,10 @@ typedef struct s_xrecord {
>   typedef struct s_xdfile {
>   	xrecord_t *recs;
>   	long nrec;
> -	long dstart, dend;
>   	bool *changed;
>   	long *rindex;
>   	long nreff;
> +	ssize_t dstart, dend;
>   } xdfile_t;
>
>   typedef struct s_xdfenv {

On the Git mailing list, Junio C Hamano wrote (reply to this):

Phillip Wood <[email protected]> writes:

> On 15/10/2025 22:18, Ezekiel Newren via GitGitGadget wrote:
>> From: Ezekiel Newren <[email protected]>
>> 
>> ssize_t is appropriate for dstart and dend because they both describe
>> positive or negative offsets relative to a pointer.
>
> Isn't ptrdiff_t the appropriate type for an offset to a pointer? ssize_t 
> is not guaranteed to be the same width as size_t (this has caused 
> problems in the past[1]) and is only defined by POSIX, not the C standard.
>
> Thanks
>
> Phillip
>
> [1] https://lore.kernel.org/git/[email protected]/

Thanks for bringing up a very good point.

We often consider that a function that yields what we would normally
put in a size_t variable, when we _know_ that the return value would
not be so big to exceed half the range of size_t, can instead return
ssize_t and use the negative half of the range to signal error
conditions, but as the cited incident shows, that is an easy
mistake to make.

On the Git mailing list, Ezekiel Newren wrote (reply to this):

On Tue, Oct 21, 2025 at 11:18 AM Junio C Hamano <[email protected]> wrote:
>
> Phillip Wood <[email protected]> writes:
>
> > On 15/10/2025 22:18, Ezekiel Newren via GitGitGadget wrote:
> >> From: Ezekiel Newren <[email protected]>
> >>
> >> ssize_t is appropriate for dstart and dend because they both describe
> >> positive or negative offsets relative to a pointer.
> >
> > Isn't ptrdiff_t the appropriate type for an offset to a pointer? ssize_t
> > is not guaranteed to be the same width as size_t (this has caused
> > problems in the past[1]) and is only defined by POSIX, not the C standard.
> >
> > Thanks
> >
> > Phillip
> >
> > [1] https://lore.kernel.org/git/[email protected]/
>
> Thanks for bringing up a very good point.
>
> We often consider that a function that yields what we would normally
> put in a size_t variable, when we _know_ that the return value would
> not be so big to exceed half the range of size_t, can instead return
> ssize_t and use the negative half of the range to signal error
> conditions, but as the cited incident shows, that is an easy
> mistake to make.

In my compat/rust_types.h file [1] (which was dropped) I defined isize
using ptrdiff_t rather than ssize_t. Maybe that file should be revived
so that we don't have confusion in code reviews when structs are being
expressly converted for the purpose of Rust FFI? I'd really like to
bring that file back so that everyone has a clear reference for how C
types map to Rust, but no one seemed to like it except me. Maybe it
should be an adoc file rather than a header?

[1] compat/rust_types.h
https://lore.kernel.org/git/2a7d5b05c18d4a96f1905b7043d47c62d367cd2a.1757274320.git.gitgitgadget@gmail.com/
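
For readers who have not seen the dropped header, a mapping of this kind might look roughly like the sketch below. This is NOT the contents of compat/rust_types.h; the alias names simply follow Rust's primitive type names, the choice of ptrdiff_t for isize follows the description above, and _Static_assert assumes a C11 compiler.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical aliases; not the contents of the dropped compat/rust_types.h. */
typedef uint8_t   u8;
typedef uint64_t  u64;
typedef int64_t   i64;
typedef size_t    usize;
typedef ptrdiff_t isize;

_Static_assert(sizeof(u8) == 1, "u8 must be one byte");
_Static_assert(sizeof(u64) == 8 && sizeof(i64) == 8,
	       "64-bit types must be 8 bytes");
/* Not guaranteed by ISO C; the assert documents an assumption about the ABI. */
_Static_assert(sizeof(isize) == sizeof(usize),
	       "isize and usize are assumed to have the same width");
```

Whether such names live in a header or the mapping is only described in a documentation file is exactly the open question in this part of the thread.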

On the Git mailing list, Junio C Hamano wrote (reply to this):

Ezekiel Newren <[email protected]> writes:

> In my compat/rust_types.h file (which was dropped) I defined isize
> using ptrdiff_t rather than ssize_t. Maybe that file should be revived
> so that we don't have confusion in code reviews when structs are being
> expressly converted for the purpose of Rust FFI? I'd really like to
> bring that file back so that everyone has a clear reference for how C
> types map to Rust, but no one seemed to like it except me. Maybe it
> should be an adoc file rather than a header?

I may be mistaken, but I thought that the latest agreement was to
use conceptually the "same" type in each language, have each
language call that type in its native way, and if needed convert at
the FFI boundary.  So if we agree to use, for example, 64-bit signed
integer type for counting things plus returning error conditions via
negative values, maybe C-side can agree to use i64 for it, without
having to worry about how that thing is called in Rust side.

I am not sure in what way <compat/rust_types.h> should be used, and
perhaps a documentation file may be sufficient as you suggest, but
in any case, I agree that it should be made clear to everybody what
C-types are to be mapped to what Rust types and vice versa, and if
some C-types have no corresponding Rust type in that mapping, or if
some Rust types have no corresponding C-type, that type needs to be
converted before they reach the FFI boundary.
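As an illustration of "convert before the FFI boundary", the following sketch checks that a C-side count fits into a 64-bit signed integer before it is handed over. The helper name `count_to_i64` is hypothetical and does not exist in Git; it only shows the shape such a conversion could take.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a size_t count to int64_t before it
 * crosses the FFI boundary. Returns 0 and stores the value in *out on
 * success, or -1 if the count does not fit.
 */
static int count_to_i64(size_t n, int64_t *out)
{
	if (n > (uintmax_t)INT64_MAX)
		return -1;
	*out = (int64_t)n;
	return 0;
}
```

On platforms where size_t is 32 bits wide the range check can never fail, which is harmless; the point is that the narrowing is explicit and checked on the C side.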

On the Git mailing list, Ezekiel Newren wrote (reply to this):

On Wed, Oct 22, 2025 at 3:38 PM Junio C Hamano <[email protected]> wrote:
>
> Ezekiel Newren <[email protected]> writes:
>
> > In my compat/rust_types.h file (which was dropped) I defined isize
> > using ptrdiff_t rather than ssize_t. Maybe that file should be revived
> > so that we don't have confusion in code reviews when structs are being
> > expressly converted for the purpose of Rust FFI? I'd really like to
> > bring that file back so that everyone has a clear reference for how C
> > types map to Rust, but no one seemed to like it except me. Maybe it
> > should be an adoc file rather than a header?
>
> I may be mistaken, but I thought that the latest agreement was to
> use conceptually the "same" type in each language, have each
> language call that type in its native way, and if needed convert at
> the FFI boundary.  So if we agree to use, for example, 64-bit signed
> integer type for counting things plus returning error conditions via
> negative values, maybe C-side can agree to use i64 for it, without
> having to worry about how that thing is called in Rust side.

Your understanding is correct. Would
Documentation/unambiguous_types.adoc be an appropriate place for this
documentation?

> I am not sure in what way <compat/rust_types.h> should be used, and
> perhaps a documentation file may be sufficient as you suggest, but
> in any case, I agree that it should be made clear to everybody what
> C-types are to be mapped to what Rust types and vice versa, and if
> some C-types have no corresponding Rust type in that mapping, or if
> some Rust types have no corresponding C-type, that type needs to be
> converted before they reach the FFI boundary.

Alright. I guess I'll drop the idea of compat/rust_types.h permanently.

static int get_indent(xrecord_t *rec)
{
	long i;
	int ret = 0;

On the Git mailing list, Phillip Wood wrote (reply to this):

On 15/10/2025 22:18, Ezekiel Newren via GitGitGadget wrote:
> From: Ezekiel Newren <[email protected]>
>
> Rust uses u8 to refer to bytes in memory. Since xrecord_t.ptr is also
> referring to bytes in memory, rather than unicode code points, use
> uint8_t instead of char.

In C, "char" never refers to a Unicode code point, so I don't follow the reasoning here. Isn't the reason you want to change from "char" to "uint8_t" to match Rust? Given that "char" and "uint8_t" are the same width, why can't we use "char" in the C struct and "u8" in the Rust struct, as the two structs would still have the same layout?

I agree with Patrick's comments on this patch - it would be nice to know how you decided where to add casts. Given that rust is going to be optional for at least a year we should take care to leave the C code in good shape with a minimum number of casts.

Thanks

Phillip

> Signed-off-by: Ezekiel Newren <[email protected]>
> ---
>   xdiff/xdiffi.c    |  8 ++++----
>   xdiff/xemit.c     |  6 +++---
>   xdiff/xmerge.c    | 14 +++++++-------
>   xdiff/xpatience.c |  2 +-
>   xdiff/xprepare.c  |  8 ++++----
>   xdiff/xtypes.h    |  2 +-
>   xdiff/xutils.c    |  4 ++--
>   7 files changed, 22 insertions(+), 22 deletions(-)
>
> diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
> index 6f3998ee54..411a8aa69f 100644
> --- a/xdiff/xdiffi.c
> +++ b/xdiff/xdiffi.c
> @@ -407,7 +407,7 @@ static int get_indent(xrecord_t *rec)
>   	int ret = 0;
>
>   	for (i = 0; i < rec->size; i++) {
> -		char c = rec->ptr[i];
> +		uint8_t c = rec->ptr[i];
>
>   		if (!XDL_ISSPACE(c))
>   			return ret;
> @@ -993,11 +993,11 @@ static void xdl_mark_ignorable_lines(xdchange_t *xscr, xdfenv_t *xe, long flags)
>
>   		rec = &xe->xdf1.recs[xch->i1];
>   		for (i = 0; i < xch->chg1 && ignore; i++)
> -			ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
> +			ignore = xdl_blankline((const char *)rec[i].ptr, rec[i].size, flags);
>
>   		rec = &xe->xdf2.recs[xch->i2];
>   		for (i = 0; i < xch->chg2 && ignore; i++)
> -			ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
> +			ignore = xdl_blankline((const char *)rec[i].ptr, rec[i].size, flags);
>
>   		xch->ignore = ignore;
>   	}
> @@ -1008,7 +1008,7 @@ static int record_matches_regex(xrecord_t *rec, xpparam_t const *xpp) {
>   	size_t i;
>
>   	for (i = 0; i < xpp->ignore_regex_nr; i++)
> -		if (!regexec_buf(xpp->ignore_regex[i], rec->ptr, rec->size, 1,
> +		if (!regexec_buf(xpp->ignore_regex[i], (const char *)rec->ptr, rec->size, 1,
>   				 &regmatch, 0))
>   			return 1;
>
> diff --git a/xdiff/xemit.c b/xdiff/xemit.c
> index b2f1f30cd3..ead930088a 100644
> --- a/xdiff/xemit.c
> +++ b/xdiff/xemit.c
> @@ -27,7 +27,7 @@ static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *
>   {
>   	xrecord_t *rec = &xdf->recs[ri];
>
> -	if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0)
> +	if (xdl_emit_diffrec((char const *)rec->ptr, rec->size, pre, strlen(pre), ecb) < 0)
>   		return -1;
>
>   	return 0;
> @@ -113,8 +113,8 @@ static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
>   	xrecord_t *rec = &xdf->recs[ri];
>
>   	if (!xecfg->find_func)
> -		return def_ff(rec->ptr, rec->size, buf, sz);
> -	return xecfg->find_func(rec->ptr, rec->size, buf, sz, xecfg->find_func_priv);
> +		return def_ff((const char *)rec->ptr, rec->size, buf, sz);
> +	return xecfg->find_func((const char *)rec->ptr, rec->size, buf, sz, xecfg->find_func_priv);
>   }
>
>   static int is_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri)
> diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
> index fd600cbb5d..75cb3e76a2 100644
> --- a/xdiff/xmerge.c
> +++ b/xdiff/xmerge.c
> @@ -101,8 +101,8 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
>   	xrecord_t *rec2 = xe2->xdf2.recs + i2;
>
>   	for (i = 0; i < line_count; i++) {
> -		int result = xdl_recmatch(rec1[i].ptr, rec1[i].size,
> -			rec2[i].ptr, rec2[i].size, flags);
> +		int result = xdl_recmatch((const char *)rec1[i].ptr, rec1[i].size,
> +			(const char *)rec2[i].ptr, rec2[i].size, flags);
>   		if (!result)
>   			return -1;
>   	}
> @@ -324,8 +324,8 @@ static int xdl_fill_merge_buffer(xdfenv_t *xe1, const char *name1,
>
>   static int recmatch(xrecord_t *rec1, xrecord_t *rec2, unsigned long flags)
>   {
> -	return xdl_recmatch(rec1->ptr, rec1->size,
> -			    rec2->ptr, rec2->size, flags);
> +	return xdl_recmatch((const char *)rec1->ptr, rec1->size,
> +			    (const char *)rec2->ptr, rec2->size, flags);
>   }
>
>   /*
> @@ -382,10 +382,10 @@ static int xdl_refine_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
>   		 * we have a very simple mmfile structure.
>   		 */
>   		t1.ptr = (char *)xe1->xdf2.recs[m->i1].ptr;
> -		t1.size = xe1->xdf2.recs[m->i1 + m->chg1 - 1].ptr
> +		t1.size = (char *)xe1->xdf2.recs[m->i1 + m->chg1 - 1].ptr
>   			+ xe1->xdf2.recs[m->i1 + m->chg1 - 1].size - t1.ptr;
>   		t2.ptr = (char *)xe2->xdf2.recs[m->i2].ptr;
> -		t2.size = xe2->xdf2.recs[m->i2 + m->chg2 - 1].ptr
> +		t2.size = (char *)xe2->xdf2.recs[m->i2 + m->chg2 - 1].ptr
>   			+ xe2->xdf2.recs[m->i2 + m->chg2 - 1].size - t2.ptr;
>   		if (xdl_do_diff(&t1, &t2, xpp, &xe) < 0)
>   			return -1;
> @@ -440,7 +440,7 @@ static int line_contains_alnum(const char *ptr, long size)
>   static int lines_contain_alnum(xdfenv_t *xe, int i, int chg)
>   {
>   	for (; chg; chg--, i++)
> -		if (line_contains_alnum(xe->xdf2.recs[i].ptr,
> +		if (line_contains_alnum((const char *)xe->xdf2.recs[i].ptr,
>   				xe->xdf2.recs[i].size))
>   			return 1;
>   	return 0;
> diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
> index 669b653580..bb61354f22 100644
> --- a/xdiff/xpatience.c
> +++ b/xdiff/xpatience.c
> @@ -121,7 +121,7 @@ static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
>   		return;
>   	map->entries[index].line1 = line;
>   	map->entries[index].hash = record->ha;
> -	map->entries[index].anchor = is_anchor(xpp, map->env->xdf1.recs[line - 1].ptr);
> +	map->entries[index].anchor = is_anchor(xpp, (const char *)map->env->xdf1.recs[line - 1].ptr);
>   	if (!map->first)
>   		map->first = map->entries + index;
>   	if (map->last) {
> diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
> index 192334f1b7..4cb18b2b88 100644
> --- a/xdiff/xprepare.c
> +++ b/xdiff/xprepare.c
> @@ -99,8 +99,8 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
>   	hi = (long) XDL_HASHLONG(rec->ha, cf->hbits);
>   	for (rcrec = cf->rchash[hi]; rcrec; rcrec = rcrec->next)
>   		if (rcrec->rec.ha == rec->ha &&
> -				xdl_recmatch(rcrec->rec.ptr, rcrec->rec.size,
> -					rec->ptr, rec->size, cf->flags))
> +				xdl_recmatch((const char *)rcrec->rec.ptr, rcrec->rec.size,
> +					(const char *)rec->ptr, rec->size, cf->flags))
>   			break;
>
>   	if (!rcrec) {
> @@ -156,8 +156,8 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
>   			if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
>   				goto abort;
>   			crec = &xdf->recs[xdf->nrec++];
> -			crec->ptr = prev;
> -			crec->size = (long) (cur - prev);
> +			crec->ptr = (uint8_t const *)prev;
> +			crec->size =(long) ( cur - prev);
>   			crec->ha = hav;
>   			if (xdl_classify_record(pass, cf, crec) < 0)
>   				goto abort;
> diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
> index 3514bb1684..57983627f5 100644
> --- a/xdiff/xtypes.h
> +++ b/xdiff/xtypes.h
> @@ -39,7 +39,7 @@ typedef struct s_chastore {
>   } chastore_t;
>
>   typedef struct s_xrecord {
> -	char const *ptr;
> +	uint8_t const *ptr;
>   	long size;
>   	unsigned long ha;
>   } xrecord_t;
> diff --git a/xdiff/xutils.c b/xdiff/xutils.c
> index 447e66c719..7be063bfb6 100644
> --- a/xdiff/xutils.c
> +++ b/xdiff/xutils.c
> @@ -465,10 +465,10 @@ int xdl_fall_back_diff(xdfenv_t *diff_env, xpparam_t const *xpp,
>   	xdfenv_t env;
>
>   	subfile1.ptr = (char *)diff_env->xdf1.recs[line1 - 1].ptr;
> -	subfile1.size = diff_env->xdf1.recs[line1 + count1 - 2].ptr +
> +	subfile1.size = (char *)diff_env->xdf1.recs[line1 + count1 - 2].ptr +
>   		diff_env->xdf1.recs[line1 + count1 - 2].size - subfile1.ptr;
>   	subfile2.ptr = (char *)diff_env->xdf2.recs[line2 - 1].ptr;
> -	subfile2.size = diff_env->xdf2.recs[line2 + count2 - 2].ptr +
> +	subfile2.size = (char *)diff_env->xdf2.recs[line2 + count2 - 2].ptr +
>   		diff_env->xdf2.recs[line2 + count2 - 2].size - subfile2.ptr;
>   	if (xdl_do_diff(&subfile1, &subfile2, xpp, &env) < 0)
>   		return -1;

On the Git mailing list, Junio C Hamano wrote (reply to this):

Phillip Wood <[email protected]> writes:

> In C "char" never refers to a unicode code point so I don't follow the
> reasoning here. Isn't the reason you want to change from "char" to 
> "uint8_t" to match rust? Given "char" and "uint8_t" are the same width 
> why can't we use "char" in the C struct and "u8" in the rust struct as 
> the two structs would still have the same layout?

And forcing u8 makes sure both sides of the ffi agree on the
signedness (C "char"'s signedness is implementation defined),
which is a good thing.
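A small sketch of the signedness point, for readers who want to see it in code: the same byte widens to a different integer depending on whether plain char is signed on the platform, while uint8_t behaves the same everywhere. The function names are illustrative only.

```c
#include <limits.h>
#include <stdint.h>

/* 1 if plain char is a signed type on this implementation, 0 otherwise. */
static int plain_char_is_signed(void)
{
	return CHAR_MIN < 0;
}

/*
 * Widen the byte 0x80 through plain char. The result is negative where
 * char is signed and positive where it is unsigned.
 */
static int widen_via_char(void)
{
	char c = (char)0x80;
	return c;
}

/* Widening the same byte through uint8_t gives 128 on every platform. */
static int widen_via_u8(void)
{
	uint8_t b = 0x80;
	return b;
}
```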

I 100% agree that being honest about the motivation to sell this
change would be a good thing to do here.  I do not think "in this
series, I want to match the types used at the interface to be of
Rust's" is a position to be ashamed of ;-)

> I agree with Patrick's comments on this patch - it would be nice to know 
> how you decided where to add casts. Given that rust is going to be 
> optional for at least a year we should take care to leave the C code in 
> good shape with a minimum number of casts.

Thanks.

On the Git mailing list, Phillip Wood wrote (reply to this):

On 21/10/2025 19:15, Junio C Hamano wrote:
> Phillip Wood <[email protected]> writes:
>
>> In C "char" never refers to a unicode code point so I don't follow the
>> reasoning here. Isn't the reason you want to change from "char" to
>> "uint8_t" to match rust? Given "char" and "uint8_t" are the same width
>> why can't we use "char" in the C struct and "u8" in the rust struct as
>> the two structs would still have the same layout?
>
> And forcing u8 makes sure both sides of the ffi agree on the
> signedness (C "char"'s signedness is implementation defined),
> which is a good thing.

That's true and ignoring the signedness would be hacky but I'm not sure it matters in practice. Both C and rust would use the same bit patterns for "abc" and b"abc\0" and in general C plays fast and loose with the signedness of variables all over the place. The trade off for respecting the signedness is that we either have casts all over the place or massive churn converting the rest of the code to use uint8_t. This problem isn't limited to xdiff, it will be true wherever we share bytestrings such as the contents of objects between C and rust as we tend to use char rather than uint8_t in our code.

Thanks

Phillip

> I 100% agree that being honest about the motivation to sell this
> change would be a good thing to do here.  I do not think "in this
> series, I want to match the types used at the interface to be of
> Rust's" is a position to be ashamed of ;-)
>
>> I agree with Patrick's comments on this patch - it would be nice to know
>> how you decided where to add casts. Given that rust is going to be
>> optional for at least a year we should take care to leave the C code in
>> good shape with a minimum number of casts.
>
> Thanks.

On the Git mailing list, Ezekiel Newren wrote (reply to this):

On Wed, Oct 22, 2025 at 7:27 AM Phillip Wood <[email protected]> wrote:
> > I 100% agree that being honest about the motivation to sell this
> > change would be a good thing to do here.  I do not think "in this
> > series, I want to match the types used at the interface to be of
> > Rust's" is a position to be ashamed of ;-)
> >
> >> I agree with Patrick's comments on this patch - it would be nice to know
> >> how you decided where to add casts. Given that rust is going to be
> >> optional for at least a year we should take care to leave the C code in
> >> good shape with a minimum number of casts.
> >
> > Thanks.

I'm not arguing that uint8_t should be used everywhere in Git, only
that it is used everywhere in xdiff. xrecord_t and xdfile_t are
fundamental to how xdiff passes data around and they need to be
transparent to both sides. I'm trying to leave the rest of the data
structures alone in order to avoid refactor churn. Refactoring C to
use unambiguous types, outside of xdiff, is outside the scope of this
patch series.

Another problem with using char instead of uint8_t is that tools like
cbindgen and bindgen don't translate char to u8. Bindgen will see char
and will produce std::ffi::c_char on the Rust side, see [1] for why
that's a problem. The other way around is a problem too. When cbindgen
sees u8 it will generate uint8_t on the C side and then `make
DEVELOPER=1` won't compile because uint8_t and char differ in
signedness.
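For illustration, this is roughly what the "differ in signedness" problem looks like at a call site. With GCC and Clang the diagnostic is typically reported under -Wpointer-sign, and a build that turns warnings into errors then fails. The helper `byte_len` is a made-up example, not xdiff code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper taking bytes as uint8_t. Passing `bytes` straight
 * to strlen() would draw a pointer-signedness warning, so the cast is
 * made explicit at the point where a char-based API is called.
 */
static size_t byte_len(const uint8_t *bytes)
{
	return strlen((const char *)bytes);
}
```

This is the same pattern as the `(const char *)rec->ptr` casts in the patch: the struct field stays uint8_t and the cast appears only where a char-based function is called.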

[1] Problems with C types
https://lore.kernel.org/git/CAH=[email protected]/

@gitgitgadget-git

On the Git mailing list, Phillip Wood wrote (reply to this):

Hi Ezekiel

On 15/10/2025 22:18, Ezekiel Newren via GitGitGadget wrote:
> Maintainer note: This patch series builds on top of en/xdiff-cleanup and
> am/xdiff-hash-tweak (both of which are now in master).
>
> The primary goal of this patch series is to convert every field's type in
> xrecord_t and xdfile_t to be unambiguous, in preparation to make it more
> Rust FFI friendly. Additionally the ha field in xrecord_t is split into
> line_hash and minimal_perfect hash.

Given that this series changes the types of all the "long" struct members to "size_t" I was surprised to see that it adds so many "(long)" casts. At the end of this series there are 38 lines in xdiff/ that contain "(long)" compared to just 4 in master. I had expected that as we'd converted all the members to "size_t" there would be no need to keep using "long" in the code. As rust is going to be optional for quite a while I think we should clean up the C code to avoid casting between "long" and "size_t"

Thanks

Phillip

> The order of some of the fields has changed as called out by the commit
> messages.
>
> Before:
>
> typedef struct s_xrecord {
> 	char const *ptr;
> 	long size;
> 	unsigned long ha;
> } xrecord_t;
>
> typedef struct s_xdfile {
> 	xrecord_t *recs;
> 	long nrec;
> 	long dstart, dend;
> 	bool *changed;
> 	long *rindex;
> 	long nreff;
> } xdfile_t;
>
>
> After part 2
>
> typedef struct s_xrecord {
> 	uint8_t const *ptr;
> 	size_t size;
> 	uint64_t line_hash;
> 	size_t minimal_perfect_hash;
> } xrecord_t;
>
> typedef struct s_xdfile {
> 	xrecord_t *recs;
> 	size_t nrec;
> 	bool *changed;
> 	size_t *reference_index;
> 	size_t nreff;
> 	ssize_t dstart, dend;
> } xdfile_t;
>
>
> Ezekiel Newren (9):
>    xdiff: use ssize_t for dstart/dend, make them last in xdfile_t
>    xdiff: make xrecord_t.ptr a uint8_t instead of char
>    xdiff: use size_t for xrecord_t.size
>    xdiff: use unambiguous types in xdl_hash_record()
>    xdiff: split xrecord_t.ha into line_hash and minimal_perfect_hash
>    xdiff: make xdfile_t.nrec a size_t instead of long
>    xdiff: make xdfile_t.nreff a size_t instead of long
>    xdiff: change rindex from long to size_t in xdfile_t
>    xdiff: rename rindex -> reference_index
>
>   xdiff-interface.c  |  2 +-
>   xdiff/xdiffi.c     | 29 +++++++++++------------
>   xdiff/xemit.c      | 28 +++++++++++-----------
>   xdiff/xhistogram.c |  4 ++--
>   xdiff/xmerge.c     | 30 ++++++++++++------------
>   xdiff/xpatience.c  | 14 +++++------
>   xdiff/xprepare.c   | 58 +++++++++++++++++++++++-----------------------
>   xdiff/xtypes.h     | 15 ++++++------
>   xdiff/xutils.c     | 32 ++++++++++++-------------
>   xdiff/xutils.h     |  6 ++---
>   10 files changed, 109 insertions(+), 109 deletions(-)
>
>
> base-commit: 143f58ef7535f8f8a80d810768a18bdf3807de26
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2070%2Fezekielnewren%2Fxdiff_cleanup_part2-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2070/ezekielnewren/xdiff_cleanup_part2-v1
> Pull-Request: https://github.com/git/git/pull/2070

@gitgitgadget-git

On the Git mailing list, Junio C Hamano wrote (reply to this):

Phillip Wood <[email protected]> writes:

> Given that this series changes the types of all the "long" struct 
> members to "size_t" I was surprised to see that it adds so many "(long)" 
> casts. At the end of this series there are 38 lines in xdiff/ that 
> contain "(long)" compared to just 4 in master. I had expected that as 
> we'd converted all the members to "size_t" there would be no need to 
> keep using "long" in the code. As rust is going to be optional for quite 
> a while I think we should clean up the C code to avoid casting between 
> "long" and "size_t"

Either we cast here or have existing code that used to use long to
use another type, that needs to be done carefully as we would be
moving code that used signed type to now use unsigned.  While I
agree with you in principle that we shouldn't try to interface
between code pieces with impedance mismatch (for which the need to
cast is an indication), we'd need to draw a line somewhere.
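One concrete example of why moving from a signed to an unsigned type needs care is a loop that counts down. The sketch below is not taken from xdiff; it shows a hypothetical helper written so that the loop still terminates when the index is a size_t.

```c
#include <stddef.h>

/*
 * Return the index of the last non-zero byte in buf, or n if there is
 * none. With a signed index one might write
 *     for (i = n - 1; i >= 0; i--)
 * but with size_t that condition is always true, so the decrement is
 * folded into the loop test instead.
 */
static size_t last_nonzero(const unsigned char *buf, size_t n)
{
	size_t i;

	for (i = n; i-- > 0; )
		if (buf[i])
			return i;
	return n;
}
```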

Thanks.

@gitgitgadget-git

This patch series was integrated into seen via 73401b7.

size_t is the appropriate type because size is describing the number of
elements, bytes in this case, in memory.

Signed-off-by: Ezekiel Newren <[email protected]>

Convert the function signature and body to use unambiguous types. char
is changed to uint8_t because this function processes bytes in memory.
unsigned long is changed to uint64_t so that the hash output is
consistent across platforms. `flags` is changed from long to uint64_t to
ensure the high-order bits are not dropped on platforms that treat long
as 32 bits.
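The high-order-bit concern can be sketched as follows. `TOY_HIGH_FLAG` is a hypothetical flag above bit 31 (it is not an existing xdiff flag), and uint32_t stands in for a platform where long is 32 bits wide.

```c
#include <stdint.h>

/* Hypothetical flag that lives above bit 31; not an existing xdiff flag. */
#define TOY_HIGH_FLAG (UINT64_C(1) << 40)

/* With a 64-bit flags word the high flag is preserved. */
static int flag_survives_u64(uint64_t flags)
{
	return (flags & TOY_HIGH_FLAG) != 0;
}

/* Passing the same flags through a 32-bit type silently drops it. */
static int flag_survives_32(uint64_t flags)
{
	uint32_t narrow = (uint32_t)flags; /* stand-in for a 32-bit long */

	return ((uint64_t)narrow & TOY_HIGH_FLAG) != 0;
}
```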

Signed-off-by: Ezekiel Newren <[email protected]>

The ha field is serving two different purposes, which makes the code
harder to read. At first glance it looks like many places assume
there could never be hash collisions between lines of the two input
files. In reality, line_hash is used together with xdl_recmatch() to
ensure correct comparisons of lines, even when collisions occur.

To make this clearer, the old ha field has been split:
  * line_hash: The straightforward hash of a line, requiring no
    additional context.
  * minimal_perfect_hash: Not a new concept, but now a separate
    field. It comes from the classifier's general-purpose hash table,
    which assigns each line a unique and minimal hash across the two
    files.
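To illustrate the difference between the two fields, here is a toy sketch, not xdiff's implementation. It assumes a simplified record type `toy_record`, uses FNV-1a as a stand-in hash, and replaces the classifier's hash table with a linear scan. The line hash only narrows candidates; a byte comparison (the role xdl_recmatch() plays) decides equality, and each distinct line then gets a small dense ID.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for xrecord_t; the layout is illustrative only. */
struct toy_record {
	const uint8_t *ptr;
	size_t size;
	uint64_t line_hash;
	size_t minimal_perfect_hash;
};

/* FNV-1a, used here only as a stand-in for the real line hash. */
static uint64_t toy_hash(const uint8_t *p, size_t n)
{
	uint64_t h = UINT64_C(14695981039346656037);

	while (n--) {
		h ^= *p++;
		h *= UINT64_C(1099511628211);
	}
	return h;
}

/*
 * Give every distinct line a dense ID in 0..k-1 and return k. Equal
 * hashes are confirmed by comparing the bytes, so collisions cannot
 * make two different lines share an ID.
 */
static size_t toy_classify(struct toy_record *recs, size_t nrec)
{
	size_t i, j, next_id = 0;

	for (i = 0; i < nrec; i++) {
		recs[i].line_hash = toy_hash(recs[i].ptr, recs[i].size);
		for (j = 0; j < i; j++)
			if (recs[j].line_hash == recs[i].line_hash &&
			    recs[j].size == recs[i].size &&
			    !memcmp(recs[j].ptr, recs[i].ptr, recs[i].size))
				break;
		recs[i].minimal_perfect_hash =
			j < i ? recs[j].minimal_perfect_hash : next_id++;
	}
	return next_id;
}
```

After classification, two lines are equal exactly when their minimal_perfect_hash values are equal, so later stages can compare integers instead of bytes.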

Signed-off-by: Ezekiel Newren <[email protected]>

size_t is used because nrec describes the number of elements in memory
for recs, and the number of elements in memory for 'changed' + 2.

Signed-off-by: Ezekiel Newren <[email protected]>

size_t is used because nreff describes the number of elements in memory
for rindex.

Signed-off-by: Ezekiel Newren <[email protected]>

rindex describes an index offset, which means it's an index into memory
and should use size_t.

Changing the type of rindex from long to size_t has no cascading
refactor impact because it is only ever used to directly index other
arrays.

Signed-off-by: Ezekiel Newren <[email protected]>

The classic diff adds only the lines that it's going to consider
during the diff to an array. The mapping between the compacted
array and the lines of the file that its entries reference is
facilitated by this array.
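A minimal sketch of the idea behind reference_index, under the assumption that the decision about which lines to keep has already been made and is given as a hypothetical `keep[]` array (in xdiff that decision is made elsewhere and is more involved):

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Record the original index of every kept line. Entry k of
 * reference_index is the position in the full file of the k-th line in
 * the compacted array. Returns the number of kept lines, which plays
 * the role of nreff. reference_index must have room for nrec entries.
 */
static size_t toy_compact(const bool *keep, size_t nrec,
			  size_t *reference_index)
{
	size_t i, nreff = 0;

	for (i = 0; i < nrec; i++)
		if (keep[i])
			reference_index[nreff++] = i;
	return nreff;
}
```

Because each entry is only ever used to index another array, size_t is a natural fit for the element type.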

Signed-off-by: Ezekiel Newren <[email protected]>
@ezekielnewren

/submit

@gitgitgadget-git

Submitted as [email protected]

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-git-2070/ezekielnewren/xdiff_cleanup_part2-v2

To fetch this version to local tag pr-git-2070/ezekielnewren/xdiff_cleanup_part2-v2:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-git-2070/ezekielnewren/xdiff_cleanup_part2-v2

@gitgitgadget-git

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Ezekiel Newren via GitGitGadget" <[email protected]> writes:

>  * Added documentation about unambiguous types and FFI

Nicely written; a few footnote entries may be a bit too strict,
misleading, and may need rephrasing, though.  For example, while we
may want to be suspicious when we see code that uses ssize_t as if it
is half the size_t plus error indication, that does not immediately
mean that the type "should not be used in Git". It is perfectly
sensible to assign to or compare with the returned value from
write(2), for example.
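As a sketch of the kind of use that stays sensible, the loop below keeps the result of write(2) in an ssize_t, checks for the error case first, and only then converts the non-negative count to size_t. The helper name `write_fully` is hypothetical and is not Git's own wrapper.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write all len bytes to fd. Returns 0 on success
 * and -1 on error, with errno set.
 */
static int write_fully(int fd, const void *buf, size_t len)
{
	const unsigned char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue; /* interrupted, retry */
			return -1;
		}
		if (n == 0) {
			/* No progress; report it rather than spin forever. */
			errno = ENOSPC;
			return -1;
		}
		/* n is positive here, so converting to size_t is safe. */
		p += (size_t)n;
		len -= (size_t)n;
	}
	return 0;
}
```

The ssize_t never leaves the function; callers and any shared structs only see int and size_t.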

Will queue.  Thanks.

@gitgitgadget-git

This patch series was integrated into seen via 48cf209.

@gitgitgadget-git

There was a status update in the "Cooking" section about the branch en/xdiff-cleanup-2 on the Git mailing list:

Code clean-up.

Comments?
source: <[email protected]>

@gitgitgadget-git

This patch series was integrated into seen via f9cf8e5.

@gitgitgadget-git

This patch series was integrated into seen via 837f44c.

@gitgitgadget-git

This patch series was integrated into seen via b698e7b.

@gitgitgadget-git

There was a status update in the "Cooking" section about the branch en/xdiff-cleanup-2 on the Git mailing list:

Code clean-up.

Comments?
source: <[email protected]>

@gitgitgadget-git

This patch series was integrated into seen via 3b3249a.

@gitgitgadget-git

This patch series was integrated into seen via ea5f1e5.

@gitgitgadget-git

This patch series was integrated into seen via 1308302.

@@ -0,0 +1,229 @@
= Unambiguous types

On the Git mailing list, Phillip Wood wrote (reply to this):

Hi Ezekiel

On 29/10/2025 22:19, Ezekiel Newren via GitGitGadget wrote:
> From: Ezekiel Newren <[email protected]>
>
> Document other nuances with crossing the FFI boundary. Other language
> mappings may be added in the future.

Thanks for adding this, I've left a few comments below. Overall I thought it was very well written. I tried building an html version of this but even after adding it to the list of TECH_DOCS in Documentation/Makefile with

diff --git a/Documentation/Makefile b/Documentation/Makefile
index 47208269a2e..2699f0b24af 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -143,6 +143,7 @@ TECH_DOCS += technical/shallow
 TECH_DOCS += technical/sparse-checkout
 TECH_DOCS += technical/sparse-index
 TECH_DOCS += technical/trivial-merge
+TECH_DOCS += technical/unambiguous-types
 TECH_DOCS += technical/unit-tests
 SP_ARTICLES += $(TECH_DOCS)
 SP_ARTICLES += technical/api-index

it fails with

$ make -C Documentation/ technical/unambiguous-types.html
make: Entering directory '/home/phil/src/git/Documentation'
    GEN asciidoc.conf
    * new asciidoc flags
    ASCIIDOC technical/unambiguous-types.html
asciidoc: ERROR: unambiguous-types.adoc: line 139: undefined filter attribute in command: source-highlight --gen-version -f xhtml -s {language} {src_numbered?--line-number=' '} {src_tab?--tab={src_tab}} {args=}
asciidoc: ERROR: unambiguous-types.adoc: line 162: undefined filter attribute in command: source-highlight --gen-version -f xhtml -s {language} {src_numbered?--line-number=' '} {src_tab?--tab={src_tab}} {args=}
asciidoc: ERROR: unambiguous-types.adoc: line 177: undefined filter attribute in command: source-highlight --gen-version -f xhtml -s {language} {src_numbered?--line-number=' '} {src_tab?--tab={src_tab}} {args=}
asciidoc: ERROR: unambiguous-types.adoc: line 187: undefined filter attribute in command: source-highlight --gen-version -f xhtml -s {language} {src_numbered?--line-number=' '} {src_tab?--tab={src_tab}} {args=}
asciidoc: ERROR: unambiguous-types.adoc: line 199: undefined filter attribute in command: source-highlight --gen-version -f xhtml -s {language} {src_numbered?--line-number=' '} {src_tab?--tab={src_tab}} {args=}
asciidoc: ERROR: unambiguous-types.adoc: line 213: undefined filter attribute in command: source-highlight --gen-version -f xhtml -s {language} {src_numbered?--line-number=' '} {src_tab?--tab={src_tab}} {args=}
asciidoc: ERROR: unambiguous-types.adoc: line 224: undefined filter attribute in command: source-highlight --gen-version -f xhtml -s {language} {src_numbered?--line-number=' '} {src_tab?--tab={src_tab}} {args=}
make: *** [Makefile:396: technical/unambiguous-types.html] Error 1
make: *** Deleting file 'technical/unambiguous-types.html'
make: Leaving directory '/home/phil/src/git/Documentation'

> +== Character types
> +
> +This is where C and Rust don't have a clean one-to-one mapping. A C `char` is
> +an 8-bit type that is signless (neither signed nor unsigned)

I found this a bit confusing. Isn't the signedness of "char" implementation defined rather than it being "signless"?

> which causes
> +problems with e.g. `make DEVELOPER=1`.

I'm not sure what this is referring to - maybe -Wsign-compare?

> Rust's `char` type is an unsigned 32-bit
> +integer that is used to describe Unicode code points. Even though a C `char`
> +is the same width as `u8`, `char` should be converted to u8 where it is
> +describing bytes in memory.

I'm dreading the point where we start sharing "struct strbuf" with rust and have to change the "buf" member from "char*" to "uint8_t*". While it is not used in the xdiff code it is ubiquitous everywhere else and there are lots of places where we pass the "buf" member to functions expecting a "char*".

	git grep -E '(\.|->)buf\W'

has over 4000 matches

> If a C `char` is not describing bytes, then it
> +should be converted to a more accurate unambiguous type.

That's a good point.

> +While you could specify `char` in the C code and `u8` in Rust code, it's not as
> +clear what the appropriate type is, but it would work across the FFI boundary.
> +However the bigger problem comes from code generation tools like cbindgen and
> +bindgen. When cbindgen sees u8 in Rust it will generate uint8_t on the C side
> +which will cause "differ in signedness" warnings/errors. Similarly, if bindgen
> +sees `char` on the C side it will generate `std::ffi::c_char` which has its own
> +problems.

Yeah, we definitely don't want to be using "std::ffi::c_char" in our rust implementations. I do wonder if we might want to use it (or CStr) judiciously in function parameters and immediately convert it to u8 in the function body where the function is called from C though.

> +=== Notes
> +^1^ This is only true if stdbool.h (or equivalent) is used. +
> +^2^ C does not enforce IEEE-754 compatibility, but Rust expects it. If the
> +platform/arch for C does not follow IEEE-754 then this equivalence does not
> +hold. Also, it's assumed that `float` is 32 bits and `double` is 64, but
> +there may be a strange platform/arch where even this isn't true. +
> +^3^ C also defines uintptr_t, but this should not be used in Git. +
> +^4^ C also defines ssize_t and intptr_t, but these should not be used in Git. +

[u]intptr_t and ssize_t are used in git already. As Junio has pointed out there are sane uses for these types but we don't want to use them in structs or function parameters where the struct or function is shared with rust.

> +
> +== Problems with std::ffi::c_* types in Rust
> +TL;DR: They're not guaranteed to match C types for all possible C
> +compilers/platforms/architectures.

Is this official policy of the rust project?

Thanks

Phillip

> +Only a few of Rust's C FFI types are considered safe and semantically clear to
> +use: +
> +
> +* `c_void`
> +* `CStr`
> +* `CString`
> +
> +Even then, they should be used sparingly, and only where the semantics match
> +exactly.
> +
> +The std::os::raw::c_* (which is deprecated) directly inherits the problems of
> +core::ffi, which changes over time and seems to make a best guess at the
> +correct definition for a given platform/target. This probably isn't a problem
> +for all platforms that Rust supports currently, but can anyone say that Rust
> +got it right for all C compilers of all platforms/targets?
> +
> +On top of all of that we're targeting an older version of Rust which doesn't
> +have the latest mappings.
> +
> +To give an example: c_long is defined in
> +footnote:[https://doc.rust-lang.org/1.63.0/src/core/ffi/mod.rs.html#175-189[c_long in 1.63.0]]
> +footnote:[https://doc.rust-lang.org/1.89.0/src/core/ffi/primitives.rs.html#135-151[c_long in 1.89.0]]
> +
> +=== Rust version 1.63.0
> +
> +[source]
> +----
> +mod c_long_definition {
> +    cfg_if! {
> +        if #[cfg(all(target_pointer_width = "64", not(windows)))] {
> +            pub type c_long = i64;
> +            pub type NonZero_c_long = crate::num::NonZeroI64;
> +            pub type c_ulong = u64;
> +            pub type NonZero_c_ulong = crate::num::NonZeroU64;
> +        } else {
> +            // The minimal size of `long` in the C standard is 32 bits
> +            pub type c_long = i32;
> +            pub type NonZero_c_long = crate::num::NonZeroI32;
> +            pub type c_ulong = u32;
> +            pub type NonZero_c_ulong = crate::num::NonZeroU32;
> +        }
> +    }
> +}
> +----
> +
> +=== Rust version 1.89.0
> +
> +[source]
> +----
> +mod c_long_definition {
> +    crate::cfg_select! {
> +        any(
> +            all(target_pointer_width = "64", not(windows)),
> +            // wasm32 Linux ABI uses 64-bit long
> +            all(target_arch = "wasm32", target_os = "linux")
> +        ) => {
> +            pub(super) type c_long = i64;
> +            pub(super) type c_ulong = u64;
> +        }
> +        _ => {
> +            // The minimal size of `long` in the C standard is 32 bits
> +            pub(super) type c_long = i32;
> +            pub(super) type c_ulong = u32;
> +        }
> +    }
> +}
> +----
> +
> +Even for the cases where C types are correctly mapped to Rust types via
> +std::ffi::c_* there are still problems. Let's take c_char for example. On some
> +platforms it's u8 on others it's i8.
> +
> +=== Subtraction underflow in debug mode
> +
> +The following code will panic in debug on platforms that define c_char as u8,
> +but won't if it's an i8.
> +
> +[source]
> +----
> +let mut x: std::ffi::c_char = 0;
> +x -= 1;
> +----
> +
> +=== Inconsistent shift behavior
> +
> +`x` will be 0xC0 for platforms that use i8, but will be 0x40 where it's u8.
> +
> +[source]
> +----
> +let mut x: std::ffi::c_char = 0x80;
> +x >>= 1;
> +----
> +
> +=== Equality fails to compile on some platforms
> +
> +The following will not compile on platforms that define c_char as i8, but will
> +if it's u8. You can cast x e.g. `assert_eq!(x as u8, b'a');`, but then you get
> +a warning on platforms that use u8 and a clean compilation where i8 is used.
> +
> +[source]
> +----
> +let mut x: std::ffi::c_char = 0x61;
> +assert_eq!(x, b'a');
> +----
> +
> +== Enum types
> +Rust enum types should not be used as FFI types. Rust enum types are more like
> +C union types than C enum's. For something like:
> +
> +[source]
> +----
> +#[repr(C, u8)]
> +enum Fruit {
> +    Apple,
> +    Banana,
> +    Cherry,
> +}
> +----
> +
> +It's easy enough to make sure the Rust enum matches what C would expect, but a
> +more complex type like.
> +
> +[source]
> +----
> +enum HashResult {
> +    SHA1([u8; 20]),
> +    SHA256([u8; 32]),
> +}
> +----
> +
> +The Rust compiler has to add a discriminant to the enum to distinguish between
> +the variants. The width, location, and values for that discriminant is up to
> +the Rust compiler and is not ABI stable.

On the Git mailing list, Ezekiel Newren wrote (reply to this):

On Thu, Nov 6, 2025 at 2:55 AM Phillip Wood <[email protected]> wrote:
>
> Hi Ezekiel
>
> On 29/10/2025 22:19, Ezekiel Newren via GitGitGadget wrote:
> > From: Ezekiel Newren <[email protected]>
> >
> > Document other nuances with crossing the FFI boundary. Other language
> > mappings may be added in the future.
>
> Thanks for adding this, I've left a few comments below. Overall I
> thought it was very well written.

Thanks.

I felt it was necessary since C vs Rust types keep coming up over and
over again. I'm flexible with the wording of this document. I was just
trying to convey a firm and clear stance on what is and isn't proper
in Git.

> I tried building an html version of
> this but even after adding it to the list of TECH_DOCS in
> Documentation/Makefile with
>
> diff --git a/Documentation/Makefile b/Documentation/Makefile
> index 47208269a2e..2699f0b24af 100644
> --- a/Documentation/Makefile
> +++ b/Documentation/Makefile
> @@ -143,6 +143,7 @@ TECH_DOCS += technical/shallow
>   TECH_DOCS += technical/sparse-checkout
>   TECH_DOCS += technical/sparse-index
>   TECH_DOCS += technical/trivial-merge
> +TECH_DOCS += technical/unambiguous-types
>   TECH_DOCS += technical/unit-tests
>   SP_ARTICLES += $(TECH_DOCS)
>   SP_ARTICLES += technical/api-index
>
> it fails with
>
> $ make -C Documentation/ technical/unambiguous-types.html
>                                        Merge branch
> 'ps/object-source-loose' into seen
> make: Entering directory '/home/phil/src/git/Documentation'
>      GEN asciidoc.conf
>      * new asciidoc flags
>      ASCIIDOC technical/unambiguous-types.html
> asciidoc: ERROR: unambiguous-types.adoc: line 139: undefined filter
> attribute in command: source-highlight --gen-version -f xhtml -s
> {language} {src_numbered?--line-number=' '} {src_tab?--tab={src_tab}}
> {args=}
> asciidoc: ERROR: unambiguous-types.adoc: line 162: undefined filter
> attribute in command: source-highlight --gen-version -f xhtml -s
> {language} {src_numbered?--line-number=' '} {src_tab?--tab={src_tab}}
> {args=}
> asciidoc: ERROR: unambiguous-types.adoc: line 177: undefined filter
> attribute in command: source-highlight --gen-version -f xhtml -s
> {language} {src_numbered?--line-number=' '} {src_tab?--tab={src_tab}}
> {args=}
> asciidoc: ERROR: unambiguous-types.adoc: line 187: undefined filter
> attribute in command: source-highlight --gen-version -f xhtml -s
> {language} {src_numbered?--line-number=' '} {src_tab?--tab={src_tab}}
> {args=}
> asciidoc: ERROR: unambiguous-types.adoc: line 199: undefined filter
> attribute in command: source-highlight --gen-version -f xhtml -s
> {language} {src_numbered?--line-number=' '} {src_tab?--tab={src_tab}}
> {args=}
> asciidoc: ERROR: unambiguous-types.adoc: line 213: undefined filter
> attribute in command: source-highlight --gen-version -f xhtml -s
> {language} {src_numbered?--line-number=' '} {src_tab?--tab={src_tab}}
> {args=}
> asciidoc: ERROR: unambiguous-types.adoc: line 224: undefined filter
> attribute in command: source-highlight --gen-version -f xhtml -s
> {language} {src_numbered?--line-number=' '} {src_tab?--tab={src_tab}}
> {args=}
> make: *** [Makefile:396: technical/unambiguous-types.html] Error 1
> make: *** Deleting file 'technical/unambiguous-types.html'
> make: Leaving directory '/home/phil/src/git/Documentation'

I've never created documentation for Git before, so this helps. I'll
incorporate your suggestions.

> > +== Character types
> > +
> > +This is where C and Rust don't have a clean one-to-one mapping. A C `char` is
> > +an 8-bit type that is signless (neither signed nor unsigned)
>
> I found this a bit confusing. Isn't the signedness of "char"
> implementation defined rather than it being "signless"
>
> > which causes
> > +problems with e.g. `make DEVELOPER=1`.
>
> I'm not sure what this is referring to - maybe -Wsign-compare?

When I build Git with `make DEVELOPER=1` and I compare uint8_t with
char it complains about a difference in signedness. When I compare
int8_t with char it also complains about a difference in signedness.
So it is implementation defined, but it's also neither signed nor
unsigned according to DEVELOPER=1 since it complains either way.
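The same ambiguity can be seen from the Rust side. The following is only a sketch: `c_char_is_signed` and `as_byte` are hypothetical helper names, not part of the patch series. It asks the current target which way its `c_char` alias goes, then converts to `u8`, which behaves the same on every target.

```rust
use std::os::raw::c_char;

/// Reports whether this target's `c_char` alias is signed (hypothetical helper).
pub fn c_char_is_signed() -> bool {
    // Widen first so the comparison is meaningful for both i8 and u8.
    (c_char::MIN as i16) < 0
}

/// Reinterprets a `c_char` as the raw byte it occupies in memory.
pub fn as_byte(c: c_char) -> u8 {
    c as u8
}
```

Whatever `c_char_is_signed()` returns on a given platform, `as_byte` yields the same byte value, which is the reason for preferring `u8`/`uint8_t` at the boundary.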

> > Rust's `char` type is an unsigned 32-bit
> > +integer that is used to describe Unicode code points. Even though a C `char`
> > +is the same width as `u8`, `char` should be converted to u8 where it is
> > +describing bytes in memory.
>
> I'm dreading the point where we start sharing "struct strbuf" with rust
> and have to change the "buf" member from "char*" to "uint8_t*". While it
> is not used in the xdiff code it is ubiquitous everywhere else and there
> are lots of places where we pass the "buf" member to functions expecting
> a "char*".
>
>         git grep -E '(\.|->)buf\W'
>
> has over 4000 matches

This is why I started in Xdiff since its code is mostly isolated. I
think that we might have to bite the bullet and deal with the ugly
mapping of char on the C side and u8 on the Rust side when dealing
with strbuf. Maybe as we translate more of C into Rust someone will
have a better suggestion. I think my ivec type would be better since
strbuf is almost a special case of my ivec type, but dealing with
strbuf is outside the scope of this patch series.

> > If a C `char` is not describing bytes, then it
> > +should be converted to a more accurate unambiguous type.
>
> That's a good point.
>
> > +While you could specify `char` in the C code and `u8` in Rust code, it's not as
> > +clear what the appropriate type is, but it would work across the FFI boundary.
> > +However the bigger problem comes from code generation tools like cbindgen and
> > +bindgen. When cbindgen see u8 in Rust it will generate uint8_t on the C side
> > +which will cause differ in signedness warnings/errors. Similarly if bindgen
> > +see `char` on the C side it will generate `std::ffi::c_char` which has its own
> > +problems.
>
> Yeah, we definitely don't want to be using "std::ffi::c_char" in our
> rust implementations. I do wonder if we might want to use it (or CStr)
> judiciously in function parameters and immediately convert it to u8 in
> the function body where the function is called from C though.

That's basically the design pattern I've been using.

In many of my translations from C to Rust I create a Rust stub
function that takes pointer types and wraps them into safe types which
then get handed off to a safe Rust function. I think that in the cases
where CString/CStr is required the Rust stub function would create a
&[u8] slice for the safe function to operate on.
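A minimal sketch of that stub pattern, assuming a NUL-terminated string arrives from C. The names `example_count_leading_blanks` and `count_leading_blanks` are hypothetical and do not come from the patch series. A real export would also need a `no_mangle` attribute, which is omitted here.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Safe core (hypothetical): operates only on a byte slice.
fn count_leading_blanks(line: &[u8]) -> usize {
    line.iter()
        .take_while(|&&b| b == b' ' || b == b'\t')
        .count()
}

/// FFI stub (hypothetical): the only place that touches raw pointers.
///
/// # Safety
/// `s` must be non-null and point to a valid NUL-terminated string.
pub unsafe extern "C" fn example_count_leading_blanks(s: *const c_char) -> usize {
    // c_char is turned into &[u8] right away, so the safe code never sees it.
    let bytes: &[u8] = unsafe { CStr::from_ptr(s) }.to_bytes();
    count_leading_blanks(bytes)
}
```

The unsafe surface stays confined to the stub, and the safe function can be unit-tested with plain byte literals.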

> > +=== Notes
> > +^1^ This is only true if stdbool.h (or equivalent) is used. +
> > +^2^ C does not enforce IEEE-754 compatibility, but Rust expects it. If the
> > +platform/arch for C does not follow IEEE-754 then this equivalence does not
> > +hold. Also, it's assumed that `float` is 32 bits and `double` is 64, but
> > +there may be a strange platform/arch where even this isn't true. +
> > +^3^ C also defines uintptr_t, but this should not be used in Git. +
> > +^4^ C also defines ssize_t and intptr_t, but these should not be used in Git. +
>
> [u]intptr_t and ssize_t are used in git already. As Junio has pointed
> out there are sane uses for these types but we don't want to use them in
> structs or function parameters where the struct or function is shared
> with rust.

You're right, I should update the phrasing. Something like: "These
types shouldn't be used if their explicit purpose is for FFI, whether
as a field in a struct or part of a function signature." I'll update
the wording.

> > +
> > +== Problems with std::ffi::c_* types in Rust
> > +TL;DR: They're not guaranteed to match C types for all possible C
> > +compilers/platforms/architectures.
>
> Is this official policy of the rust project?

No, this is a personal inference based on logical deduction. The c_*
definitions have changed over time with new Rust version releases, and
Git targets more platforms/architectures than what Rust officially
supports. It's not guaranteed to fail anywhere, but it's also not
guaranteed to work everywhere. On top of that we're targeting 1.63.0,
whose c_* definitions differ from those in 1.89.0, as the
c_long_definition example shows. Can anyone say with certainty that
Rust got these mappings right or wrong for all possible C
compilers/architectures/platforms? If so (which I highly doubt)
could someone provide a link?
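To make the concern concrete, here is a small sketch that reports what the toolchain in use believes the width of C `long` to be. The helper names `c_long_bits` and `i64_bits` are made up for illustration. `std::os::raw` is used because it is available on the older toolchains mentioned above.

```rust
use std::mem::size_of;
use std::os::raw::c_long;

/// Width in bits of what this toolchain believes C `long` is on this target.
pub fn c_long_bits() -> usize {
    size_of::<c_long>() * 8
}

/// Width in bits of `i64`, which is the same on every target.
pub fn i64_bits() -> usize {
    size_of::<i64>() * 8
}
```

`c_long_bits()` depends on the target and on the Rust release's mapping table, while `i64_bits()` is always 64. That difference is the argument for fixed-width types in shared structs.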

unsigned long ha;
} xrecord_t;

typedef struct s_xdfile {

On the Git mailing list, Phillip Wood wrote (reply to this):

Hi Ezekiel

On 29/10/2025 22:19, Ezekiel Newren via GitGitGadget wrote:
> From: Ezekiel Newren <[email protected]>
>
> ssize_t is appropriate for dstart and dend because they both describe
> positive or negative offsets relative to a pointer.

This paragraph and the subject need updating to match the change from ssize_t to ptrdiff_t.

> A future patch will move these fields to a different struct. Moving
> them to the end of xdfile_t now, means the field order of xdfile_t will
> be disturbed less.

I'm not sure why that matters but I also don't object

Thanks

Phillip

> Signed-off-by: Ezekiel Newren <[email protected]>
> ---
>   xdiff/xtypes.h | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
> index f145abba3e..7c8c057bca 100644
> --- a/xdiff/xtypes.h
> +++ b/xdiff/xtypes.h
> @@ -47,10 +47,10 @@ typedef struct s_xrecord {
>   typedef struct s_xdfile {
>   	xrecord_t *recs;
>   	long nrec;
> -	long dstart, dend;
>   	bool *changed;
>   	long *rindex;
>   	long nreff;
> +	ptrdiff_t dstart, dend;
>   } xdfile_t;
>
>   typedef struct s_xdfenv {

On the Git mailing list, Ezekiel Newren wrote (reply to this):

On Thu, Nov 6, 2025 at 2:55 AM Phillip Wood <[email protected]> wrote:
>
> Hi Ezekiel
>
> On 29/10/2025 22:19, Ezekiel Newren via GitGitGadget wrote:
> > From: Ezekiel Newren <[email protected]>
> >
> > ssize_t is appropriate for dstart and dend because they both describe
> > positive or negative offsets relative to a pointer.
>
> This paragraph and the subject need updating to match the change from
> ssize_t to ptrdiff_t.

You're right. I thought I updated that. I'll make that change for the
next version.

static int get_indent(xrecord_t *rec)
{
long i;
int ret = 0;

On the Git mailing list, Phillip Wood wrote (reply to this):

Hi Ezekiel

On 29/10/2025 22:19, Ezekiel Newren via GitGitGadget wrote:
> From: Ezekiel Newren <[email protected]>
>
> Rust uses u8 to refer to bytes in memory. Since xrecord_t.ptr is also
> referring to bytes in memory, rather than Unicode code points, use
> uint8_t instead of char.

The reference to unicode code points here still makes no sense to me. I thought the reason for the conversion was to match rust's u8.

> Every usage of this field was inspected and cast to char*, or similar,
> to avoid signedness warnings/errors from the compiler. Casting was used
> so that the whole of xdiff doesn't need to be refactored in order to
> change the type of this field.

Thanks for adding this. Having played a little with changing some function parameters to avoid adding these casts I agree this patch is a good place to stop as the number of changes required quickly spiraled out of control.

Thanks

Phillip

> Signed-off-by: Ezekiel Newren <[email protected]>
> ---
>   xdiff/xdiffi.c    |  8 ++++----
>   xdiff/xemit.c     |  6 +++---
>   xdiff/xmerge.c    | 14 +++++++-------
>   xdiff/xpatience.c |  2 +-
>   xdiff/xprepare.c  |  8 ++++----
>   xdiff/xtypes.h    |  2 +-
>   xdiff/xutils.c    |  4 ++--
>   7 files changed, 22 insertions(+), 22 deletions(-)
>
> diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
> index 6f3998ee54..411a8aa69f 100644
> --- a/xdiff/xdiffi.c
> +++ b/xdiff/xdiffi.c
> @@ -407,7 +407,7 @@ static int get_indent(xrecord_t *rec)
>   	int ret = 0;
>
>   	for (i = 0; i < rec->size; i++) {
> -		char c = rec->ptr[i];
> +		uint8_t c = rec->ptr[i];
>
>   		if (!XDL_ISSPACE(c))
>   			return ret;
> @@ -993,11 +993,11 @@ static void xdl_mark_ignorable_lines(xdchange_t *xscr, xdfenv_t *xe, long flags)
>
>   		rec = &xe->xdf1.recs[xch->i1];
>   		for (i = 0; i < xch->chg1 && ignore; i++)
> -			ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
> +			ignore = xdl_blankline((const char *)rec[i].ptr, rec[i].size, flags);
>
>   		rec = &xe->xdf2.recs[xch->i2];
>   		for (i = 0; i < xch->chg2 && ignore; i++)
> -			ignore = xdl_blankline(rec[i].ptr, rec[i].size, flags);
> +			ignore = xdl_blankline((const char *)rec[i].ptr, rec[i].size, flags);
>
>   		xch->ignore = ignore;
>   	}
> @@ -1008,7 +1008,7 @@ static int record_matches_regex(xrecord_t *rec, xpparam_t const *xpp) {
>   	size_t i;
>
>   	for (i = 0; i < xpp->ignore_regex_nr; i++)
> -		if (!regexec_buf(xpp->ignore_regex[i], rec->ptr, rec->size, 1,
> +		if (!regexec_buf(xpp->ignore_regex[i], (const char *)rec->ptr, rec->size, 1,
>   				 &regmatch, 0))
>   			return 1;
>
> diff --git a/xdiff/xemit.c b/xdiff/xemit.c
> index b2f1f30cd3..ead930088a 100644
> --- a/xdiff/xemit.c
> +++ b/xdiff/xemit.c
> @@ -27,7 +27,7 @@ static int xdl_emit_record(xdfile_t *xdf, long ri, char const *pre, xdemitcb_t *
>   {
>   	xrecord_t *rec = &xdf->recs[ri];
>
> -	if (xdl_emit_diffrec(rec->ptr, rec->size, pre, strlen(pre), ecb) < 0)
> +	if (xdl_emit_diffrec((char const *)rec->ptr, rec->size, pre, strlen(pre), ecb) < 0)
>   		return -1;
>
>   	return 0;
> @@ -113,8 +113,8 @@ static long match_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri,
>   	xrecord_t *rec = &xdf->recs[ri];
>
>   	if (!xecfg->find_func)
> -		return def_ff(rec->ptr, rec->size, buf, sz);
> -	return xecfg->find_func(rec->ptr, rec->size, buf, sz, xecfg->find_func_priv);
> +		return def_ff((const char *)rec->ptr, rec->size, buf, sz);
> +	return xecfg->find_func((const char *)rec->ptr, rec->size, buf, sz, xecfg->find_func_priv);
>   }
>
>   static int is_func_rec(xdfile_t *xdf, xdemitconf_t const *xecfg, long ri)
> diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
> index fd600cbb5d..75cb3e76a2 100644
> --- a/xdiff/xmerge.c
> +++ b/xdiff/xmerge.c
> @@ -101,8 +101,8 @@ static int xdl_merge_cmp_lines(xdfenv_t *xe1, int i1, xdfenv_t *xe2, int i2,
>   	xrecord_t *rec2 = xe2->xdf2.recs + i2;
>
>   	for (i = 0; i < line_count; i++) {
> -		int result = xdl_recmatch(rec1[i].ptr, rec1[i].size,
> -			rec2[i].ptr, rec2[i].size, flags);
> +		int result = xdl_recmatch((const char *)rec1[i].ptr, rec1[i].size,
> +			(const char *)rec2[i].ptr, rec2[i].size, flags);
>   		if (!result)
>   			return -1;
>   	}
> @@ -324,8 +324,8 @@ static int xdl_fill_merge_buffer(xdfenv_t *xe1, const char *name1,
>
>   static int recmatch(xrecord_t *rec1, xrecord_t *rec2, unsigned long flags)
>   {
> -	return xdl_recmatch(rec1->ptr, rec1->size,
> -			    rec2->ptr, rec2->size, flags);
> +	return xdl_recmatch((const char *)rec1->ptr, rec1->size,
> +			    (const char *)rec2->ptr, rec2->size, flags);
>   }
>
>   /*
> @@ -382,10 +382,10 @@ static int xdl_refine_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
>   		 * we have a very simple mmfile structure.
>   		 */
>   		t1.ptr = (char *)xe1->xdf2.recs[m->i1].ptr;
> -		t1.size = xe1->xdf2.recs[m->i1 + m->chg1 - 1].ptr
> +		t1.size = (char *)xe1->xdf2.recs[m->i1 + m->chg1 - 1].ptr
>   			+ xe1->xdf2.recs[m->i1 + m->chg1 - 1].size - t1.ptr;
>   		t2.ptr = (char *)xe2->xdf2.recs[m->i2].ptr;
> -		t2.size = xe2->xdf2.recs[m->i2 + m->chg2 - 1].ptr
> +		t2.size = (char *)xe2->xdf2.recs[m->i2 + m->chg2 - 1].ptr
>   			+ xe2->xdf2.recs[m->i2 + m->chg2 - 1].size - t2.ptr;
>   		if (xdl_do_diff(&t1, &t2, xpp, &xe) < 0)
>   			return -1;
> @@ -440,7 +440,7 @@ static int line_contains_alnum(const char *ptr, long size)
>   static int lines_contain_alnum(xdfenv_t *xe, int i, int chg)
>   {
>   	for (; chg; chg--, i++)
> -		if (line_contains_alnum(xe->xdf2.recs[i].ptr,
> +		if (line_contains_alnum((const char *)xe->xdf2.recs[i].ptr,
>   				xe->xdf2.recs[i].size))
>   			return 1;
>   	return 0;
> diff --git a/xdiff/xpatience.c b/xdiff/xpatience.c
> index 669b653580..bb61354f22 100644
> --- a/xdiff/xpatience.c
> +++ b/xdiff/xpatience.c
> @@ -121,7 +121,7 @@ static void insert_record(xpparam_t const *xpp, int line, struct hashmap *map,
>   		return;
>   	map->entries[index].line1 = line;
>   	map->entries[index].hash = record->ha;
> -	map->entries[index].anchor = is_anchor(xpp, map->env->xdf1.recs[line - 1].ptr);
> +	map->entries[index].anchor = is_anchor(xpp, (const char *)map->env->xdf1.recs[line - 1].ptr);
>   	if (!map->first)
>   		map->first = map->entries + index;
>   	if (map->last) {
> diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
> index 192334f1b7..4cb18b2b88 100644
> --- a/xdiff/xprepare.c
> +++ b/xdiff/xprepare.c
> @@ -99,8 +99,8 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
>   	hi = (long) XDL_HASHLONG(rec->ha, cf->hbits);
>   	for (rcrec = cf->rchash[hi]; rcrec; rcrec = rcrec->next)
>   		if (rcrec->rec.ha == rec->ha &&
> -				xdl_recmatch(rcrec->rec.ptr, rcrec->rec.size,
> -					rec->ptr, rec->size, cf->flags))
> +				xdl_recmatch((const char *)rcrec->rec.ptr, rcrec->rec.size,
> +					(const char *)rec->ptr, rec->size, cf->flags))
>   			break;
>
>   	if (!rcrec) {
> @@ -156,8 +156,8 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
>   			if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
>   				goto abort;
>   			crec = &xdf->recs[xdf->nrec++];
> -			crec->ptr = prev;
> -			crec->size = (long) (cur - prev);
> +			crec->ptr = (uint8_t const *)prev;
> +			crec->size =(long) ( cur - prev);
>   			crec->ha = hav;
>   			if (xdl_classify_record(pass, cf, crec) < 0)
>   				goto abort;
> diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
> index 7c8c057bca..b1c520a378 100644
> --- a/xdiff/xtypes.h
> +++ b/xdiff/xtypes.h
> @@ -39,7 +39,7 @@ typedef struct s_chastore {
>   } chastore_t;
>
>   typedef struct s_xrecord {
> -	char const *ptr;
> +	uint8_t const *ptr;
>   	long size;
>   	unsigned long ha;
>   } xrecord_t;
> diff --git a/xdiff/xutils.c b/xdiff/xutils.c
> index 447e66c719..7be063bfb6 100644
> --- a/xdiff/xutils.c
> +++ b/xdiff/xutils.c
> @@ -465,10 +465,10 @@ int xdl_fall_back_diff(xdfenv_t *diff_env, xpparam_t const *xpp,
>   	xdfenv_t env;
>
>   	subfile1.ptr = (char *)diff_env->xdf1.recs[line1 - 1].ptr;
> -	subfile1.size = diff_env->xdf1.recs[line1 + count1 - 2].ptr +
> +	subfile1.size = (char *)diff_env->xdf1.recs[line1 + count1 - 2].ptr +
>   		diff_env->xdf1.recs[line1 + count1 - 2].size - subfile1.ptr;
>   	subfile2.ptr = (char *)diff_env->xdf2.recs[line2 - 1].ptr;
> -	subfile2.size = diff_env->xdf2.recs[line2 + count2 - 2].ptr +
> +	subfile2.size = (char *)diff_env->xdf2.recs[line2 + count2 - 2].ptr +
>   		diff_env->xdf2.recs[line2 + count2 - 2].size - subfile2.ptr;
>   	if (xdl_do_diff(&subfile1, &subfile2, xpp, &env) < 0)
>   		return -1;

On the Git mailing list, Ezekiel Newren wrote (reply to this):

On Thu, Nov 6, 2025 at 3:49 AM Phillip Wood <[email protected]> wrote:
>
> Hi Ezekiel
>
> On 29/10/2025 22:19, Ezekiel Newren via GitGitGadget wrote:
> > From: Ezekiel Newren <[email protected]>
> >
> > Rust uses u8 to refer to bytes in memory. Since xrecord_t.ptr is also
> > referring to bytes in memory, rather than Unicode code points, use
> > uint8_t instead of char.
>
> The reference to unicode code points here still makes no sense to me. I
> thought the reason for the conversion was to match rust's u8.

It is to match Rust's u8 type, but I was also trying to convey that
ptr is referring to bytes and not characters _because_ xdiff performs
textual differences. It's not spelled out anywhere in Xdiff whether it
takes Unicode into consideration. Would comparing Unicode code points
change how Xdiff behaves? Should it behave differently? I don't know.
My understanding is that whether the bytes are utf-8, utf-16le,
utf-16be, or some other encoding of Unicode, Xdiff doesn't care and
treats the lines in a file as raw byte strings.

There's also the question of "Should the Rust side of Xdiff treat
lines in a file as &[u8] or &str?" The reason why this matters is
because in order to get a &str from &[u8] in Rust you need to call a
function like:

```
let raw_bytes = b"abc\n";
let result = std::str::from_utf8(raw_bytes);
if let Ok(line) = result {
    // do something
}
```

What happens if it's not utf8 encoded? What if it's malformed utf8? To
avoid these problems I only use &[u8] in xdiff and perform differences
on raw byte strings without considering Unicode at all, which is how
Xdiff already behaves.
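A short sketch of that difference, using hypothetical helper names (`is_valid_utf8`, `lines_equal`): a Latin-1 encoded line fails UTF-8 validation, yet it can still be compared as raw bytes without any decoding step.

```rust
/// Returns true if the line would be accepted as a `&str` (hypothetical helper).
pub fn is_valid_utf8(line: &[u8]) -> bool {
    std::str::from_utf8(line).is_ok()
}

/// Byte-wise comparison, with no assumption about the text encoding.
pub fn lines_equal(a: &[u8], b: &[u8]) -> bool {
    a == b
}
```

For example, `b"caf\xe9\n"` (Latin-1 "café") is rejected by `is_valid_utf8` but compares equal to itself under `lines_equal`, so a `&[u8]`-based diff never has to decide what to do with undecodable input.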

Does that explain my comment about Unicode or does it still seem out
of place to you? I can remove the mention of Unicode from the commit
message if this still doesn't make any sense to you.

> > Every usage of this field was inspected and cast to char*, or similar,
> > to avoid signedness warnings/errors from the compiler. Casting was used
> > so that the whole of xdiff doesn't need to be refactored in order to
> > change the type of this field.
>
> Thanks for adding this. Having played a little with changing some
> function parameters to avoid adding these casts I agree this patch is a
> good place to stop as the number of changes required quickly spiraled
> out of control.

I'm not excited about the casts either, but these 2 structs are
fundamental to how Xdiff passes data around, and so they need to be
FFI friendly. I don't plan on converting other structs or function
signatures in Xdiff unless I really have to.

static int get_indent(xrecord_t *rec)
{
long i;
int ret = 0;

On the Git mailing list, Phillip Wood wrote (reply to this):

On 29/10/2025 22:19, Ezekiel Newren via GitGitGadget wrote:
> @@ -156,8 +156,8 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
>   			if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
>   				goto abort;
>   			crec = &xdf->recs[xdf->nrec++];
> -			crec->ptr = prev;
> -			crec->size = (long) (cur - prev);
> +			crec->ptr = (uint8_t const *)prev;
> +			crec->size =(long) ( cur - prev);

The changes to crec->size here look unintentional

Thanks

Phillip

On the Git mailing list, Ezekiel Newren wrote (reply to this):

On Thu, Nov 6, 2025 at 3:55 AM Phillip Wood <[email protected]> wrote:
>
> On 29/10/2025 22:19, Ezekiel Newren via GitGitGadget wrote:
> > @@ -156,8 +156,8 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
> >                       if (XDL_ALLOC_GROW(xdf->recs, xdf->nrec + 1, narec))
> >                               goto abort;
> >                       crec = &xdf->recs[xdf->nrec++];
> > -                     crec->ptr = prev;
> > -                     crec->size = (long) (cur - prev);
> > +                     crec->ptr = (uint8_t const *)prev;
> > +                     crec->size =(long) ( cur - prev);
>
> The changes to crec->size here look unintentional

I agree. I'll change that.

* Davide Libenzi <[email protected]>
*
*/

On the Git mailing list, Phillip Wood wrote (reply to this):

Hi Ezekiel

On 29/10/2025 22:19, Ezekiel Newren via GitGitGadget wrote:
> From: Ezekiel Newren <[email protected]>
>
> The ha field is serving two different purposes, which makes the code
> harder to read. At first glance it looks like many places assume
> there could never be hash collisions between lines of the two input
> files. In reality, line_hash is used together with xdl_recmatch() to
> ensure correct comparisons of lines, even when collisions occur.
>
> To make this clearer, the old ha field has been split:
>    * line_hash: The straightforward hash of a line, requiring no
>      additional context.
>    * minimal_perfect_hash: Not a new concept, but now a separate
>      field. It comes from the classifier's general-purpose hash table,
>      which assigns each line a unique and minimal hash across the two
>      files.

It would be nice to explain the differing types for the two fields in the commit message.
> diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
> index 85e56021da..16236bd045 100644
> --- a/xdiff/xprepare.c
> +++ b/xdiff/xprepare.c
> @@ -96,9 +96,9 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
>   	long hi;
>   	xdlclass_t *rcrec;
>   > -	hi = (long) XDL_HASHLONG(rec->ha, cf->hbits);
> +	hi = (long) XDL_HASHLONG(rec->line_hash, cf->hbits);

"hi" is only used as an array index so it might be nicer to change it to size_t and avoid this cast instead.

Thanks

Phillip

On the Git mailing list, Ezekiel Newren wrote (reply to this):

On Thu, Nov 6, 2025 at 4:00 AM Phillip Wood <[email protected]> wrote:
>
> Hi Ezekiel
>
> On 29/10/2025 22:19, Ezekiel Newren via GitGitGadget wrote:
> > From: Ezekiel Newren <[email protected]>
> >
> > The ha field is serving two different purposes, which makes the code
> > harder to read. At first glance it looks like many places assume
> > there could never be hash collisions between lines of the two input
> > files. In reality, line_hash is used together with xdl_recmatch() to
> > ensure correct comparisons of lines, even when collisions occur.
> >
> > To make this clearer, the old ha field has been split:
> >    * line_hash: The straightforward hash of a line, requiring no
> >      additional context.
> >    * minimal_perfect_hash: Not a new concept, but now a separate
> >      field. It comes from the classifier's general-purpose hash table,
> >      which assigns each line a unique and minimal hash across the two
> >      files.
>
> It would be nice to explain the differing types for the two fields in
> the commit message.

I'll add something like:
line_hash is a uint64_t because it is the output of a fixed width hash
function. minimal_perfect_hash is size_t because its purpose is to
index into an array. This also avoids the problem of having to cast to
usize on the Rust side every time minimal_perfect_hash is used to
index a slice.
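As an illustration of the indexing point, here is a sketch of how the struct might be mirrored on the Rust side. The `XRecord` layout follows the "After part 2" definition in the cover letter, but the Rust names and the `count_classes` function are hypothetical.

```rust
/// Hypothetical Rust mirror of the C `xrecord_t` after this series.
#[repr(C)]
pub struct XRecord {
    pub ptr: *const u8,
    pub size: usize,
    pub line_hash: u64,
    pub minimal_perfect_hash: usize,
}

/// Counts how many records fall into each class (hypothetical helper).
/// Panics if a record's `minimal_perfect_hash` is >= `nclasses`.
pub fn count_classes(recs: &[XRecord], nclasses: usize) -> Vec<u32> {
    let mut counts = vec![0u32; nclasses];
    for rec in recs {
        // size_t maps to usize, so the field indexes the Vec with no cast.
        counts[rec.minimal_perfect_hash] += 1;
    }
    counts
}
```

Had the field been `u64`, every such index would need `as usize`, which silently truncates on 32-bit targets.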

> > diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
> > index 85e56021da..16236bd045 100644
> > --- a/xdiff/xprepare.c
> > +++ b/xdiff/xprepare.c
> > @@ -96,9 +96,9 @@ static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
> >       long hi;
> >       xdlclass_t *rcrec;
> >
> > -     hi = (long) XDL_HASHLONG(rec->ha, cf->hbits);
> > +     hi = (long) XDL_HASHLONG(rec->line_hash, cf->hbits);
>
> "hi" is only used as an array index so it might be nicer to change it to
> size_t and avoid this cast instead.

I agree. I'll make that change.

@gitgitgadget-git

There was a status update in the "Cooking" section about the branch en/xdiff-cleanup-2 on the Git mailing list:

Code clean-up.

Comments?
source: <[email protected]>

@gitgitgadget-git

This patch series was integrated into seen via 95076f7.

@gitgitgadget-git

This patch series was integrated into seen via 4df9b84.

@gitgitgadget-git

This patch series was integrated into seen via fd226a8.

@gitgitgadget-git

This patch series was integrated into seen via 4a03c00.
