stacktrace; pump stream fails when commit diff is on a file with a colon in the name #1210

Open
@tommyjcarpenter

Hi

We use this code to check whether the latest commit in a git repo only adds "new" files or also modifies existing ones:

from git import Repo

def git_check_added_only(repo: Repo):
    # Diff the previous commit against HEAD and keep only modified ("M") entries.
    prev_commit = repo.commit("HEAD~1")
    head_commit = repo.head.commit
    modified_files = list(prev_commit.diff(head_commit).iter_change_type("M"))
    if len(modified_files) > 0:
        ...

On repos that contain only CSV data, this works fine.

However, when we run it on repos that contain binary files, such as .xlsx, it blows up with a stack trace like the one below. Because the failure happens in a thread, I am not sure which exact line above internally triggers this "pump":

b"ERROR:git.cmd:Pumping 'stdout' of cmd(['git', 'diff-tree', '8fd339d63f824428d215be363817d637ffae3430', 'ca2fecd4f638c81af31ef7e1a22d48b68a55d746', '-r', '--abbrev=40', '--full-index', '-M', '--raw', '-z', '--no-color']) failed due to: ValueError('not enough values to unpack (expected 5, got 4)')\n"
b'Exception in thread Thread-25:\n'
b'Traceback (most recent call last):\n'
b'  File "/home/mypackage/.local/lib/python3.7/site-packages/git/cmd.py", line 83, in pump_stream\n'
b'    handler(line)\n'
b'  File "/home/mypackage/.local/lib/python3.7/site-packages/git/diff.py", line 488, in handle_diff_line\n'
b'    old_mode, new_mode, a_blob_id, b_blob_id, _change_type = meta.split(None, 4)\n'
b'ValueError: not enough values to unpack (expected 5, got 4)\n'
b'\n'
b'The above exception was the direct cause of the following exception:\n'
b'\n'
b'Traceback (most recent call last):\n'
b'  File "/usr/local/lib/python3.7/threading.py", line 926, in _bootstrap_inner\n'
b'    self.run()\n'
b'  File "/usr/local/lib/python3.7/threading.py", line 870, in run\n'
b'    self._target(*self._args, **self._kwargs)\n'
b'  File "/home/mypackage/.local/lib/python3.7/site-packages/git/cmd.py", line 86, in pump_stream\n'
b"    raise CommandError(['<%s-pump>' % name] + cmdline, ex) from ex\n"
b"git.exc.CommandError: Cmd('<stdout-pump>') failed due to: ValueError('not enough values to unpack (expected 5, got 4)')\n"
b'  cmdline: <stdout-pump> git diff-tree 8fd339d63f824428d215be363817d637ffae3430 ca2fecd4f638c81af31ef7e1a22d48b68a55d746 -r --abbrev=40 --full-index -M --raw -z --no-color\n'
b'\n'
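
To make the ValueError in the traceback concrete, here is a minimal sketch that reproduces the same message in isolation. It is not GitPython's code: `parse_meta` and both sample strings are hypothetical, and the function only mimics the five-way unpack on the failing line. It assumes the parser was handed a fragment that is not a real raw-diff header (for example, a piece of a file name), which is a guess rather than something confirmed here.

```python
def parse_meta(meta: str):
    """Mimic the five-way unpack shown on the failing line of the traceback."""
    old_mode, new_mode, a_blob_id, b_blob_id, change_type = meta.split(None, 4)
    return old_mode, new_mode, a_blob_id, b_blob_id, change_type


# A well-formed raw header with the leading ':' already stripped
# (the blob ids are placeholders, not real hashes).
good = "100644 100644 " + "a" * 40 + " " + "b" * 40 + " M"
print(parse_meta(good)[4])

# Hypothetical fragment with only four whitespace-separated tokens,
# e.g. part of a file name that was mistaken for a header.
bad = "Q1 2020 final report.xlsx"
try:
    parse_meta(bad)
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

Any input with exactly four whitespace-separated tokens produces the "expected 5, got 4" wording, so the message by itself does not say which file triggered it.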

Near that line of code I see https://github.com/gitpython-developers/GitPython/blob/main/git/diff.py#L498

        # handles
        # :100644 100644 687099101... 37c5e30c8... M    .gitignore

I do not know how to get this output in my git terminal. I was going to try to build a test case to see whether this looks different for binary files (basically create a PDF or an Excel file, edit it, then commit it), but reading the log doesn't show such a format.
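
For reference, the `:100644 100644 ... M    .gitignore` layout is what `git diff-tree --raw` prints, and the exact command is in the log above. Below is a rough sketch for capturing that output from Python and splitting it into records. The helpers `raw_diff` and `parse_raw_z` are hypothetical names, not part of GitPython, and the sample bytes are made up. The parser assumes the `-z` layout in which each header is followed by one NUL-terminated path, or two for renames and copies.

```python
import subprocess


def parse_raw_z(output: bytes):
    """Split `git diff-tree --raw -z` output into (status, paths) records."""
    tokens = output.split(b"\0")
    records = []
    i = 0
    while i < len(tokens) and tokens[i]:
        # Header looks like b":100644 100644 <sha> <sha> M"
        status = tokens[i].split()[-1].decode()
        # Renames and copies (R/C) carry two paths, everything else one.
        n_paths = 2 if status[0] in "RC" else 1
        paths = [p.decode(errors="replace") for p in tokens[i + 1 : i + 1 + n_paths]]
        records.append((status, paths))
        i += 1 + n_paths
    return records


def raw_diff(repo_dir: str, old: str, new: str) -> bytes:
    """Run the same command that appears in the log (requires git on PATH)."""
    cmd = ["git", "diff-tree", old, new, "-r", "--abbrev=40",
           "--full-index", "-M", "--raw", "-z", "--no-color"]
    return subprocess.run(cmd, cwd=repo_dir, check=True, capture_output=True).stdout


# Made-up sample: one modified file whose name contains a colon and spaces.
sample = (b":100644 100644 " + b"a" * 40 + b" " + b"b" * 40
          + b" M\0data: Q1 2020 final report.xlsx\0")
records = parse_raw_z(sample)
print(records)
```

Calling `raw_diff(".", "HEAD~1", "HEAD")` inside the affected repo and printing the bytes should show the record that the parser chokes on. Consuming paths by count, rather than searching for the next `:`, keeps file names that contain colons intact.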

We do not pin the version of GitPython, and we rebuild this Docker container fairly often, so this should be running the latest version on PyPI. (Line 488 on your master branch does not correspond to that line of code; however, you've had commits to master since the PyPI release.)
