
Impossible to clone to path with unicode #920

Closed
@mikicz

Description

Hi, while integrating GitPython I ran into an issue with cloning into directories that have a Unicode name. This was an issue with version 2.1.9 and has not been fixed by upgrading to 3.0.2.

I am using Python 3.7.

Basically, if you pass a str containing non-ASCII characters to e.g. Repo.clone_from, the package throws a UnicodeEncodeError, apparently while processing the output of the git command.

    return Repo.clone_from(repo, repo_path, branch=branch, **kwargs)
venv/lib64/python3.7/site-packages/git/repo/base.py:1023: in clone_from
    return cls._clone(git, url, to_path, GitCmdObjectDB, progress, multi_options, **kwargs)
venv/lib64/python3.7/site-packages/git/repo/base.py:969: in _clone
    finalize_process(proc, stderr=stderr)
venv/lib64/python3.7/site-packages/git/util.py:333: in finalize_process
    proc.wait(**kwargs)
venv/lib64/python3.7/site-packages/git/cmd.py:399: in wait
    stderr = force_bytes(stderr)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

data = "Cloning into '/tmp/arca/test/abčď/repos/_tmp_tmp1f9n_j71299f859f6fac750147aec65a9992f8e289ef42177ff1b234677764b2d5c61560/master'...\n", encoding = 'ascii'

    def force_bytes(data, encoding="ascii"):
        if isinstance(data, bytes):
            return data
    
        if isinstance(data, string_types):
>           return data.encode(encoding)
E           UnicodeEncodeError: 'ascii' codec can't encode characters in position 31-32: ordinal not in range(128)

Since GitPython is Python 3+ only now, it would make sense to set the default encoding in force_bytes to utf-8; that actually fixes the issue when I try it. This was proposed in gitpython-developers/gitdb#48 and/or gitpython-developers/gitdb#49, but it's been a while since those were proposed.
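
To illustrate the proposed change, here is a minimal standalone sketch, not the actual GitPython or gitdb patch. It mirrors the force_bytes shown in the traceback, with two assumptions: string_types is replaced by str (its Python 3 equivalent), and the path in the message is a hypothetical shortened version of the one above.

```python
def force_bytes(data, encoding="utf-8"):
    """Sketch of force_bytes with a UTF-8 default (assumed change, not the real patch)."""
    if isinstance(data, bytes):
        return data

    # string_types from the original is assumed to be plain str on Python 3.
    if isinstance(data, str):
        return data.encode(encoding)

    return data


# Hypothetical shortened version of the git progress message from the traceback.
message = "Cloning into '/tmp/arca/test/abčď/repos/master'...\n"

# With the UTF-8 default, the non-ASCII path encodes without error.
encoded = force_bytes(message)

# Passing encoding="ascii" explicitly reproduces the old failure.
try:
    force_bytes(message, encoding="ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
```

Callers that pass an encoding explicitly keep their current behaviour; only callers relying on the default would see a difference.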

Maybe another solution would be to let the user select the default encoding somehow, so as not to break existing use cases while still providing a fix for this issue?
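
One way this could look is sketched below. The names _default_encoding and set_default_encoding are hypothetical and do not exist in GitPython; the idea is only that the default stays ascii unless the user opts in to something else.

```python
import codecs

# Hypothetical module-level setting; ascii is kept as the default for compatibility.
_default_encoding = "ascii"


def set_default_encoding(encoding):
    """Hypothetical helper: change the encoding force_bytes uses by default."""
    global _default_encoding
    codecs.lookup(encoding)  # raises LookupError for an unknown codec name
    _default_encoding = encoding


def force_bytes(data, encoding=None):
    """Sketch of force_bytes that falls back to the user-selected default."""
    if isinstance(data, bytes):
        return data

    if isinstance(data, str):
        return data.encode(encoding or _default_encoding)

    return data


# Without opting in, the behaviour is unchanged and non-ASCII input still fails.
try:
    force_bytes("abčď")
    failed_before = False
except UnicodeEncodeError:
    failed_before = True

# After opting in to UTF-8, the same input encodes successfully.
set_default_encoding("utf-8")
result_after = force_bytes("abčď")
```

A process-wide setting like this is simple but global; a per-call or per-Repo option would be a more contained alternative.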

This is related to #761, which seems to be stale at the moment. I'm raising the issue again since it's still a problem in the new version of GitPython, which is Python 3+ only.
