Conversation

@dbczumar
Collaborator

@dbczumar dbczumar commented Dec 20, 2018

When attempting to download a directory artifact from an Azure-based artifact repository, paths are truncated if the repository's artifact URI is the URI of the Azure blob container root.

In this case, the parsed Azure artifact_path value is /, which has a length of 1. When we attempt to slice an absolute directory path from the index len(artifact_path) + 1, we slice from index 2. This erroneously removes the first character of the relative path.

os.path.relpath was designed to handle these kinds of edge cases. This PR modifies the Azure artifact store to use os.path.relpath instead.
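The truncation described above can be reproduced in isolation. The following is only a sketch: the helper names and the sample paths are hypothetical, and it uses posixpath.relpath rather than os.path.relpath so that the result is the same on every platform.

```python
import posixpath


def slice_relative(name, artifact_path):
    # Old approach: skip the artifact path plus one separator character.
    return name[len(artifact_path) + 1:]


def relpath_relative(name, artifact_path):
    # New approach: let relpath work out the relative portion.
    return posixpath.relpath(name, start=artifact_path)


# Non-root artifact path: slicing happens to work.
nested = slice_relative("/root/subdir", "/root")

# Container root: artifact_path is "/", so slicing starts at index 2
# and drops the first character of the relative path.
truncated = slice_relative("/subdir", "/")

# relpath handles the root case without losing a character.
fixed = relpath_relative("/subdir", "/")

print(nested, truncated, fixed)  # subdir ubdir subdir
```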

@dbczumar dbczumar requested a review from mparkhe December 20, 2018 23:24
 for r in results:
     if isinstance(r, BlobPrefix):  # This is a prefix for items in a subdirectory
-        subdir = r.name[len(artifact_path)+1:]
+        subdir = os.path.relpath(path=r.name, start=artifact_path)
Collaborator

Is the stored r.name always a full path starting after blob.core.windows.net/? If not, os.path.relpath gives wrong answers.

>>> os.path.relpath("a/b/c/", "p/q/r")
'../../../a/b/c'

Even this one:

>>> os.path.relpath("a/b/c/", "b/c/p/q/r")
'../../../../../a/b/c'

Let's assert in this code that it does. Also add a unit test for this case, so that we know if this breaks in the future.

Collaborator Author

@dbczumar dbczumar Dec 21, 2018

r.name is always a path containing the prefix specified by the prefix parameter of self.client.list_blobs(). In this case, we pass a prefix value of prefix = posixpath.join(artifact_path, <subdirectory_path>).

We then take the relative path with start=artifact_path. Because prefix.startswith(artifact_path), this should work in all documented cases. For reference, see the documentation for the prefix parameter of the Azure Storage list_blobs Python API method.

I've added a check to verify that the listed blobs have names beginning with the specified prefix. Let's discuss the structure of the additional unit test!
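As a rough illustration of the kind of check described here (not the code that was merged), a guard could look like the sketch below. The function name relative_blob_path, the use of ValueError, and the sample blob names are assumptions made for the example.

```python
import posixpath


def relative_blob_path(blob_name, artifact_path):
    """Return blob_name relative to artifact_path, refusing unexpected names."""
    # Hypothetical guard: relpath only gives a sensible answer when the
    # listed name really lives under the artifact path.
    if not blob_name.startswith(artifact_path):
        raise ValueError(
            "Blob name %r does not begin with the artifact path %r"
            % (blob_name, artifact_path)
        )
    return posixpath.relpath(path=blob_name, start=artifact_path)


# A directory prefix under the artifact path; the trailing slash is normalized away.
subdir = relative_blob_path("artifacts/run1/model/", "artifacts/run1")

# A name outside the artifact path is rejected instead of yielding a "../" path.
try:
    relative_blob_path("other/model/", "artifacts/run1")
    rejected = False
except ValueError:
    rejected = True
```

Note that startswith compares raw strings, so a sibling such as "artifacts/run10" would pass this sketch's guard for the artifact path "artifacts/run1"; a stricter variant would compare whole path components.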

Collaborator Author

Added a test case to ensure that the new exception is raised when listed blobs do not contain the artifact path prefix!

@dbczumar
Collaborator Author

@mparkhe Addressed your comments. Thanks!

Collaborator

@mparkhe mparkhe left a comment

LGTM! Thanks for adding all these unit tests.

-        subdir = r.name[len(artifact_path)+1:]
+        # The separator must stay fixed as '/' because of the Azure Blob Storage naming pattern.
+        # Do not change to os.path.relpath, because on Windows the path separator is '\'.
+        subdir = posixpath.relpath(path=r.name, start=artifact_path)
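The separator concern in the comment above can be seen without a Windows machine, because the standard library exposes the Windows flavor of os.path as ntpath on every platform. This is only a sketch with made-up blob names; on Windows, os.path is ntpath, which is why os.path.relpath would return backslash-separated results there.

```python
import ntpath
import posixpath

# Hypothetical blob name and artifact path, written with Azure's '/' separator.
blob_name = "artifacts/models/sklearn"
artifact_path = "artifacts"

# What os.path.relpath would produce on Windows (os.path is ntpath there).
windows_style = ntpath.relpath(blob_name, start=artifact_path)

# posixpath.relpath keeps '/' regardless of the host operating system.
posix_style = posixpath.relpath(blob_name, start=artifact_path)

print(repr(windows_style))  # 'models\\sklearn'
print(repr(posix_style))    # 'models/sklearn'
```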
Collaborator

The posixpath.relpath wrangling is common to both the if and else paths, right? Can that be shared? You can take care of that later; I won't block you on it.

Collaborator Author

Yep! Good catch - thanks!
