Azure artifact store: fix path resolution error when artifact root is container root #769
| for r in results: | ||
| if isinstance(r, BlobPrefix): # This is a prefix for items in a subdirectory | ||
| subdir = r.name[len(artifact_path)+1:] | ||
| subdir = os.path.relpath(path=r.name, start=artifact_path) |
Is the stored r.name always a full path starting after blob.core.windows.net/? If not, os.path.relpath gives wrong answers.
>>> os.path.relpath("a/b/c/", "p/q/r")
'../../../a/b/c'
even this one..
>>> os.path.relpath("a/b/c/", "b/c/p/q/r")
'../../../../../a/b/c'
Let's assert that it does in this code. Also add a unit test for this case. So if this is broken in future we know...
r.name is always a path containing the prefix specified by the prefix parameter of self.client.list_blobs(). In this case, we pass a prefix value of prefix = posixpath.join(artifact_path, <subdirectory_path>).
We then take the relative path with start=artifact_path. Because prefix.startswith(artifact_path), this should work in all documented cases. For reference, see the documentation for the prefix parameter of the azure storage list_blobs Python API method.
I've added a check to verify that the listed blobs have names beginning with the specified prefix. Let's discuss the structure of the additional unit test!
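A minimal sketch of the kind of prefix check described above. The function name `relative_blob_path` and the use of `ValueError` are assumptions for illustration; the exception type actually raised in the PR is not shown in this conversation.

```python
import posixpath


def relative_blob_path(blob_name, artifact_path):
    """Return blob_name relative to artifact_path, refusing names outside it."""
    # Hypothetical guard: ValueError stands in for the exception the PR raises.
    if not blob_name.startswith(artifact_path):
        raise ValueError(
            "Blob name %r does not begin with the artifact path %r"
            % (blob_name, artifact_path)
        )
    # Blob names always use '/' as the separator, so use posixpath.
    return posixpath.relpath(path=blob_name, start=artifact_path)


print(relative_blob_path("root/run1/model/", "root/run1"))
```

With the guard in place, inputs like the reviewer's `"a/b/c/"` against `"p/q/r"` fail loudly instead of producing a `../../../a/b/c` result.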
Added a test case to ensure that the new exception is raised when listed blobs do not contain the artifact path prefix!
@mparkhe Addressed your comments. Thanks!
mparkhe
left a comment
LGTM! Thanks for adding all these unit tests.
| subdir = r.name[len(artifact_path)+1:] | ||
| # Separator needs to be fixed as '/' because of azure blob storage pattern. | ||
| # Do not change to os.relpath because in Windows system path separator is '\' | ||
| subdir = posixpath.relpath(path=r.name, start=artifact_path) |
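To illustrate the code comment above: on Windows, `os.path` is the `ntpath` module, so its `relpath` joins components with `\`, while `posixpath.relpath` always uses `/`. The sketch below calls both modules explicitly so the difference shows up on any platform; the sample paths are hypothetical.

```python
import ntpath
import posixpath

blob_name = "artifacts/models/weights"  # hypothetical blob name
artifact_path = "artifacts"             # hypothetical artifact root

# What os.path.relpath would produce on Windows (os.path is ntpath there).
windows_style = ntpath.relpath(blob_name, artifact_path)

# What posixpath.relpath produces on every platform.
posix_style = posixpath.relpath(blob_name, artifact_path)

print(windows_style)
print(posix_style)
```

Only the second form matches how Azure blob names are written, which is presumably why the change pins the separator with `posixpath`.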
The posixpath.relpath wrangling is common to both the if ... else paths, right? Can that be shared? You can take care of that later on... won't block you.
Yep! Good catch - thanks!
When attempting to download a directory artifact from an Azure-based artifact repository, paths are truncated if the repository's artifact URI is the URI of the Azure blob container root.
In this case, the parsed Azure artifact_path value is /, which has a length of 1. When we attempt to slice an absolute directory path from the index len(artifact_path) + 1, we slice from index 2. This erroneously removes the first character of the relative path.
os.path.relpath was designed to handle these kinds of edge cases. This PR modifies the Azure artifact store to use os.path.relpath instead.
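The truncation described above can be reproduced in a few lines. This is a sketch only: the helper names and the sample paths are hypothetical, and it uses `posixpath.relpath`, the variant the reviewed diff settles on, so that the result does not depend on the host platform.

```python
import posixpath


def subdir_by_slicing(name, artifact_path):
    # Original approach: skip the artifact path plus one separator character.
    return name[len(artifact_path) + 1:]


def subdir_by_relpath(name, artifact_path):
    # Replacement approach: let relpath work out the relative part.
    return posixpath.relpath(path=name, start=artifact_path)


# Nested artifact root: both approaches agree.
print(subdir_by_slicing("/some/root/subdir", "/some/root"))
print(subdir_by_relpath("/some/root/subdir", "/some/root"))

# Container root: slicing starts at index 2 and drops the leading "s".
print(subdir_by_slicing("/subdir", "/"))
print(subdir_by_relpath("/subdir", "/"))
```

The slicing version assumes the artifact path is always followed by exactly one separator, which does not hold when the artifact path is itself the separator.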