feat: support for several slices of the same package #134
niemeyer merged 25 commits into canonical:main
In slicer.go, the slice paths are grouped by package name and then, for
each package, the overall contents are extracted together. While this is
fine, the existing code makes it appear as if all of those paths come
from a single slice. Later, when adding the paths to the report, the
check that each path truly comes from that slice fails for the paths
that belong to a different slice. Hence, those paths do not get added to
the report.
For example, in the following slice definition, if the "manifest" slice
appears first in the sorted order, the path "/dir/file" does not get
added to the report. Although it is extracted, it is not coming from
"manifest", so the check fails and it is not added to the report later.
package: test-package
slices:
    myslice:
        contents:
            /dir/file:
            /dir/file-copy: {copy: /dir/file}
            /other-dir/file: {symlink: ../dir/file}
            /dir/text-file: {text: data1}
            /dir/foo/bar/: {make: true, mode: 01777}
    manifest:
        contents:
            /var/lib/foo/**: {generate: manifest}
This commit fixes the bug.
Globs that overlap with another glob or a file in the same package are now supported because they are extracting the same content, so there is no ambiguity.
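The attribution problem described above can be illustrated with a minimal sketch. It is not the code in slicer.go; `slicePaths` and `pathOwners` are hypothetical names introduced only for illustration. The idea is to remember which slice lists each path, instead of treating every path extracted for a package as belonging to one slice.

```go
package main

import (
	"fmt"
	"sort"
)

// slicePaths is a hypothetical, simplified stand-in for a slice
// definition: a slice name plus the content paths it lists.
type slicePaths struct {
	Name  string
	Paths []string
}

// pathOwners maps every path to the names of all slices of the package
// that list it, so no path is attributed to the wrong slice.
func pathOwners(slices []slicePaths) map[string][]string {
	owners := make(map[string][]string)
	for _, s := range slices {
		for _, p := range s.Paths {
			owners[p] = append(owners[p], s.Name)
		}
	}
	// Sort the slice names so the result does not depend on input order.
	for _, names := range owners {
		sort.Strings(names)
	}
	return owners
}

func main() {
	owners := pathOwners([]slicePaths{
		{Name: "manifest", Paths: []string{"/var/lib/foo/**"}},
		{Name: "myslice", Paths: []string{"/dir/file", "/dir/text-file"}},
	})
	// "/dir/file" is attributed to "myslice" even though "manifest"
	// comes first in the input.
	fmt.Println(owners["/dir/file"])
}
```

With a mapping like this, a report check of the form "does this path come from this slice" can be answered per path rather than per package.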
3df80ea to b701ccd
internal/deb/extract_test.go
Outdated
		"/dir/several/": "dir 0755",
	},
	notCreated: []string{"/dir/"},
	// TODO discuss in PR about this. I think this is something we want.
[Comment for reviewer]: The change here is that when we have a glob like /dir/** we used to take the permissions from the tarball only for the subfolders matched by ** whereas now we also use the tarball permissions for /dir/ itself.
We need to consider the overall logic to make a decision about this.
What happens in the following cases, right now, and in the suggested change? Is it consistent?
- /foo/bar
- /foo/b*
- /foo/*
- /foo/b**
- /foo/**
Yes, it is consistent. Basically we were doing (line 185 of the old extract.go):
    globPath, ok := shouldExtract(sourcePath)
    if !ok {
        continue
    }
    // then: record parent directory permissions.
And now we are recording the directory permissions before checking whether we have to extract the path. This is the way to go because we want to preserve permissions as much as possible.
Incidentally, the previous code was working because shouldExtract used to check the prefix, so it would return true for extraction. This was confusing because it was only checking the prefix of non-glob paths. Now it is much more consistent because we store all the parent folder permissions and then we create them all, regardless of whether it is a glob or a regular path.
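A rough sketch of the ordering described in this reply, under loud assumptions: `entry`, `collectDirModes`, and `parentDirs` are hypothetical names and do not mirror extract.go. It only shows the idea of recording every directory's mode from the tarball first, and then looking up the parents of whatever path ends up being extracted.

```go
package main

import (
	"fmt"
	"io/fs"
	"strings"
)

// entry is a hypothetical stand-in for a tarball entry.
type entry struct {
	Path string
	Mode fs.FileMode
}

// dir and file are small helpers to build example entries.
func dir(path string, perm fs.FileMode) entry  { return entry{path, fs.ModeDir | perm} }
func file(path string, perm fs.FileMode) entry { return entry{path, perm} }

// collectDirModes records the mode of every directory entry, before any
// decision is made about which paths will be extracted.
func collectDirModes(entries []entry) map[string]fs.FileMode {
	modes := make(map[string]fs.FileMode)
	for _, e := range entries {
		if e.Mode.IsDir() {
			modes[e.Path] = e.Mode
		}
	}
	return modes
}

// parentDirs lists the parent directories of path, outermost first,
// each with a trailing slash. The root "/" itself is not included.
func parentDirs(path string) []string {
	var parents []string
	trimmed := strings.TrimSuffix(path, "/")
	for i := 1; i < len(trimmed); i++ {
		if trimmed[i] == '/' {
			parents = append(parents, trimmed[:i+1])
		}
	}
	return parents
}

func main() {
	modes := collectDirModes([]entry{
		dir("/dir/", 0o755),
		dir("/dir/several/", 0o755),
		file("/dir/several/file", 0o644),
	})
	// Whether the path was requested directly or through a glob, its
	// parents get the permissions recorded from the tarball.
	for _, parent := range parentDirs("/dir/several/file") {
		fmt.Println(parent, modes[parent].Perm())
	}
}
```

Because the modes are collected independently of the extraction decision, the same lookup applies to /foo/bar, /foo/b*, and /foo/** alike in this sketch.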
niemeyer left a comment
This is looking good, Alberto, thank you.
I have quite a few comments, but it's all on the polishing side rather than fundamental disagreements.
internal/slicer/slicer.go
Outdated
	if pathInfo.Until == setup.UntilMutate && until == unset {
		until = mutate
	} else if pathInfo.Until == setup.UntilNone {
		until = keep
We can simplify all of that logic by initializing before the loop with:
    until := setup.UntilMutate
and here just doing:
    if pathInfo.Until < until {
        until = pathInfo.Until
    }
That is, we initialize with the soonest the path may be removed, and go all the way up to it not being removed. So it's not only simpler, but this logic will also mostly survive if we introduce new stages in which files are removed. We just need to be careful to introduce the enums in order of latest survival first. UntilNone is really "keep", per the suggested enum. We might even rename it to make that more clear.
With this, the local type, etc, can go away as well.
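As a self-contained sketch of that suggestion, assuming a hypothetical integer enum declared in order of latest survival first (the real setup.Until* values may be typed differently, so `until`, `untilNone`, `untilMutate`, and `aggregate` are illustrative names only):

```go
package main

import "fmt"

// until is a hypothetical integer enum. Values are declared in order of
// latest survival first, so a smaller value means the path lives longer.
type until int

const (
	untilNone   until = iota // never removed ("keep")
	untilMutate              // removed after the mutation scripts run
)

// aggregate starts from the soonest possible removal and moves toward
// "keep" whenever any slice asks for the path to survive longer.
func aggregate(values []until) until {
	result := untilMutate
	for _, v := range values {
		if v < result {
			result = v
		}
	}
	return result
}

func main() {
	// One slice wants the path only until mutate, another wants to keep it.
	fmt.Println(aggregate([]until{untilMutate, untilNone}) == untilNone)
}
```

The comparison only works if the enum ordering is maintained, which is the care point raised in the comment above.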
I have made the changes to use setup.Until*, to remove the local type and to make the checks as simple as possible. However, if I am understanding your suggestion correctly, using < to compare setup.Until* seems a little bit brittle because we would be comparing strings present in the YAML in lexicographic order, meaning when we introduce a new value for until: we would have to take that into account.
I have instead kept the manual check but I am happy to discuss it further.
Lastly, when we do the refactor in #134 (comment) we can deduplicate the logic and create a wrapper type and/or a helper function. But for this PR, I am keeping it like this.
internal/slicer/slicer.go
Outdated
	if o.Mode.IsDir() {
		relPath = relPath + "/"
	}
	listed := false
s/listed/sliceContents/, assuming that's what it means. If it's not, let's please have a variable name that we can remember when we come back here in a few months.
listed is meant to represent whether the extractInfo is listed explicitly in the contents of some slice (for example, copyright has an extractInfo entry but it is not listed anywhere). To reflect that, I have changed the name in 3a548f6 to inSliceContents. Is that clear enough?
	}
	if listed {
		untilPaths[relPath] = until
		addKnownPath(relPath)
Looking at this makes me think that we don't need both of these constructs. Every path that is known is inside the target and might be read. The fact that it might be read is also the reason why we keep it around for some time, and that time is determined by the until value. Thus, it makes sense to have both of them together, perhaps even extracting them into a small type so that some of the logic in this long function is taken out. If so, we should also move the aggregation between two until values into that type, instead of having that logic duplicated in multiple places.
As discussed offline, we will do the refactor but not as part of this PR which is already very long and which you have already spent quite a bit of time reviewing. We will do the necessary changes to polish the code here and keep the refactors for the next PR.
niemeyer left a comment
Thanks for the changes.
-	} else if pathInfo.Until == setup.UntilNone {
-		until = keep
+	if pathInfo.Until == setup.UntilNone {
+		until = setup.UntilNone
For the record, this looks a bit different from the suggestion. Not a big deal for now as the result is the same, but this version won't work once we have more than two options for this value.
Due to several bugs and limitations in the code, not all cases of selecting several slices of the same package worked. Specifically, we were not reporting the second slice correctly, which meant that the scripts could not select those paths. Additionally, there were several more problems when the slices had overlapping globs (which are valid within the same package). Mainly, the whole until calculation had to be redone, and the extractor had to be refactored to group by destination path, where before it was assuming there was only one.