feat: support for several slices of the same package #134

Merged
niemeyer merged 25 commits into canonical:main from letFunny:bug-fix-several-slices-of-same-package
Jun 3, 2024

@letFunny commented May 3, 2024

Due to several bugs and several limitations in the code, not all cases of selecting several slices of the same package worked. Specifically, we were not reporting the second slice correctly, which meant that the scripts could not select those paths. Additionally, there were several more problems when the slices had overlapping globs (which are valid in the same package). Mainly, the whole until calculation had to be redone, and the extractor had to be refactored to group by destination path, where before it was assuming there was only one.

  • Have you signed the CLA?

In slicer.go, the slice paths are grouped by package name and then, for
each package, the overall contents are extracted together. While this is
fine, the existing code makes it appear as if all of those paths come
from a single slice. Later, when adding the paths to the report, the
check that each path truly comes from that slice fails, so those paths
do not get added to the report.

For example, in the following slice definition, if the "manifest" slice
appears first in the sorted order, the path "/dir/file" does not get
added to the report. Although it is extracted, it does not come from
"manifest", so the check fails and it is not added to the report later.

    package: test-package
    slices:
        myslice:
            contents:
                /dir/file:
                /dir/file-copy:  {copy: /dir/file}
                /other-dir/file: {symlink: ../dir/file}
                /dir/text-file:  {text: data1}
                /dir/foo/bar/:   {make: true, mode: 01777}
        manifest:
            contents:
                /var/lib/foo/**: {generate: manifest}

This commit fixes the bug.
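
For illustration only, the attribution problem described in this commit message can be sketched as follows. This is not the code from the PR: the `slice` type, the `reportOwners` function, and the `/var/lib/foo/manifest` path are hypothetical, and globs are left out so that each slice lists exact paths. The idea is that every extracted path is attributed to the slices that actually list it, instead of to whichever slice of the package happens to sort first.

```go
package main

import (
	"fmt"
	"sort"
)

// slice is a simplified, hypothetical stand-in for a slice definition:
// a name plus the exact paths listed in its contents.
type slice struct {
	name  string
	paths []string
}

// reportOwners maps every extracted path to the slices of the package that
// list it. Paths that no slice lists are left out of the result.
func reportOwners(slices []slice, extracted []string) map[string][]string {
	listedBy := make(map[string][]string)
	for _, s := range slices {
		for _, p := range s.paths {
			listedBy[p] = append(listedBy[p], s.name)
		}
	}
	owners := make(map[string][]string)
	for _, p := range extracted {
		if names, ok := listedBy[p]; ok {
			sort.Strings(names)
			owners[p] = names
		}
	}
	return owners
}

func main() {
	slices := []slice{
		{name: "manifest", paths: []string{"/var/lib/foo/manifest"}},
		{name: "myslice", paths: []string{"/dir/file", "/dir/text-file"}},
	}
	extracted := []string{"/dir/file", "/dir/text-file"}
	for p, names := range reportOwners(slices, extracted) {
		fmt.Println(p, names)
	}
}
```

With this grouping, "/dir/file" is reported under "myslice" even though "manifest" sorts first.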
@letFunny changed the title from "bugfix: multiple slices of same package produces faulty report" to "bugfix: scripts cannot access paths from packages with more than one slice" May 3, 2024
letFunny added 3 commits May 8, 2024 10:16
Globs that overlap with another glob or a file in the same package are
now supported because they are extracting the same content, so there is
no ambiguity.
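
A rough sketch of that rule, under stated assumptions: the `entry` type and the `overlaps` and `conflicts` functions are hypothetical, and `overlaps` only understands a trailing `**`, which is far simpler than real glob matching. The point it illustrates is that an overlap is only treated as a conflict when the two entries come from different packages.

```go
package main

import (
	"fmt"
	"strings"
)

// entry is a hypothetical, simplified content entry: the package that
// provides it and its path, which may end in a "**" glob.
type entry struct {
	pkg  string
	path string
}

// overlaps is a deliberately simplified overlap test: it only understands
// a trailing "**" and treats everything else as a literal path.
func overlaps(a, b string) bool {
	if a == b {
		return true
	}
	if strings.HasSuffix(a, "**") && strings.HasPrefix(b, strings.TrimSuffix(a, "**")) {
		return true
	}
	if strings.HasSuffix(b, "**") && strings.HasPrefix(a, strings.TrimSuffix(b, "**")) {
		return true
	}
	return false
}

// conflicts returns the overlapping pairs that come from different packages.
// Overlaps within one package are skipped: both entries extract the same
// content from the same archive, so there is nothing to disambiguate.
func conflicts(entries []entry) [][2]entry {
	var out [][2]entry
	for i := 0; i < len(entries); i++ {
		for j := i + 1; j < len(entries); j++ {
			a, b := entries[i], entries[j]
			if a.pkg != b.pkg && overlaps(a.path, b.path) {
				out = append(out, [2]entry{a, b})
			}
		}
	}
	return out
}

func main() {
	entries := []entry{
		{pkg: "test-package", path: "/dir/**"},
		{pkg: "test-package", path: "/dir/file"},
		{pkg: "other-package", path: "/dir/file"},
	}
	for _, c := range conflicts(entries) {
		fmt.Printf("%s (%s) conflicts with %s (%s)\n", c[0].path, c[0].pkg, c[1].path, c[1].pkg)
	}
}
```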
@letFunny force-pushed the bug-fix-several-slices-of-same-package branch from 3df80ea to b701ccd May 9, 2024 08:04
@letFunny changed the title from "bugfix: scripts cannot access paths from packages with more than one slice" to "feat: support for several slices of the same package" May 21, 2024
@letFunny added the Priority Look at me first label May 21, 2024
@letFunny requested a review from niemeyer May 23, 2024 13:26
"/dir/several/": "dir 0755",
},
notCreated: []string{"/dir/"},
// TODO discuss in PR about this. I think this is something we want.
Collaborator Author

[Comment for reviewer]: The change here is that when we have a glob like /dir/** we used to take the permissions from the tarball only for the subfolders matched by ** whereas now we also use the tarball permissions for /dir/ itself.

Contributor

We need to consider the overall logic to make a decision about this.

What happens in the following cases, right now, and in the suggested change? Is it consistent?

  • /foo/bar
  • /foo/b*
  • /foo/*
  • /foo/b**
  • /foo/**

Collaborator Author

Yes, it is consistent. Basically we were doing (line 185 old extract.go):

globPath, ok := shouldExtract(sourcePath)
if !ok {
	continue
}
// then: record parent directory permissions.

and now we are recording the directory permissions before checking whether we have to extract the path. This is the way to go because we want to preserve permissions as much as possible.

Incidentally, the previous code was working because shouldExtract used to check the prefix, so it would return true for extraction. This was confusing because it was only checking the prefix of non-glob paths. Now it is much more consistent because we store all the parent folder permissions and then we create them all regardless of whether it is a glob or a regular path.

Contributor

@niemeyer left a comment

This is looking good, Alberto, thank you.

I have quite a few comments, but it's all on the polishing side rather than fundamental disagreements.

if pathInfo.Until == setup.UntilMutate && until == unset {
	until = mutate
} else if pathInfo.Until == setup.UntilNone {
	until = keep
Contributor

We can simplify all of that logic by initializing before the loop with:

until := setup.UntilMutate

and here just doing:

if pathInfo.Until < until {
        until = pathInfo.Until
}

That is, we initialize with the soonest the path may be removed, and go all the way up to it not being removed. So it's not only simpler, but this logic will also mostly survive if we introduce new stages in which files are removed. We just need to be careful to introduce the enums in order of latest survival first. UntilNone is really "keep", per the suggested enum. We might even rename it to make that more clear.

With this, the local type, etc, can go away as well.
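
A self-contained sketch of this suggestion, assuming an integer enum declared in order of latest survival first. The `until` type, its constants, and `aggregate` are hypothetical names for this example and are not the types in the setup package.

```go
package main

import "fmt"

// until is a hypothetical integer enum. Values are declared in order of
// latest survival first, so a smaller value always means "kept for longer".
type until int

const (
	untilNone   until = iota // never removed ("keep")
	untilMutate              // removed at the mutate stage
)

// aggregate returns how long a path must survive when several slices list it.
// It starts from the soonest possible removal and moves towards "keep".
func aggregate(values []until) until {
	result := untilMutate
	for _, v := range values {
		if v < result {
			result = v
		}
	}
	return result
}

func main() {
	fmt.Println(aggregate([]until{untilMutate, untilMutate}) == untilMutate)
	fmt.Println(aggregate([]until{untilMutate, untilNone}) == untilNone)
}
```

If a new stage were added, it would be appended after `untilMutate` and the initial value of `result` would become the new last constant; the comparison itself would not change.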

Collaborator Author

@letFunny May 28, 2024

I have made the changes to use setup.Until*, to remove the local type and to make the checks as simple as possible. However, if I am understanding your suggestion correctly, using < to compare setup.Until* seems a little bit brittle because we would be comparing strings present in the YAML in lexicographic order, meaning when we introduce a new value for until: we would have to take that into account.

I have instead kept the manual check but I am happy to discuss it further.

Lastly, when we do the refactor in #134 (comment) we can deduplicate the logic and create a wrapper type and/or a helper function. But for this PR, I am keeping it like this.
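
The concern about lexicographic comparison can be shown with a small sketch. Everything here is an assumption made for illustration: the `pathUntil` type, the concrete string values, and in particular `untilBuild`, which is an invented extra value that does not exist in the project. An explicit rank table is one way to keep the ordering independent of how the strings happen to sort.

```go
package main

import "fmt"

// pathUntil mirrors the idea of a string-valued setting read from YAML.
// The concrete values are assumptions made for this sketch.
type pathUntil string

const (
	untilNone   pathUntil = ""
	untilMutate pathUntil = "mutate"
	// untilBuild is hypothetical and exists only to show the problem.
	untilBuild pathUntil = "build"
)

// rank makes the ordering explicit instead of relying on string comparison.
// A lower rank means the path is kept for longer.
var rank = map[pathUntil]int{
	untilNone:   0,
	untilMutate: 1,
	untilBuild:  2,
}

// longest returns whichever value keeps the path around for longer.
func longest(a, b pathUntil) pathUntil {
	if rank[a] <= rank[b] {
		return a
	}
	return b
}

func main() {
	// As strings, "build" sorts before "mutate", which contradicts the
	// order assumed in the rank table above.
	fmt.Println(untilBuild < untilMutate)
	// The explicit rank gives the intended answer.
	fmt.Println(longest(untilBuild, untilMutate))
}
```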

if o.Mode.IsDir() {
	relPath = relPath + "/"
}
listed := false
Contributor

s/listed/sliceContents/, assuming that's what it means. If it's not, let's please have a variable name that we can remember when we come back here in a few months.

Collaborator Author

listed is meant to represent whether the extractInfo is listed explicitly in the contents of some slice (for example copyright has an extractInfo entry but it is not listed anywhere). To reflect that I have changed the name in 3a548f6 to be inSliceContents, is that clear enough?

}
if listed {
	untilPaths[relPath] = until
	addKnownPath(relPath)
Contributor

Looking at this makes me think that we don't need both of these constructs. Every path that is known is inside the target and might be read. The fact that it might be read is also the reason why we keep it around for some time, and that time is determined by the until value. Thus, it makes sense to have both of them together, perhaps even extracting them into a small type so that some of the logic in this long function is taken out. If so, we should also move the aggregation between two until values to the function itself, instead of having that logic duplicated in multiple places.
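
One possible shape for such a type, as a sketch only. The `knownPaths` type, its methods, and the integer `until` enum are hypothetical and are not what the project ended up with. The aggregation of two until values lives in `Add`, so callers never repeat it.

```go
package main

import "fmt"

// until is a hypothetical integer enum, ordered by latest survival first.
type until int

const (
	untilNone   until = iota // never removed ("keep")
	untilMutate              // removed at the mutate stage
)

// knownPaths is a hypothetical type that tracks, for every path known to be
// inside the target, how long that path must be kept.
type knownPaths struct {
	until map[string]until
}

func newKnownPaths() *knownPaths {
	return &knownPaths{until: make(map[string]until)}
}

// Add records path and merges u with any value recorded earlier, keeping
// whichever lets the path survive longer.
func (k *knownPaths) Add(path string, u until) {
	if prev, ok := k.until[path]; !ok || u < prev {
		k.until[path] = u
	}
}

// Known reports whether path has been recorded.
func (k *knownPaths) Known(path string) bool {
	_, ok := k.until[path]
	return ok
}

// Until returns the aggregated value for path, if it is known.
func (k *knownPaths) Until(path string) (until, bool) {
	u, ok := k.until[path]
	return u, ok
}

func main() {
	k := newKnownPaths()
	k.Add("/dir/file", untilMutate) // listed by one slice with until: mutate
	k.Add("/dir/file", untilNone)   // listed by another slice without until
	u, _ := k.Until("/dir/file")
	fmt.Println(k.Known("/dir/file"), u == untilNone)
}
```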

Collaborator Author

As discussed offline, we will do the refactor but not as part of this PR which is already very long and which you have already spent quite a bit of time reviewing. We will do the necessary changes to polish the code here and keep the refactors for the next PR.

Contributor

@niemeyer left a comment

Thanks for the changes.

- } else if pathInfo.Until == setup.UntilNone {
- 	until = keep
+ if pathInfo.Until == setup.UntilNone {
+ 	until = setup.UntilNone
Contributor

For the record, this looks a bit different from what was suggested. Not a big deal for now as the result is the same, but this version won't work once we have more than two options for this value.

@niemeyer merged commit a00ec09 into canonical:main Jun 3, 2024
@letFunny deleted the bug-fix-several-slices-of-same-package branch October 17, 2024 08:36