feat: add SBOM authors flag #4378
Signed-off-by: Florian Fl Bauer <[email protected]>
flags.StringVarP(&cfg.Source.Supplier, "source-supplier", "",
	"the organization that supplied the component, which often may be the manufacturer, distributor, or repackager")

flags.StringArrayVarP(&cfg.Source.Authors, "authors", "",
Sorry for the delay getting back to you with some feedback; due to holidays and time off, it took a while to work out some of the details with the team.
I, personally, think this is part of a larger feature: specifying arbitrary SBOM data that Syft doesn't produce. But we've decided that we should probably defer any changes to the way the current --source-name and these additional flags work until Syft 2.0. For now, adding purpose-built flags for these two new features makes the most sense.
As for this PR, there are a few things I think we should change, assuming I understand what's being implemented correctly. Am I correct in saying the author is intended to specify the author of the SBOM, rather than author of the source (the source being "thing that was scanned" -- container, directory, etc.)? I'm assuming so, in which case I think Henry's PR is closer to the end result and we should get them aligned by adding the Authors to the SBOM, rather than the source, and similarly use an appropriate flag: --sbom-author.
For both of these PRs I think we should add exported properties directly on the SBOM for Authors and the arbitrary Properties. I don't see any other spot where it makes sense to put them. And because these aren't really involved in cataloging, I don't think it's useful to add cataloging configuration to set these -- it's just unnecessary and possibly confusing indirection. Instead, they should be set directly on the SBOM during decoding by the format decoders, or by the CLI scan command. If we determine it's important to have this in the configuration that gets passed to CreateSBOM, it can be added later, but we can't remove it after we add it.
As far as the CLI goes, instead of piggybacking on the source config, let's introduce a new top-level sbomConfig (similar to the sourceConfig) where the authors and properties will be set to more accurately reflect how this data is used, and it should live alongside the cataloging config here (with the json/yaml/mapstructure key sbom, so syft config makes sense under sbom:) like:
// actor describes a single SBOM author entry.
type actor struct {
	Type  string `json:"type" yaml:"type" mapstructure:"type"`
	Name  string `json:"name" yaml:"name" mapstructure:"name"`
	Email string `json:"email" yaml:"email" mapstructure:"email"`
}

// sbomConfig holds SBOM-level metadata that is not produced by cataloging.
type sbomConfig struct {
	Authors    []actor           `json:"authors" yaml:"authors" mapstructure:"authors"`
	authors    []string          // used for CLI input
	Properties map[string]string `json:"properties" yaml:"properties" mapstructure:"properties"`
	properties []string          // used for CLI input
}

func (c *sbomConfig) PostLoad() error {
	// to support env vars e.g. SYFT_SBOM_AUTHOR_NAME, we need custom env var lookups see: https://github.com/anchore/syft/blob/main/cmd/syft/internal/options/registry.go#L35
	// or maybe SYFT_SBOM_AUTHORS using a flattened name=value list as described below
	// do the custom CLI parsing of strings here, we could accept JSON in addition to any more bespoke format
	return nil // placeholder until the parsing described above is implemented
}
By having this configuration structure, the yaml configuration will work as expected, so the last remaining bit is exactly how this gets specified for the flags. I've given a suggestion above, keeping the bespoke name=value format that we've adopted elsewhere, which the properties should probably use too. I think we should be careful about introducing new bespoke parsing like <type>:<name>:<value>: the only other spot that I'm aware of where we use : is URI-like, where the preceding value is the scheme, whereas here it is just a way to split values. Maybe think about adopting name=value as we do in other spots, e.g. --sbom-author type=person,name=TheName,email=e@mail. That said, a comma is another pattern we already have, equivalent to repeating the flag (e.g. --from docker,registry == --from docker --from registry), and with the Flatten function it could also make properties easier, e.g. --sbom-property name1=value1,name2=value2, so consider a different separator character for the author fields. Sorry I don't have a great answer here, but we can finalize this detail fairly quickly with the whole team towards the end. I would lean towards &, but it needs to be escaped in shells, so maybe +? Definitely open to suggestions here.
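To make the name=value idea concrete, here is a minimal sketch of how one --sbom-author value might be split into an actor. The parseActor helper, the configurable separator, and the rule that a name is required are all assumptions made for illustration; none of this is existing Syft code, and the separator itself is still undecided.

```go
package main

import (
	"fmt"
	"strings"
)

// actor mirrors the struct proposed in the comment above (tags omitted).
type actor struct {
	Type  string
	Name  string
	Email string
}

// parseActor is a hypothetical helper: it splits one flag value into
// name=value fields on sep and copies the known fields into an actor.
func parseActor(input, sep string) (actor, error) {
	var a actor
	for _, field := range strings.Split(input, sep) {
		key, value, ok := strings.Cut(field, "=")
		if !ok {
			return actor{}, fmt.Errorf("expected name=value, got %q", field)
		}
		value = strings.TrimSpace(value)
		switch strings.ToLower(strings.TrimSpace(key)) {
		case "type":
			a.Type = value
		case "name":
			a.Name = value
		case "email":
			a.Email = value
		default:
			return actor{}, fmt.Errorf("unknown author field %q", key)
		}
	}
	// Assumption: an author without a name is not useful in any format.
	if a.Name == "" {
		return actor{}, fmt.Errorf("author %q is missing a name", input)
	}
	return a, nil
}

func main() {
	// "+" is only one of the separators floated above, used here as an example.
	a, err := parseActor("type=person+name=TheName+email=e@mail", "+")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", a)
}
```

One thing this sketch makes visible: whichever separator is chosen can also appear inside a value (a + is legal in an email local part, for example), so the final design may need escaping or a JSON fallback.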
Again, apologies for the delay and I hope this is understandable enough!
Description
This PR implements the --authors flag to allow users to specify custom SBOM authors for all output formats (SPDX, CycloneDX, and Syft-JSON). This addresses the need for organizations to identify themselves and their tools in generated SBOMs, which is important for SBOM provenance and tooling attribution.
Implementation Details
- Adds an --authors flag that accepts multiple author entries in the format type:name:email
- Supported types: Person, Organization, Tool
- Example: --authors "Person:John Doe:[email protected]"
Examples
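As a rough illustration of the type:name:email entry format, the sketch below splits one entry and validates its type. The parseAuthorEntry helper, the exact validation rules, and the choice to treat the email as optional are assumptions for illustration, not the code in this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// author is a hypothetical holder for one parsed --authors entry.
type author struct {
	Type  string
	Name  string
	Email string
}

// parseAuthorEntry splits "type:name:email" into its parts.
// Assumption: the email part is optional, type and name are required.
func parseAuthorEntry(entry string) (author, error) {
	// SplitN with a limit of 3 keeps any further ":" inside the last part.
	parts := strings.SplitN(entry, ":", 3)
	if len(parts) < 2 {
		return author{}, fmt.Errorf("expected type:name[:email], got %q", entry)
	}
	a := author{Type: strings.TrimSpace(parts[0]), Name: strings.TrimSpace(parts[1])}
	if len(parts) == 3 {
		a.Email = strings.TrimSpace(parts[2])
	}
	switch a.Type {
	case "Person", "Organization", "Tool":
		// one of the types listed in this PR
	default:
		return author{}, fmt.Errorf("unsupported author type %q", a.Type)
	}
	if a.Name == "" {
		return author{}, fmt.Errorf("author entry %q has an empty name", entry)
	}
	return a, nil
}

func main() {
	a, err := parseAuthorEntry("Person:Jane Doe:jane@example.com")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", a)
}
```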
Default behavior (SPDX only has default authors):
With custom authors:
Question
I noticed that only the SPDX format has default authors (Organization: Anchore, Inc and Tool: syft-[version]), while CycloneDX and Syft-JSON formats have no default authors. Is this intentional behavior? Should all formats have consistent defaults, or is there a specification reason for this difference?
Also, only SPDX expects a type in the creator info: https://spdx.github.io/spdx-spec/v2.3/document-creation-information/ Should we also cover this?
Type of change
Checklist: