Add experimental cataloger capabilities command #4317

wagoodman · 2025-10-29T18:18:57Z

When SYFT_EXP_CAPABILITIES=true, this adds an internal syft cataloger caps command that describes cataloger capabilities, such as:

what catalogers exist
what globs / evidence are searched for
if a cataloger finds licenses
if a cataloger detects particular package manager claims (listing of files, digests, package integrity hash)
what dependencies (if any) can be detected (depth of nodes, topology of the edges, and kinds of dependencies included)
what API and app-level configurations exist for each cataloger

The output is available as an ASCII table or as JSON.
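As a rough illustration of what a JSON description along these lines could look like, here is a minimal sketch. The name/default pairs mirror the keys visible in the YAML diff quoted in the review below; everything else (the CatalogerEntry type, its field names, the example cataloger and glob) is hypothetical and not taken from the syft source.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CapabilityField mirrors the name/default pairs seen in the capability YAML
// (for example name: dependency.depth with default: [direct]).
type CapabilityField struct {
	Name    string `json:"name"`
	Default any    `json:"default"`
}

// CatalogerEntry is a hypothetical record for one cataloger. None of these
// field names are taken from the syft source.
type CatalogerEntry struct {
	Cataloger    string            `json:"cataloger"`
	Globs        []string          `json:"globs"`
	Capabilities []CapabilityField `json:"capabilities"`
}

// renderJSON encodes the entries as indented JSON.
func renderJSON(entries []CatalogerEntry) (string, error) {
	out, err := json.MarshalIndent(entries, "", "  ")
	if err != nil {
		return "", err
	}
	return string(out), nil
}

// exampleEntries returns illustrative data only.
func exampleEntries() []CatalogerEntry {
	return []CatalogerEntry{{
		Cataloger: "example-cataloger",
		Globs:     []string{"**/*.example"},
		Capabilities: []CapabilityField{
			{Name: "license", Default: true},
			{Name: "dependency.depth", Default: []string{"direct"}},
		},
	}}
}

func main() {
	out, err := renderJSON(exampleEntries())
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Println(out)
}
```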

Caution

This is an experimental feature and can change without warning and could be removed entirely. Do not depend on this command in production.

The way the capabilities are tracked is described in depth in the internal tooling's readme.

A quick summary is that we use the source code and test observations as the basis for what catalogers exist, how they are configured, and what they output. These are then used to cross-validate a partially generated packages/*.yaml (some items are auto-generated, some are filled in manually) against a set of completion tests (tests that ensure the full universe of things is defined and self-consistent), and the result drives the cataloger caps command.
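The idea of a completion test can be sketched as a set comparison between what is discovered from source and what the YAML documents. This is a minimal illustration under assumed inputs, not the tooling's actual implementation; the function name and the cataloger names other than ruby-gemspec-cataloger are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// missingAndStale compares cataloger names discovered from source code with
// names documented in the capability YAML. "missing" are discovered but not
// documented; "stale" are documented but no longer discovered.
func missingAndStale(discovered, documented []string) (missing, stale []string) {
	doc := make(map[string]bool, len(documented))
	for _, name := range documented {
		doc[name] = true
	}
	disc := make(map[string]bool, len(discovered))
	for _, name := range discovered {
		if !disc[name] && !doc[name] {
			missing = append(missing, name)
		}
		disc[name] = true
	}
	for _, name := range documented {
		if !disc[name] {
			stale = append(stale, name)
		}
	}
	sort.Strings(missing)
	sort.Strings(stale)
	return missing, stale
}

func main() {
	discovered := []string{"ruby-gemspec-cataloger", "new-example-cataloger"}
	documented := []string{"ruby-gemspec-cataloger", "removed-example-cataloger"}

	missing, stale := missingAndStale(discovered, documented)
	fmt.Println("missing from YAML:", missing)
	fmt.Println("stale in YAML:", stale)
}
```

A completion test built on this would fail when either list is non-empty, which is what keeps the documented set equal to the discovered set.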

Fixes #4155


willmurphyscode · 2025-11-14T21:07:31Z

internal/capabilities/packages.yaml

+            default: true
+          - name: dependency.depth
+            default:
+              - direct


I don't see relationships being returned from

syft/syft/pkg/cataloger/lua/parse_rockspec.go

Line 82 in 153f232

return []pkg.Package{p}, nil, nil

so I'm not sure this is correct.

wagoodman · 2025-11-15

You're right. I think the fact that the parser tracked dependencies when parsing threw me off. Technically, for the dependency depth description we only need to look at the list of packages; there don't necessarily need to be relationships. So the fact that this returns a single package is the clue.


kzantow

I don't want to hold this up and overall I think this is fine, especially since there are no changes to the external API, so that's great. In terms of the actual configuration files, I think they are ok -- I'd like them to be split up more (one per parser or cataloger entry), and when a new one is generated it would be nice to have empty defaults for all the fields that need to be manually filled out, but this can always be refined later. Two main thoughts:

I still think we should put capabilities files in the same directory as the cataloger, though it would probably take a little weirdness of a top-level embed (e.g. //go:embed syft/pkg/cataloger/*/*.capability.yaml or similar), which calls back to set the results on the internal package, but it could be done without any external API. I'm pretty sure that having these in separate locations is a large part of what makes editing them confusing to me -- if they were in the same directory, then when people edited the catalogers there would be a significantly higher chance that the files are updated properly when making changes that affect them, especially changes that aren't detected by the analysis.
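A minimal sketch of the collection side of that idea follows. Because //go:embed patterns are resolved relative to the directory of the source file that declares them and cannot reach into parent directories, a pattern spanning every cataloger directory would have to be declared near the repository root. To keep the sketch runnable without files on disk, fstest.MapFS stands in for the embed.FS; the helper names and the lua.capability.yaml file name are assumptions for illustration.

```go
package main

import (
	"fmt"
	"io/fs"
	"testing/fstest"
)

// collectCapabilityFiles returns path -> contents for every file in fsys that
// matches pattern. With a real embed.FS the same function would apply as-is.
func collectCapabilityFiles(fsys fs.FS, pattern string) (map[string][]byte, error) {
	paths, err := fs.Glob(fsys, pattern)
	if err != nil {
		return nil, err
	}
	files := make(map[string][]byte, len(paths))
	for _, p := range paths {
		data, err := fs.ReadFile(fsys, p)
		if err != nil {
			return nil, err
		}
		files[p] = data
	}
	return files, nil
}

// exampleFS is an in-memory stand-in for an embedded file system.
// The file names and contents are hypothetical.
func exampleFS() fs.FS {
	return fstest.MapFS{
		"syft/pkg/cataloger/lua/lua.capability.yaml": {Data: []byte("catalogers: []\n")},
		"syft/pkg/cataloger/lua/parse_rockspec.go":   {Data: []byte("package lua\n")},
	}
}

func main() {
	files, err := collectCapabilityFiles(exampleFS(), "syft/pkg/cataloger/*/*.capability.yaml")
	if err != nil {
		fmt.Println("collect failed:", err)
		return
	}
	for path, data := range files {
		fmt.Printf("%s (%d bytes)\n", path, len(data))
	}
}
```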

The other holdup I have is the level of integration when there is still a dichotomy of detected capability configuration vs. required manual configuration: this is included in a test hook for all PRs for what ends up being a moderately small amount of configuration. In the example cataloger, this is the generated amount of configuration, less than half of what I think is needed to be complete. (And in this particular case, it's wrong because of an oddity that required writing tests differently than expected, but that's probably a bug that should be fixed.) Integrating this into the PR process gives us the maintenance burden of all the generator code -- almost 10,000 fairly complicated (AST parsing, etc.) LoC to generate on the order of 2,500 LoC -- and that gives me pause from a maintenance perspective. I wouldn't want to get rid of this, but I wonder: if it were a hint rather than an error, would that reduce the required maintenance for something that is likely to change significantly with 2.0? For example: a user adds a new cataloger without any capabilities, the generator detects what it can and gives the user a new file to edit, and it doesn't really matter if it's wrong because the user just manually edits it afterwards. I acknowledge the desire to prevent drift, though. But I foresee this causing some friction for reasons I can't put my finger on -- probably not a large amount of friction, though. More likely, some new cataloger needs to do something unique and generation doesn't work quite right, but because it's all internal we can always change it later.
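To make the hint-versus-error distinction concrete, here is one way it could be modeled. This is only a sketch under assumed types: Finding, compareCapabilities, and shouldFail are hypothetical names, and capability values are simplified to strings.

```go
package main

import (
	"fmt"
	"sort"
)

// Finding describes one difference between detected and declared capabilities.
type Finding struct {
	Field   string
	Message string
	IsError bool // false means the finding is only a hint
}

// compareCapabilities reports detected fields that are undeclared or declared
// with a different value. With strict=false every difference is a hint.
func compareCapabilities(detected, declared map[string]string, strict bool) []Finding {
	fields := make([]string, 0, len(detected))
	for f := range detected {
		fields = append(fields, f)
	}
	sort.Strings(fields) // deterministic output order

	var findings []Finding
	for _, f := range fields {
		got, ok := declared[f]
		switch {
		case !ok:
			findings = append(findings, Finding{Field: f, Message: "not declared; detected " + detected[f], IsError: strict})
		case got != detected[f]:
			findings = append(findings, Finding{Field: f, Message: "declared " + got + ", detected " + detected[f], IsError: strict})
		}
	}
	return findings
}

// shouldFail reports whether any finding is an error.
func shouldFail(findings []Finding) bool {
	for _, f := range findings {
		if f.IsError {
			return true
		}
	}
	return false
}

func main() {
	detected := map[string]string{"dependency.depth": "direct", "license": "true"}
	declared := map[string]string{"dependency.depth": "direct"}

	hints := compareCapabilities(detected, declared, false)
	for _, h := range hints {
		fmt.Printf("hint: %s: %s\n", h.Field, h.Message)
	}
	fmt.Println("fail the PR check:", shouldFail(hints))
}
```

In hint mode the same drift is still surfaced to the author, but it does not block the PR; strict mode corresponds to the test-hook behavior discussed above.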

All that said, this is so very useful for many people, and we can iterate on the details later.

cmd/syft/internal/commands/cataloger_caps.go

.github/workflows/validations.yaml

cmd/syft/internal/commands/cataloger.go

internal/capabilities/capabilities.go


wagoodman · 2025-12-19T14:48:38Z

I took another shot at moving the cataloger.yaml files. I couldn't easily get it working earlier, but did find a way in the end. It did require introducing an init function to register the capabilities into an internal package, which is not ideal, but it does not introduce a breaking change and does not affect the public API.
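The init-based registration pattern mentioned here can be sketched roughly as follows. This single-file version collapses what would presumably be two packages (an internal registry and a cataloger package that calls into it from init); Register, Registered, and the "lua" entry are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

var (
	mu       sync.Mutex
	registry = map[string][]byte{} // cataloger name -> raw capability document
)

// Register stores a capability document under a name. In the real layout this
// would presumably live in an internal package so it stays off the public API.
func Register(name string, doc []byte) {
	mu.Lock()
	defer mu.Unlock()
	if _, exists := registry[name]; exists {
		panic("duplicate capability registration: " + name)
	}
	registry[name] = doc
}

// Registered returns the registered names in sorted order.
func Registered() []string {
	mu.Lock()
	defer mu.Unlock()
	names := make([]string, 0, len(registry))
	for name := range registry {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// init runs after package-level variables are initialized, so the registry
// map already exists here. A cataloger package would do this with its own
// embedded YAML; the contents below are a placeholder.
func init() {
	Register("lua", []byte("catalogers: []\n"))
}

func main() {
	fmt.Println("registered capabilities:", Registered())
}
```

The trade-off is the usual one for init-based registration: the registry is only populated for packages that are actually imported, which is likely part of why it is described as not ideal.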

wagoodman added 13 commits October 13, 2025 17:14

add info command from generated capabilities

1510db7

Signed-off-by: Alex Goodman <[email protected]>

correct gentoo and arch ecosystems

a92efd5

Signed-off-by: Alex Goodman <[email protected]>

rename os pkg types

02f61ab

Signed-off-by: Alex Goodman <[email protected]>

better binary cataloger description

95ba1b0

Signed-off-by: Alex Goodman <[email protected]>

expose metadata and pacakge types in json

de111f4

Signed-off-by: Alex Goodman <[email protected]>

expose json schema types

63832e5

Signed-off-by: Alex Goodman <[email protected]>

add completeness tests for metadata types

5d182ec

Signed-off-by: Alex Goodman <[email protected]>

latest generation

abfe73b

Signed-off-by: Alex Goodman <[email protected]>

fix linting

0dd906b

Signed-off-by: Alex Goodman <[email protected]>

improve testing a docs

d651245

Signed-off-by: Alex Goodman <[email protected]>

fix tests and linting

16fb680

Signed-off-by: Alex Goodman <[email protected]>

restore goreleaser config

c3e196b

Signed-off-by: Alex Goodman <[email protected]>

Merge remote-tracking branch 'origin/main' into ast-parse-cataloger-c…

4a2d94b

…apabilities Signed-off-by: Alex Goodman <[email protected]>

wagoodman added the changelog-ignore Don't include this issue in the release changelog label Oct 29, 2025

wagoodman mentioned this pull request Oct 29, 2025

Describe cataloger capabilities via test observations #4318

Merged

wagoodman and others added 3 commits October 29, 2025 15:18

tweak diagram

a97e1c6

Signed-off-by: Alex Goodman <[email protected]>

fix pdm

8914996

Signed-off-by: Alex Goodman <[email protected]>

chore: java binary data

725b0df

Signed-off-by: Keith Zantow <[email protected]>

willmurphyscode reviewed Nov 14, 2025


github-actions bot added the json-schema Changes the json schema label Nov 14, 2025

wagoodman force-pushed the ast-parse-cataloger-capabilities branch from f41cb60 to 725b0df Compare November 14, 2025 22:35

github-actions bot removed the json-schema Changes the json schema label Nov 14, 2025

wagoodman added 3 commits November 14, 2025 23:27

Merge remote-tracking branch 'origin/main' into ast-parse-cataloger-c…

558983d

…apabilities Signed-off-by: Alex Goodman <[email protected]>

new capability descriptions for gguf and python

9f9170a

Signed-off-by: Alex Goodman <[email protected]>

correct poetry lock integrity hash claim

7e330cd

Signed-off-by: Alex Goodman <[email protected]>

anchore deleted a comment from github-actions bot Nov 15, 2025

wagoodman and others added 4 commits November 14, 2025 23:45

fix compile error

308185c

Signed-off-by: Alex Goodman <[email protected]>

fix: remove purl version from overrides

a6c03e6

Signed-off-by: Keith Zantow <[email protected]>

fix lua deps ref

f6cd3cd

Signed-off-by: Alex Goodman <[email protected]>

keep gguf as ai ecosystem

bb5e221

Signed-off-by: Alex Goodman <[email protected]>

wagoodman and others added 11 commits December 2, 2025 08:46

claim fixtures pre-req for cap generation

6b4a62f

Signed-off-by: Alex Goodman <[email protected]>

update documentation with correct regeneration procedure

52f00fe

Signed-off-by: Alex Goodman <[email protected]>

chore: ruby-gemspec-cataloger finds no dependencies

1f743c8

Signed-off-by: Will Murphy <[email protected]>

chore: fix python docs and config comment

acd07aa

Signed-off-by: Will Murphy <[email protected]>

chore: commit re-generated java yaml

c988229

Signed-off-by: Will Murphy <[email protected]>

add cataloger selection to caps command

078c837

Signed-off-by: Alex Goodman <[email protected]>

re-generate cap yamls

bfa7a0c

Signed-off-by: Alex Goodman <[email protected]>

fix tests for cataloger selection

1f46872

Signed-off-by: Alex Goodman <[email protected]>

fix cli test

3ce7197

Signed-off-by: Alex Goodman <[email protected]>

add missing tests

a2ea01f

Signed-off-by: Alex Goodman <[email protected]>

fix linting

61a3db4

Signed-off-by: Alex Goodman <[email protected]>

wagoodman changed the title ~~Add internal cataloger capability descriptions~~ Add experimental cataloger capabilities command Dec 4, 2025

wagoodman marked this pull request as ready for review December 4, 2025 15:08

kzantow mentioned this pull request Dec 5, 2025

Command output to give more information on what catalogers look for and what they can find #4155

Open

Merge remote-tracking branch 'upstream/main' into ast-parse-cataloger…

9e7b485

…-capabilities

kzantow approved these changes Dec 16, 2025


wagoodman self-assigned this Dec 16, 2025

wagoodman added this to OSS Dec 16, 2025

wagoodman moved this to In Progress in OSS Dec 16, 2025

wagoodman moved this from In Progress to In Review in OSS Dec 16, 2025

wagoodman removed this from OSS Dec 16, 2025

wagoodman added 7 commits December 17, 2025 14:52

rename cmd to cataloger info

a32e911

Signed-off-by: Alex Goodman <[email protected]>

Merge remote-tracking branch 'origin/main' into ast-parse-cataloger-c…

e491b80

…apabilities Signed-off-by: Alex Goodman <[email protected]>

[wip] change capability description locations

e2da0b3

Signed-off-by: Alex Goodman <[email protected]>

[wip] continued

c9ea665

Signed-off-by: Alex Goodman <[email protected]>

[wip] adjust for import cycles

b1993eb

Signed-off-by: Alex Goodman <[email protected]>

correct docs

ec33f1a

Signed-off-by: Alex Goodman <[email protected]>

fix linting

175aa13

Signed-off-by: Alex Goodman <[email protected]>

wagoodman enabled auto-merge (squash) December 19, 2025 14:48
