Order CPEs deterministically for SBOM reproducibility #2967

@luhring

Description

What happened:

I'm seeing nondeterministic behavior when using Syft as a library (in wolfictl) to generate SBOMs. I noticed this via new golden-file-style tests we've introduced to ensure we get the same output for the same input. For a couple of the test targets (each of which is an APK file), the test fails on the very next run after its golden file has been updated.

I'm not 100% sure this is Syft's fault yet, since there's wrapping code in wolfictl involved, too. But I wanted to flag the issue here so we can discuss!

Here are some example diffs from two consecutive runs of the SBOM generation code under this test:

For jenkins-2.461-r0.apk:

 "language": "java",
 "cpes": [
   {
-    "cpe": "cpe:2.3:a:jenkins:mailer:472.vf7c289a_4b_420:*:*:*:*:jenkins:*:*",
-    "source": "nvd-cpe-dictionary"
-  },
-  {
-    "cpe": "cpe:2.3:a:jenkins:mailer:472.vf7c289a_4b_420:*:*:*:*:*:*:*",
+    "cpe": "cpe:2.3:a:jenkins:mailer:472.vf7c289a_4b_420:*:*:*:*:*:*:*",
+    "source": "nvd-cpe-dictionary"
+  },
+  {
+    "cpe": "cpe:2.3:a:jenkins:mailer:472.vf7c289a_4b_420:*:*:*:*:jenkins:*:*",
     "source": "nvd-cpe-dictionary"

For jruby-9.4-9.4.7.0-r0.apk:

 {
-"id": "b18c20e1cb65977c",
+"id": "1ac1bd89c6841000",
 "name": "jruby-base",
 "version": "9.4.7.0",
 "type": "java-archive",
 ...
     }
   }
 ],
-"licenses": [],
+"licenses": [
+  {
+    "value": "Apache-2.0",
+    "spdxExpression": "Apache-2.0",
+    "type": "concluded",
+    "urls": [],
+    "locations": [
+      {
+        "path": "usr/share/jruby/lib/jruby.jar",
+        "accessPath": "usr/share/jruby/lib/jruby.jar",
+        "annotations": {
+          "evidence": "primary"
+        }
+      }
+    ]
+  }
+],
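
The jenkins diff above shows the same two CPE entries swapping places between runs. A minimal sketch of one way to make that ordering stable is to sort the entries before serializing them. The `cpeEntry` type and `sortCPEs` function are hypothetical names used only for illustration; they are not Syft's types or API, and this is not necessarily how Syft should fix it.

```go
package main

import (
	"fmt"
	"sort"
)

// cpeEntry is a hypothetical stand-in for one element of the "cpes" array
// in the JSON output. It is not Syft's own type.
type cpeEntry struct {
	CPE    string
	Source string
}

// sortCPEs orders entries by CPE string, then by source, so that the same
// set of entries always serializes in the same order.
func sortCPEs(entries []cpeEntry) {
	sort.SliceStable(entries, func(i, j int) bool {
		if entries[i].CPE != entries[j].CPE {
			return entries[i].CPE < entries[j].CPE
		}
		return entries[i].Source < entries[j].Source
	})
}

func main() {
	// The two entries from the jenkins diff, in one of the observed orders.
	entries := []cpeEntry{
		{CPE: "cpe:2.3:a:jenkins:mailer:472.vf7c289a_4b_420:*:*:*:*:jenkins:*:*", Source: "nvd-cpe-dictionary"},
		{CPE: "cpe:2.3:a:jenkins:mailer:472.vf7c289a_4b_420:*:*:*:*:*:*:*", Source: "nvd-cpe-dictionary"},
	}
	sortCPEs(entries)
	for _, e := range entries {
		fmt.Println(e.CPE, e.Source)
	}
}
```

A plain lexical sort is the simplest choice here. It gives up any "most specific CPE first" ordering, so if consumers rely on the first entry being the best match, a specificity-based comparison with a lexical tie-breaker would be needed instead.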

What you expected to happen:

Same exact output given same input!

Steps to reproduce the issue:

Check out https://github.com/wolfi-dev/wolfictl and run the test linked above. Note that you may have to run the test multiple times to see the full range of results the code can produce. Also note that the first run of the test fetches several APKs, so it will take considerably longer than subsequent runs.

Anything else we need to know?:

So far the only test cases exhibiting this behavior are Java-based packages... 🤔

cc: @wagoodman, this is the thing we talked about briefly last week.

Environment:

  • Output of syft version:
$ go list -m all | grep syft
github.com/anchore/syft v1.7.0
  • OS (e.g: cat /etc/os-release or similar): latest macOS

Metadata

Labels: bug (Something isn't working)

Status: Done