Add support for searching for jars within archives #734
Signed-off-by: Alex Goodman <[email protected]>
Small comments - Awesome work @wagoodman it really is easier to read this process and adds a great feature that really levels up the Java Cataloger
	"**/*.tar.zst",
}

// TODO: when the generic archive cataloger is implemented, this should be removed (https://github.com/anchore/syft/issues/246)
So rather than hook these as functions into glob matching under the java cataloger you want a generic archive cataloger that can then delegate discovered packages to their respective languages?
TBD, but that is one possible approach. It may be that the source.FileResolver implementations take care of this detail for us.
FWIW I ran this in comparison to the log4j-sniffer and here are the results, showing this PR does make Syft find the same things (there are some expected misses, like the
Here's the output from each:
log4j-sniffer output
kzantow@KZANTOW log4j-sniffer % for d in examples/*/ ; do echo ; echo scan $d ; go run main.go crawl $d ; done
scan examples/archived_fat_jar/
No files affected by CVE-2021-45046 or CVE-2021-45105 detected
1 total files scanned, skipped 0 paths due to permission denied errors, encountered 0 errors processing paths
scan examples/cve-2021-45105-versions/
[INFO] Found archive with name matching vulnerable log4j-core format at examples/cve-2021-45105-versions/log4j-core-2.12.2.jar
[INFO] Found JndiLookup class in the log4j package at org/apache/logging/log4j/core/lookup/JndiLookup.class
[INFO] Found JndiManager class that was an exact md5 match for a known version at org/apache/logging/log4j/core/net/JndiManager.class
[MATCH] CVE-2021-45105 detected in file examples/cve-2021-45105-versions/log4j-core-2.12.2.jar. log4j versions: 2.12.2. Reasons: JndiLookup class and package name matched, jar name matched, JndiManager class and package name matched, class file MD5 matched
[INFO] Found archive with name matching vulnerable log4j-core format at examples/cve-2021-45105-versions/log4j-core-2.16.0.jar
[INFO] Found JndiManager class that was an exact md5 match for a known version at org/apache/logging/log4j/core/net/JndiManager.class
[INFO] Found JndiLookup class in the log4j package at org/apache/logging/log4j/core/lookup/JndiLookup.class
[MATCH] CVE-2021-45105 detected in file examples/cve-2021-45105-versions/log4j-core-2.16.0.jar. log4j versions: 2.16.0. Reasons: JndiLookup class and package name matched, jar name matched, JndiManager class and package name matched, class file MD5 matched
Files affected by CVE-2021-45046 or CVE-2021-45105 detected: 2 file(s) impacted by CVE-2021-45046 or CVE-2021-45105
2 total files scanned, skipped 0 paths due to permission denied errors, encountered 0 errors processing paths
scan examples/fat_jar/
[INFO] Found JndiManager class that was an exact md5 match for a known version at org/apache/logging/log4j/core/net/JndiManager.class
[INFO] Found JndiLookup class in the log4j package at org/apache/logging/log4j/core/lookup/JndiLookup.class
[MATCH] CVE-2021-45046 and CVE-2021-45105 detected in file examples/fat_jar/fat_jar.jar. log4j versions: 2.14.0-2.14.1. Reasons: JndiLookup class and package name matched, JndiManager class and package name matched, class file MD5 matched
Files affected by CVE-2021-45046 or CVE-2021-45105 detected: 1 file(s) impacted by CVE-2021-45046 or CVE-2021-45105
1 total files scanned, skipped 0 paths due to permission denied errors, encountered 0 errors processing paths
scan examples/good_version/
[INFO] Found JndiLookup class in the log4j package at org/apache/logging/log4j/core/lookup/JndiLookup.class
[INFO] Found JndiManager class that was an exact md5 match for a known version at org/apache/logging/log4j/core/net/JndiManager.class
No files affected by CVE-2021-45046 or CVE-2021-45105 detected
1 total files scanned, skipped 0 paths due to permission denied errors, encountered 0 errors processing paths
scan examples/inside_a_dist/
[INFO] Found nesting archive matching the log4j-core jar name at log4j-core-2.14.1.jar
[MATCH] CVE-2021-45046 and CVE-2021-45105 detected in file examples/inside_a_dist/wrapped_log4j.tar. log4j versions: 2.14.1. Reasons: jar name inside archive matched
[INFO] Found nesting archive matching the log4j-core jar name at log4j-core-2.14.1.jar
[MATCH] CVE-2021-45046 and CVE-2021-45105 detected in file examples/inside_a_dist/wrapped_log4j.tar.bz2. log4j versions: 2.14.1. Reasons: jar name inside archive matched
[INFO] Found nesting archive matching the log4j-core jar name at log4j-core-2.14.1.jar
[MATCH] CVE-2021-45046 and CVE-2021-45105 detected in file examples/inside_a_dist/wrapped_log4j.tar.gz. log4j versions: 2.14.1. Reasons: jar name inside archive matched
[INFO] Found nesting archive matching the log4j-core jar name at log4j-core-2.14.1.jar
[MATCH] CVE-2021-45046 and CVE-2021-45105 detected in file examples/inside_a_dist/wrapped_log4j.zip. log4j versions: 2.14.1. Reasons: jar name inside archive matched
Files affected by CVE-2021-45046 or CVE-2021-45105 detected: 4 file(s) impacted by CVE-2021-45046 or CVE-2021-45105
4 total files scanned, skipped 0 paths due to permission denied errors, encountered 0 errors processing paths
scan examples/inside_a_par/
[INFO] Found nesting archive matching the log4j-core jar name at lib/log4j-core-2.14.1.jar
[MATCH] CVE-2021-45046 and CVE-2021-45105 detected in file examples/inside_a_par/wrapped_in_a_par.par. log4j versions: 2.14.1. Reasons: jar name inside archive matched
Files affected by CVE-2021-45046 or CVE-2021-45105 detected: 1 file(s) impacted by CVE-2021-45046 or CVE-2021-45105
1 total files scanned, skipped 0 paths due to permission denied errors, encountered 0 errors processing paths
scan examples/java_projects/
No files affected by CVE-2021-45046 or CVE-2021-45105 detected
10 total files scanned, skipped 0 paths due to permission denied errors, encountered 0 errors processing paths
scan examples/light_shading/
[INFO] Found JndiManager class not in the log4j package at relocated/core/net/JndiManager.class
[INFO] Found JndiManager class that had identical bytecode instruction as a known version at relocated/core/net/JndiManager.class
[INFO] Found JndiLookup class not in the log4j package at relocated/core/lookup/JndiLookup.class
[MATCH] CVE-2021-45046 and CVE-2021-45105 detected in file examples/light_shading/shadow-all.jar. log4j versions: 2.12.0-2.14.1. Reasons: JndiLookup class name matched, JndiManager class name matched, byte code instruction MD5 matched
Files affected by CVE-2021-45046 or CVE-2021-45105 detected: 1 file(s) impacted by CVE-2021-45046 or CVE-2021-45105
1 total files scanned, skipped 0 paths due to permission denied errors, encountered 0 errors processing paths
scan examples/multiple_bad_versions/
[INFO] Found archive with name matching vulnerable log4j-core format at examples/multiple_bad_versions/log4j-core-2.10.0.jar
[INFO] Found JndiLookup class in the log4j package at org/apache/logging/log4j/core/lookup/JndiLookup.class
[INFO] Found JndiManager class that was an exact md5 match for a known version at org/apache/logging/log4j/core/net/JndiManager.class
[MATCH] CVE-2021-45046 and CVE-2021-45105 detected in file examples/multiple_bad_versions/log4j-core-2.10.0.jar. log4j versions: 2.10.0, 2.9.0-2.11.2. Reasons: JndiLookup class and package name matched, jar name matched, JndiManager class and package name matched, class file MD5 matched
[INFO] Found archive with name matching vulnerable log4j-core format at examples/multiple_bad_versions/log4j-core-2.11.0.jar
[INFO] Found JndiLookup class in the log4j package at org/apache/logging/log4j/core/lookup/JndiLookup.class
[INFO] Found JndiManager class that was an exact md5 match for a known version at org/apache/logging/log4j/core/net/JndiManager.class
[MATCH] CVE-2021-45046 and CVE-2021-45105 detected in file examples/multiple_bad_versions/log4j-core-2.11.0.jar. log4j versions: 2.11.0, 2.9.0-2.11.2. Reasons: JndiLookup class and package name matched, jar name matched, JndiManager class and package name matched, class file MD5 matched
[INFO] Found archive with name matching vulnerable log4j-core format at examples/multiple_bad_versions/log4j-core-2.11.1.jar
[INFO] Found JndiManager class that was an exact md5 match for a known version at org/apache/logging/log4j/core/net/JndiManager.class
[INFO] Found JndiLookup class in the log4j package at org/apache/logging/log4j/core/lookup/JndiLookup.class
[MATCH] CVE-2021-45046 and CVE-2021-45105 detected in file examples/multiple_bad_versions/log4j-core-2.11.1.jar. log4j versions: 2.11.1, 2.9.0-2.11.2. Reasons: JndiLookup class and package name matched, jar name matched, JndiManager class and package name matched, class file MD5 matched
[INFO] Found archive with name matching vulnerable log4j-core format at examples/multiple_bad_versions/log4j-core-2.11.2.jar
[INFO] Found JndiLookup class in the log4j package at org/apache/logging/log4j/core/lookup/JndiLookup.class
[INFO] Found JndiManager class that was an exact md5 match for a known version at org/apache/logging/log4j/core/net/JndiManager.class
[MATCH] CVE-2021-45046 and CVE-2021-45105 detected in file examples/multiple_bad_versions/log4j-core-2.11.2.jar. log4j versions: 2.11.2, 2.9.0-2.11.2. Reasons: JndiLookup class and package name matched, jar name matched, JndiManager class and package name matched, class file MD5 matched
[INFO] Found archive with name matching vulnerable log4j-core format at examples/multiple_bad_versions/log4j-core-2.12.0.jar
[INFO] Found JndiLookup class in the log4j package at org/apache/logging/log4j/core/lookup/JndiLookup.class
[INFO] Found JndiManager class that was an exact md5 match for a known version at org/apache/logging/log4j/core/net/JndiManager.class
[MATCH] CVE-2021-45046 and CVE-2021-45105 detected in file examples/multiple_bad_versions/log4j-core-2.12.0.jar. log4j versions: 2.12.0. Reasons: JndiLookup class and package name matched, jar name matched, JndiManager class and package name matched, class file MD5 matched
[INFO] Found archive with name matching vulnerable log4j-core format at examples/multiple_bad_versions/log4j-core-2.12.1.jar
[INFO] Found JndiLookup class in the log4j package at org/apache/logging/log4j/core/lookup/JndiLookup.class
[INFO] Found JndiManager class that was an exact md5 match for a known version at org/apache/logging/log4j/core/net/JndiManager.class
[MATCH] CVE-2021-45046 and CVE-2021-45105 detected in file examples/multiple_bad_versions/log4j-core-2.12.1.jar. log4j versions: 2.12.0, 2.12.1. Reasons: JndiLookup class and package name matched, jar name matched, JndiManager class and package name matched, class file MD5 matched
[INFO] Found archive with name matching vulnerable log4j-core format at examples/multiple_bad_versions/log4j-core-2.13.0.jar
[INFO] Found JndiManager class that was an exact md5 match for a known version at org/apache/logging/log4j/core/net/JndiManager.class
[INFO] Found JndiLookup class in the log4j package at org/apache/logging/log4j/core/lookup/JndiLookup.class
[MATCH] CVE-2021-45046 and CVE-2021-45105 detected in file examples/multiple_bad_versions/log4j-core-2.13.0.jar. log4j versions: 2.13.0, 2.13.0-2.13.3. Reasons: JndiLookup class and package name matched, jar name matched, JndiManager class and package name matched, class file MD5 matched
[INFO] Found archive with name matching vulnerable log4j-core format at examples/multiple_bad_versions/log4j-core-2.13.1.jar
[INFO] Found JndiLookup class in the log4j package at org/apache/logging/log4j/core/lookup/JndiLookup.class
[INFO] Found JndiManager class that was an exact md5 match for a known version at org/apache/logging/log4j/core/net/JndiManager.class
[MATCH] CVE-2021-45046 and CVE-2021-45105 detected in file examples/multiple_bad_versions/log4j-core-2.13.1.jar. log4j versions: 2.13.0-2.13.3, 2.13.1. Reasons: JndiLookup class and package name matched, jar name matched, JndiManager class and package name matched, class file MD5 matched
[INFO] Found archive with name matching vulnerable log4j-core format at examples/multiple_bad_versions/log4j-core-2.13.2.jar
[INFO] Found JndiLookup class in the log4j package at org/apache/logging/log4j/core/lookup/JndiLookup.class
[INFO] Found JndiManager class that was an exact md5 match for a known version at org/apache/logging/log4j/core/net/JndiManager.class
[MATCH] CVE-2021-45046 and CVE-2021-45105 detected in file examples/multiple_bad_versions/log4j-core-2.13.2.jar. log4j versions: 2.13.0-2.13.3, 2.13.2. Reasons: JndiLookup class and package name matched, jar name matched, JndiManager class and package name matched, class file MD5 matched
[INFO] Found archive with name matching vulnerable log4j-core format at examples/multiple_bad_versions/log4j-core-2.13.3.jar
[INFO] Found JndiLookup class in the log4j package at org/apache/logging/log4j/core/lookup/JndiLookup.class
[INFO] Found JndiManager class that was an exact md5 match for a known version at org/apache/logging/log4j/core/net/JndiManager.class
[MATCH] CVE-2021-45046 and CVE-2021-45105 detected in file examples/multiple_bad_versions/log4j-core-2.13.3.jar. log4j versions: 2.13.0-2.13.3, 2.13.3. Reasons: JndiLookup class and package name matched, jar name matched, JndiManager class and package name matched, class file MD5 matched
[INFO] Found archive with name matching vulnerable log4j-core format at examples/multiple_bad_versions/log4j-core-2.14.0.jar
[INFO] Found JndiManager class that was an exact md5 match for a known version at org/apache/logging/log4j/core/net/JndiManager.class
[INFO] Found JndiLookup class in the log4j package at org/apache/logging/log4j/core/lookup/JndiLookup.class
[MATCH] CVE-2021-45046 and CVE-2021-45105 detected in file examples/multiple_bad_versions/log4j-core-2.14.0.jar. log4j versions: 2.14.0, 2.14.0-2.14.1. Reasons: JndiLookup class and package name matched, jar name matched, JndiManager class and package name matched, class file MD5 matched
[INFO] Found archive with name matching vulnerable log4j-core format at examples/multiple_bad_versions/log4j-core-2.14.1.jar
[INFO] Found JndiManager class that was an exact md5 match for a known version at org/apache/logging/log4j/core/net/JndiManager.class
[INFO] Found JndiLookup class in the log4j package at org/apache/logging/log4j/core/lookup/JndiLookup.class
[MATCH] CVE-2021-45046 and CVE-2021-45105 detected in file examples/multiple_bad_versions/log4j-core-2.14.1.jar. log4j versions: 2.14.0-2.14.1, 2.14.1. Reasons: JndiLookup class and package name matched, jar name matched, JndiManager class and package name matched, class file MD5 matched
[INFO] Found archive with name matching vulnerable log4j-core format at examples/multiple_bad_versions/log4j-core-2.15.0.jar
[INFO] Found JndiManager class that was an exact md5 match for a known version at org/apache/logging/log4j/core/net/JndiManager.class
[INFO] Found JndiLookup class in the log4j package at org/apache/logging/log4j/core/lookup/JndiLookup.class
[MATCH] CVE-2021-45046 and CVE-2021-45105 detected in file examples/multiple_bad_versions/log4j-core-2.15.0.jar. log4j versions: 2.15.0. Reasons: JndiLookup class and package name matched, jar name matched, JndiManager class and package name matched, class file MD5 matched
Files affected by CVE-2021-45046 or CVE-2021-45105 detected: 13 file(s) impacted by CVE-2021-45046 or CVE-2021-45105
13 total files scanned, skipped 0 paths due to permission denied errors, encountered 0 errors processing paths
scan examples/obfuscated/
[INFO] Found JndiManager class that partially matched the bytecode of a known version at org/a/a/a/a/g/b.class
[INFO] Found finding in what appeared to be an obfuscated jar at examples/obfuscated/2.14.1-aaaagb.jar
[MATCH] CVE-2021-45105 detected in file examples/obfuscated/2.14.1-aaaagb.jar. log4j versions: 2.12.2. Reasons: jar file appeared obfuscated, byte code partially matched known version
Files affected by CVE-2021-45046 or CVE-2021-45105 detected: 1 file(s) impacted by CVE-2021-45046 or CVE-2021-45105
1 total files scanned, skipped 0 paths due to permission denied errors, encountered 0 errors processing paths
scan examples/par_in_a_dist/
No files affected by CVE-2021-45046 or CVE-2021-45105 detected
1 total files scanned, skipped 0 paths due to permission denied errors, encountered 0 errors processing paths
scan examples/single_bad_version/
[INFO] Found archive with name matching vulnerable log4j-core format at examples/single_bad_version/log4j-core-2.14.1.jar
[INFO] Found JndiManager class that was an exact md5 match for a known version at org/apache/logging/log4j/core/net/JndiManager.class
[INFO] Found JndiLookup class in the log4j package at org/apache/logging/log4j/core/lookup/JndiLookup.class
[MATCH] CVE-2021-45046 and CVE-2021-45105 detected in file examples/single_bad_version/log4j-core-2.14.1.jar. log4j versions: 2.14.0-2.14.1, 2.14.1. Reasons: JndiLookup class and package name matched, jar name matched, JndiManager class and package name matched, class file MD5 matched
Files affected by CVE-2021-45046 or CVE-2021-45105 detected: 1 file(s) impacted by CVE-2021-45046 or CVE-2021-45105
1 total files scanned, skipped 0 paths due to permission denied errors, encountered 0 errors processing paths
kzantow@KZANTOW log4j-sniffer %
Syft using this PR
kzantow@KZANTOW syft % for d in ../log4j-sniffer/examples/*/; do echo ; echo scan $d ; go run main.go dir:$d; done
scan ../log4j-sniffer/examples/archived_fat_jar/
✔ Indexed ../log4j-sniffer/examples/archived_fat_jar/
✔ Cataloged packages [2 packages]
NAME VERSION TYPE
fat_jar 2.14.1 java-archive
log4j-core 2.14.1 java-archive
scan ../log4j-sniffer/examples/cve-2021-45105-versions/
✔ Indexed ../log4j-sniffer/examples/cve-2021-45105-versions/
✔ Cataloged packages [2 packages]
NAME VERSION TYPE
log4j-core 2.12.2 java-archive
log4j-core 2.16.0 java-archive
scan ../log4j-sniffer/examples/fat_jar/
✔ Indexed ../log4j-sniffer/examples/fat_jar/
✔ Cataloged packages [2 packages]
NAME VERSION TYPE
fat_jar 2.14.1 java-archive
log4j-core 2.14.1 java-archive
scan ../log4j-sniffer/examples/good_version/
✔ Indexed ../log4j-sniffer/examples/good_version/
✔ Cataloged packages [1 packages]
NAME VERSION TYPE
log4j-core 2.17.0 java-archive
scan ../log4j-sniffer/examples/inside_a_dist/
✔ Indexed ../log4j-sniffer/examples/inside_a_dist/
✔ Cataloged packages [4 packages]
NAME VERSION TYPE
log4j-core 2.14.1 java-archive
scan ../log4j-sniffer/examples/inside_a_par/
✔ Indexed ../log4j-sniffer/examples/inside_a_par/
✔ Cataloged packages [1 packages]
NAME VERSION TYPE
log4j-core 2.14.1 java-archive
scan ../log4j-sniffer/examples/java_projects/
✔ Indexed ../log4j-sniffer/examples/java_projects/
✔ Cataloged packages [1 packages]
NAME VERSION TYPE
gradle-wrapper java-archive
scan ../log4j-sniffer/examples/light_shading/
✔ Indexed ../log4j-sniffer/examples/light_shading/
✔ Cataloged packages [9 packages]
NAME VERSION TYPE
error_prone_annotations 2.5.1 java-archive
failureaccess 1.0.1 java-archive
guava 30.1.1-jre java-archive
j2objc-annotations 1.3 java-archive
jsr305 3.0.2 java-archive
listenablefuture 9999.0-empty-to-avoid-conflict-with-guava java-archive
log4j-api 2.14.1 java-archive
log4j-core 2.14.1 java-archive
shadow-all java-archive
scan ../log4j-sniffer/examples/multiple_bad_versions/
✔ Indexed ../log4j-sniffer/examples/multiple_bad_versions/
✔ Cataloged packages [13 packages]
NAME VERSION TYPE
log4j-core 2.10.0 java-archive
log4j-core 2.11.0 java-archive
log4j-core 2.11.1 java-archive
log4j-core 2.11.2 java-archive
log4j-core 2.12.0 java-archive
log4j-core 2.12.1 java-archive
log4j-core 2.13.0 java-archive
log4j-core 2.13.1 java-archive
log4j-core 2.13.2 java-archive
log4j-core 2.13.3 java-archive
log4j-core 2.14.0 java-archive
log4j-core 2.14.1 java-archive
log4j-core 2.15.0 java-archive
scan ../log4j-sniffer/examples/obfuscated/
✔ Indexed ../log4j-sniffer/examples/obfuscated/
✔ Cataloged packages [8 packages]
NAME VERSION TYPE
error_prone_annotations 2.5.1 java-archive
failureaccess 1.0.1 java-archive
guava 30.1.1-jre java-archive
j2objc-annotations 1.3 java-archive
jsr305 3.0.2 java-archive
listenablefuture 9999.0-empty-to-avoid-conflict-with-guava java-archive
log4j-api 2.14.1 java-archive
log4j-core 2.14.1 java-archive
scan ../log4j-sniffer/examples/par_in_a_dist/
✔ Indexed ../log4j-sniffer/examples/par_in_a_dist/
✔ Cataloged packages [1 packages]
NAME VERSION TYPE
log4j-core 2.14.1 java-archive
scan ../log4j-sniffer/examples/single_bad_version/
✔ Indexed ../log4j-sniffer/examples/single_bad_version/
✔ Cataloged packages [1 packages]
NAME VERSION TYPE
log4j-core 2.14.1 java-archive
kzantow@KZANTOW syft %
internal/file/tar_file_traversal.go
Outdated
defer tempFile.Close()

// limit the zip reader on each file read to prevent decompression bomb attacks
numBytes, err := io.Copy(tempFile, io.LimitReader(file.ReadCloser, perFileReadLimit))
where does perFileReadLimit come from?
the reason for limiting file reads is to thwart several cases for decompression bomb attacks, though the limit itself is arbitrary. This is a per-file within the archive limit, not a limit on the size of the archive. We do this same approach for other zip utilities.
Though, this seems to have been copy and pasted a few times in the code, let me extract it into a helper function.
syft/pkg/cataloger/java/cataloger.go
Outdated
}

// java archives wrapped within tar files
for _, pattern := range genericTarGlobs {
Can't wait to see the pattern for making this configurable!
just added in f3ffaed ! 🎉 🌮
	"**/*.tar.zst",
}

// TODO: when the generic archive cataloger is implemented, this should be removed (https://github.com/anchore/syft/issues/246)
#246 sounds great; will make Syft a ton better, I think!
fingers crossed on implementation details and performance costs, but it would be a super awesome feature. I also see that we may be able to use some of the archiver/v4 features that abstract archives to the fs.FS abstraction (still alpha). May be interesting to checkout when we get to #246 .
// integrity check
var _ common.ParserFn = parseZipWrappedJavaArchive
what is this for?
it isn't providing any functionality, but is instead a hint to developers working in this file that this function is meant to keep the same contract as a common.ParserFn.
@wagoodman Approved! Great changes and really awesome work here
Looks good; some comments that aren't really things I would consider need addressing and I re-ran this against the sample data we had and it works as expected 👍
  deb.NewDpkgdbCataloger(),
  rpmdb.NewRpmdbCataloger(),
- java.NewJavaCataloger(),
+ java.NewJavaCataloger(cfg.Java()),
would a better pattern maybe be passing the entire cfg to each cataloger that needs it as some might share the same configuration
I toyed with that idea but leaned towards the direction that focused the concerns for each cataloger. Technically each cataloger should be able to describe exactly what it needs in a small configuration which callers can use (in this case, the java cataloger package having its own Config description). The downside of the other direction is that the java package would be depending on definitions from the cataloger package, which would lead to a circular dependency.
return java.Config{
	SearchUnindexedArchives: c.Search.IncludeUnindexedArchives,
	SearchIndexedArchives:   c.Search.IncludeIndexedArchives,
although this PR is java specific, these would be common variables for #246 probably, would you say?
That's what I'm thinking may be the case as well.
I had the config item as packages.java.search-unindex-archives but changed it to packages.search-unindex-archives in case we add this functionality to other catalogers or full up include this as a configurable to #246 . Also doesn't hurt #246 if different configurables are needed.
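The configuration hand-off discussed in the two threads above could be sketched in a single file as follows. In the real code the types live in separate packages; here `JavaConfig` stands in for `java.Config`, and `SearchConfig`, `CatalogerConfig`, and `JavaCataloger` are hypothetical shapes assumed only so the example compiles.

```go
package main

import "fmt"

// JavaConfig stands in for java.Config: it holds only what the Java cataloger needs.
type JavaConfig struct {
	SearchUnindexedArchives bool
	SearchIndexedArchives   bool
}

// JavaCataloger is a placeholder for the real cataloger type.
type JavaCataloger struct{ cfg JavaConfig }

// NewJavaCataloger accepts the narrow config rather than the whole application config.
func NewJavaCataloger(cfg JavaConfig) *JavaCataloger {
	return &JavaCataloger{cfg: cfg}
}

// SearchConfig and CatalogerConfig model the application-level settings (hypothetical shape).
type SearchConfig struct {
	IncludeUnindexedArchives bool
	IncludeIndexedArchives   bool
}

type CatalogerConfig struct{ Search SearchConfig }

// Java narrows the application config to the Java cataloger's own config, so the
// java package never needs to import the package that defines CatalogerConfig.
func (c CatalogerConfig) Java() JavaConfig {
	return JavaConfig{
		SearchUnindexedArchives: c.Search.IncludeUnindexedArchives,
		SearchIndexedArchives:   c.Search.IncludeIndexedArchives,
	}
}

func main() {
	cfg := CatalogerConfig{Search: SearchConfig{IncludeIndexedArchives: true}}
	cataloger := NewJavaCataloger(cfg.Java())
	fmt.Printf("%+v\n", cataloger.cfg)
}
```

The dependency points one way only: the caller knows about the cataloger's config type, and the cataloger knows nothing about the caller's.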
* add support for searching jars within archives
* add package cataloger config options
* address review comments + factor out safeCopy helper
* update config docs regarding package archive search options
* show that unindexed archive cataloging defaults to false
* remove lies about -s
* address review comments
* update search archive note about java
Signed-off-by: Alex Goodman <[email protected]>
Signed-off-by: fsl <[email protected]>
Today we search for jars as well as jars within jars. This PR additionally adds support for looking for jars within archives (zip, tar, and gzipped/bzip2/brotli/lz4/sz/xz/zst compressed tars). This is done by explicitly looking for archives with supported extensions and inspecting the file listing of each archive to determine whether it contains any Java archives (jar, war, ear, hpi, jpi, etc.). Any that are found are extracted.
Zips and tars are not themselves opened recursively (an archive nested inside another archive is not searched). A jar found within a zip/tar, however, is still searched recursively for additional jars.
Zip functionality has been enabled by default while tar functionality remains behind a new configurable option (see the performance note below).
Performance note
Zip files have a central directory which can be used to do this file-listing search efficiently; tar has no equivalent. To search the file listing of a tar we must read through the entire archive, decompressing it along the way if it is compressed. This will lead to longer scanning times for images or directories that have many small archives (or a few large archives).
Take the given example image:
zip support only
tar support only
zip+tar support
Note that tar-related searches resulted in +10 seconds of searching while zip was negligible
To illustrate this on a real image (gitlab/gitlab:latest):
zip support only
tar support only
zip+tar support
Note that tar-related searches resulted in +30 seconds of searching while zip was negligible
This can be a problem for images that have a lot of tar.gz sources, which is fairly common, as the implication is that we must decompress all tar.gz files regardless if that is useful for discovering packages. For this reason two configurables have been added.
Note that search-unindexed-archives defaults to false to mitigate performance concerns for the common user (indexing within tars is opt-in).
Note for future implementations
Upon implementing #246 we should remove these parser functions entirely and lean on the archive cataloger to discover packages of any kind within archives.