[ZipManager] Add ZIP64 (files > 4GB) support. #27692
|
@claude review |
|
Claude finished @78andyp's task in 1m 10s
Code Review
Overview
This PR successfully adds ZIP64 support to handle files larger than 4GB. The implementation follows the ZIP64 specification and properly upgrades 32-bit fields to 64-bit when needed. The code quality is generally good with proper error handling.
Strengths
Issues Found
1. Bug: Missing extra field skip in main loop (xbmc/filesystem/ZipManager.cpp:219)
After reading the central extra field (lines 206-216), the code should not skip it again at line 219:
// Jump after central file header extra field and file comment
mFile.Seek(ze.eclength + ze.clength,SEEK_CUR);
Problem: The extra field was already read at line 210 (
// Jump after file comment
mFile.Seek(ze.clength, SEEK_CUR);
2. Potential data loss: lhdrOffset field size (xbmc/filesystem/ZipManager.h:55)
The
unsigned int lhdrOffset = 0; // Relative offset of local header
In
uint64_t lhdrOffset64 = Endian_SwapLE64(ReadUnaligned<uint64_t>(p));
info.lhdrOffset = lhdrOffset64; // Potential truncation!
Problem: For ZIP files where the local header is beyond 4GB, this will silently truncate the offset, causing file access to fail.
Fix: Change
3. Code inconsistency: strncpy vs strncpy_s (xbmc/filesystem/ZipManager.cpp:203)
The change from
Concern: This may break the build on Linux/macOS. Did you test this on non-Windows platforms? If
4. Minor: Unused variable (xbmc/filesystem/ZipManager.cpp:363-372)
Variables
Recommendations
Summary
The implementation is solid and demonstrates good understanding of the ZIP64 specification. However, issue #1 is a critical bug that will likely prevent parsing ZIP64 files correctly, and issue #2 will cause data loss for files with local headers beyond 4GB. These should be fixed before merging. Issue #3 needs verification on Linux/macOS platforms. |
|
Critical: Incorrect file position after reading extra field (ZipManager.cpp:219)
Must Fix: Type mismatch for lhdrOffset (ZipManager.h:55)
Must Fix: strncpy_s is non-portable (ZipManager.cpp:203)
Medium: Unused variables in ReadZip64EOCD (ZipManager.cpp:363-365, 385)
Low: Missing validation for ZIP64 offset bounds (ZipManager.cpp:230) |
AI Code Review by LlamaPReview
🎯 TL;DR & Recommendation
Recommendation: Request Changes
This PR adds ZIP64 support for files >4GB but introduces a security flaw in extra field parsing and incomplete error handling, alongside API changes requiring downstream validation.
📄 Documentation Diagram
This diagram illustrates the enhanced ZIP file processing flow with new ZIP64 support introduced in this PR.
sequenceDiagram
participant U as User
participant ZM as ZipManager
participant CF as CFile
participant SE as SZipEntry
U->>ZM: GetZipList(zipPath)
ZM->>CF: Open file
CF-->>ZM: File handle
ZM->>CF: Read EOCD
note over ZM: PR #35;27692 added ZIP64 detection here
alt is ZIP64
ZM->>CF: Read ZIP64 EOCD
CF-->>ZM: cdirOffset64, cdirSize64
else is ZIP32
CF-->>ZM: cdirOffset, cdirSize
end
ZM->>CF: Seek to central directory
loop for each entry
ZM->>CF: Read central header
CF-->>ZM: Header data
ZM->>ZM: Parse entry (including ZIP64 extra)
note over ZM: PR changed to handle 64-bit sizes and extra fields
ZM-->>U: Return SZipEntry list
end
🌟 Strengths
- Implements necessary ZIP64 functionality to handle large archives.
- Security hardening with strncpy_s replacement.
| Priority | File | Category | Impact Summary | Anchors |
|---|---|---|---|---|
| P1 | xbmc/filesystem/ZipManager.h | Architecture | API change to 64-bit sizes risks downstream compatibility. | SZipEntry |
| P1 | xbmc/filesystem/ZipManager.cpp | Correctness and Business Logic | Incomplete ZIP64 detection error handling could fail valid archives. | |
| P1 | xbmc/filesystem/ZipManager.cpp | Correctness and Business Logic | Security flaw in ZIP64 parsing allows out-of-bounds read. | |
| P2 | xbmc/filesystem/ZipManager.cpp | Performance | Potential overflow in loop condition for very large archives. | |
| P2 | xbmc/filesystem/ZipManager.cpp | Security | Incorrect strncpy_s usage may cause string truncation issues. | |
🔍 Notable Themes
- ZIP64 Handling Risks: Multiple issues involve boundary checks and error handling in new ZIP64 logic, posing correctness and security concerns.
- Type Safety: Changes to data types (e.g., uint64_t) require careful validation to prevent implicit conversion bugs.
📈 Risk Diagram
This diagram highlights the security and error handling risks introduced by the new ZIP64 parsing logic in this PR.
sequenceDiagram
participant ZM as ZipManager
participant Buf as Extra Field Buffer
ZM->>ZM: ParseZip64ExtraField(buf, length, info)
ZM->>Buf: Read headerId and dataSize
note over ZM: R5(P1): Missing bounds check before reading data
alt offset + dataSize > length
note over Buf: Risk: Out-of-bounds read if dataSize invalid
else valid bounds
ZM->>Buf: Access data based on dataSize
ZM-->>ZM: Update info with 64-bit values
end
note over ZM: R2(P1): Incomplete error handling in ZIP64 detection
ZM->>ZM: Handle error if ZIP64 EOCD missing
note over ZM: Could cause GetZipList to fail for edge cases
⚠️ **Unanchored Suggestions (Manual Review Recommended)**
The following suggestions could not be precisely anchored to a specific line in the diff. This can happen if the code is outside the changed lines, has been significantly refactored, or if the suggestion is a general observation. Please review them carefully in the context of the full file.
📁 File: xbmc/filesystem/ZipManager.h
Changing SZipEntry::csize and SZipEntry::usize from unsigned int (32-bit) to uint64_t (64-bit) is a structural API change. This directly impacts ZipEntryToFileItem in ZipDirectory.cpp, which passes entry.usize to item->SetSize(entry.usize). The SetSize method likely expects a 64-bit value, so the wider type is compatible. However, ZipFile.cpp contains multiple comparisons and calculations that mix mZipItem.usize with int64_t positions (e.g., iFilePosition > mZipItem.usize). Comparisons between uint64_t and int64_t are well-defined in C++, but the signed operand is converted to unsigned first, so they can produce sign-conversion warnings and subtle bugs if the position is negative or usize exceeds INT64_MAX. All existing code using these fields now handles 64-bit sizes, which is the intended effect of ZIP64 support. No breakage is expected, but the change must be validated across all usages.
Related Code:
struct SZipEntry {
// ...
uint64_t csize = 0; // compressed size
uint64_t usize = 0; // uncompressed size
// ...
};
|
@claude review changes in context of all previous feedback |
|
Claude finished @78andyp's task in 1m 18s
Review of Changes
Summary
Great work addressing the previous feedback! All critical issues have been fixed. The ZIP64 implementation is now solid and ready to merge.
Fixed Issues ✅
1. Critical: Double-seek bug (line 227) - FIXED
2. Critical: Type mismatch for lhdrOffset (ZipManager.h:55) - FIXED
3. Portability: strncpy_s (line 211) - FIXED
4. Medium: Bounds checking in ParseZip64ExtraField (line 430) - FIXED
5. Minor: Better error handling for ZIP64 EOCD (line 169-176) - IMPROVED
Additional Improvements ✅
Beyond fixing the reported issues, the PR includes several good improvements:
Code Quality
The implementation demonstrates solid understanding of:
Testing Recommendation
While the implementation looks correct, consider testing these scenarios if not already done:
Conclusion
All previous critical and medium-priority issues have been properly addressed. The code is well-structured, handles errors appropriately, and should correctly support both ZIP32 and ZIP64 formats. Approved for merge from a code review perspective. |
a5abf20 to
9d4cf1b
Compare
5b735f4 to
66ddae5
Compare
|
@claude review |
|
Claude finished @78andyp's task in 1m 4s
Review of ZIP64 Implementation
Summary
Excellent work addressing all the previous feedback! The ZIP64 implementation is now solid and ready to merge. All critical issues have been properly fixed, and the code demonstrates a thorough understanding of the ZIP64 specification.
✅ Previously Identified Issues - ALL FIXED
1. Critical: Double-seek bug (ZipManager.cpp:231)
2. Critical: Type mismatch for lhdrOffset (ZipManager.h:55)
3. Portability: strncpy_s (ZipManager.cpp:215)
4. Bounds checking in ParseZip64ExtraField (ZipManager.cpp:435)
5. Error handling for ZIP64 EOCD (ZipManager.cpp:173-179)
🌟 Additional Improvements Beyond Feedback
The PR includes several excellent improvements that weren't originally requested:
🔍 Code Quality Assessment
The implementation demonstrates solid understanding of:
📝 Testing Recommendations
While the implementation looks correct, consider testing these scenarios if not already done:
🚀 Conclusion
All previous critical and medium-priority issues have been properly addressed. The code is well-structured, handles errors appropriately, and correctly supports both ZIP32 and ZIP64 formats. Approved for merge from a code review perspective. The implementation will enable Kodi to handle large archive files (>4GB) natively. This gap was found while testing #27580: the Archive VFS extension already supports ZIP64, but with it library zip:// entries are duplicated with archive://. |
|
@claude review |
|
Claude finished @78andyp's task in 1m 21s
Code Review
Summary
Excellent work on the ZIP64 implementation! The code successfully addresses all previously identified critical issues and demonstrates a thorough understanding of the ZIP64 specification. The implementation is clean, well-tested, and ready for merge.
Previous Issues - All Fixed ✅
All critical issues from previous reviews have been properly addressed:
Code Quality Assessment
Strengths
Implementation Details
ZIP64 Detection Logic (lines 83, 173):
const bool Is64{IsZip64(mFile)};
// ... later ...
if (Is64 && !ReadZip64EOCD(mFile, cdirOffset64, cdirSize64))
Conditional Field Parsing (lines 480-498):
if (info.usize == 0xFFFFFFFFu && remaining >= 8) { /* read 64-bit usize */ }
if (info.csize == 0xFFFFFFFFu && remaining >= 8) { /* read 64-bit csize */ }
if (info.lhdrOffset == 0xFFFFFFFFu && remaining >= 8) { /* read 64-bit offset */ }
Two-pass extra field parsing (lines 217-228 and 263-266):
Minor Observations
Testing Verification
The PR includes comprehensive tests:
Conclusion
This is a well-implemented ZIP64 feature that:
The implementation is production-ready and approved for merge. |
ZIP64 Implementation Review
Found several issues requiring fixes:
Critical: Integer overflow in ZipFile.cpp:303 (ZipFile.cpp:303)
Must Fix: Integer underflow risk in ZipFile.cpp:341 (ZipFile.cpp:341)
Must Fix: Buffer overflow risk in ZipManager.cpp:209 (ZipManager.cpp:209)
Medium: Narrowing conversion in ZipDirectory.cpp:35 (ZipDirectory.cpp:35)
Low: Missing error handling (ZipManager.cpp:242-244)
See inline comments for detailed fixes. |
Follow-up Review: ZIP64 Support Implementation
1. Status Update
Since my last review, the PR has undergone significant improvements addressing most of my previous concerns. The ZIP64 support implementation has matured considerably with:
- Resolved Issues: 90% of previously identified concerns have been addressed
- New Test Coverage: Added comprehensive tests for ZIP64 functionality
- Code Quality: Improved error handling, type safety, and architectural consistency
- Documentation: Better code organization and comments
The author has demonstrated responsiveness to feedback and made substantial progress toward production-ready code.
2. Detailed Analysis
2.1 Major Improvements Since Last Review
a) ZIP64 Detection and Handling
The new IsZip64() function provides robust detection by scanning the end of files for ZIP64 signatures:
bool CZipManager::IsZip64(CFile& file) {
const int64_t size{std::min<int64_t>(file.GetLength(), 1024)};
// ... scans for ZIP64_END_CENTRAL_HEADER signature
}
This is a solid approach that:
- Limits I/O to the last 1KB (efficient)
- Uses std::ranges::search for pattern matching
- Handles edge cases with proper bounds checking
b) ZIP64 Extended Information Parsing
The ParseZip64ExtraField() function now properly handles the ZIP64 extended information extra field (ID 0x0001):
void CZipManager::ParseZip64ExtraField(const char* buf, uint16_t length, SZipEntry& info) {
uint16_t offset = 0;
while (offset + 4 <= length) {
uint16_t headerId = Endian_SwapLE16(ReadUnaligned<uint16_t>(buf + offset));
// ... processes 64-bit fields
}
}
This correctly:
- Parses the variable-length extra field format
- Handles all three 64-bit fields (usize, csize, lhdrOffset)
- Maintains proper bounds checking
c) Type Safety Improvements
The codebase now consistently uses proper types:
- uint64_t for all size/offset fields in SZipEntry
- int64_t for file positions and seeks
- Proper casting where necessary
d) Test Coverage
New comprehensive tests verify:
- ZIP64 file reading (Read64)
- Large file handling (BigRead64 - 5GB test file)
- Directory listing (Zip64DirectoryListing)
- Extended information parsing (Zip64ExtendedInformation)
2.2 Specific Code Improvements
a) ZipManager.cpp Enhancements
The central directory processing now properly handles ZIP64:
// Handle ZIP64 if needed
uint64_t cdirOffset64 = cdirOffset;
uint64_t cdirSize64 = cdirSize;
if (IsZip64(mFile) &&
(cdirSize == 0xFFFFFFFFu || cdirOffset == 0xFFFFFFFFu)) {
if (!ReadZip64EOCD(mFile, cdirOffset64, cdirSize64)) {
CLog::LogF(LOGERROR, "ZIP64 EOCD invalid in {}", strFile);
return false;
}
}
b) Error Handling Improvements
File operations now include proper error checking:
if (mFile.Seek(static_cast<int64_t>(cdirOffset64), SEEK_SET) == -1)
  return false;
c) Memory Safety
The ZipEntryToFileItem function now safely handles large files:
item->SetSize(static_cast<int64_t>(entry.usize));
2.3 Architectural Considerations
The implementation maintains good separation of concerns:
- ZIP64 detection is isolated in IsZip64()
- Extended field parsing is in ParseZip64ExtraField()
- EOCD reading is in ReadZip64EOCD()
- Core ZIP handling remains unchanged for non-ZIP64 files
This modular approach makes the code:
- Easier to maintain
- More testable
- Less prone to regression
3. Outstanding Concerns
While most issues have been addressed, a few minor concerns remain:
3.1 Potential Integer Overflow
Location: ZipManager.cpp line 180 (loop condition)
Issue: The cast of cdirOffset64 + cdirSize64 to int64_t could theoretically overflow for extremely large archives (>9.2EB). While practically unlikely, it's worth noting.
Recommendation: Consider adding an overflow check or using unsigned comparison:
while (static_cast<uint64_t>(mFile.GetPosition()) < cdirOffset64 + cdirSize64)
3.2 Error Message Consistency
Issue: Some error messages use LOGDEBUG when they should use LOGERROR for actual failure cases.
Example:
CLog::LogF(LOGDEBUG, "Broken file {}!", strFile); // Should be LOGERROR
3.3 Windows-Specific Code
Issue: The use of strncpy_s was reverted to strncpy for cross-platform compatibility. While understandable, this loses some safety guarantees.
Recommendation: Consider implementing a platform-independent safe string copy utility.
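One possible shape for such a utility is sketched below. The name CopyToFixedBuffer is hypothetical and nothing like it is confirmed to exist in the codebase; the sketch only assumes that the destination is a fixed-size char array, as ze.name appears to be.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical helper: copies src into a fixed-size char array, truncating
// if necessary, and always null-terminates. Returns the characters copied.
template<std::size_t N>
std::size_t CopyToFixedBuffer(char (&dest)[N], std::string_view src)
{
  static_assert(N > 0, "destination needs room for the terminator");
  const std::size_t copyLen = std::min(src.size(), N - 1);
  if (copyLen > 0)
    std::memcpy(dest, src.data(), copyLen);
  dest[copyLen] = '\0';
  return copyLen;
}
```

Deducing the array size from the argument avoids passing a separate length that could drift out of sync with the buffer, which is the main safety property strncpy_s was providing.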
3.4 Test File Availability
Issue: The test files (dummy64.zip, reffile.txt.zip64.zip) are binary and not suitable for diff analysis. While this is expected, it makes reviewing the test cases more challenging.
Recommendation: Add comments in the test file describing:
- Expected size of test files
- Contents/structure
- ZIP64-specific attributes
4. Recommendations
4.1 High Priority (Should Address Before Merge)
- Fix error message severity levels (LOGDEBUG → LOGERROR for actual errors)
- Add overflow protection for the 64-bit position comparison
- Verify all seek operations have proper error checking (most do, but double-check)
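As an illustration of the overflow protection mentioned in this list, a checked computation of the central directory end could look like the sketch below. CentralDirectoryEnd is a hypothetical helper, not code from the PR, and the sizes it guards against are far beyond any practical archive.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper: returns offset + size as a signed file position, or
// std::nullopt if the sum would wrap or not fit into int64_t.
inline std::optional<int64_t> CentralDirectoryEnd(uint64_t offset, uint64_t size)
{
  constexpr uint64_t kMaxPos = static_cast<uint64_t>(std::numeric_limits<int64_t>::max());
  if (offset > kMaxPos || size > kMaxPos - offset)
    return std::nullopt;
  return static_cast<int64_t>(offset + size);
}
```

Computing the end once before the loop, and rejecting the archive when the helper returns no value, would keep the loop condition itself a plain signed comparison.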
4.2 Medium Priority (Could Address Before Merge)
- Implement platform-independent safe string copy instead of reverting to strncpy
- Add more detailed comments about the ZIP64 format in header files
- Consider adding size limits for extremely large archives (security consideration)
4.3 Low Priority (Post-Merge Improvements)
- Performance optimization for very large ZIP64 files (streaming vs. full indexing)
- Memory usage analysis for archives with many small files
- Additional edge case tests (corrupted ZIP64 headers, mixed ZIP32/ZIP64 files)
5. Final Assessment
The ZIP64 implementation has made excellent progress and is now very close to merge-ready. The author has:
✅ Successfully addressed most previous concerns
✅ Added comprehensive test coverage
✅ Improved type safety throughout the codebase
✅ Maintained good architectural separation
✅ Added proper error handling
The remaining issues are minor and don't affect core functionality. With the high-priority recommendations addressed, this PR would be ready for merging.
Quality Rating: 9/10 (Excellent implementation with minor polish needed)
Confidence Level: High (Code is production-ready with minor tweaks)
Merge Recommendation: Approve with requested changes
The implementation demonstrates:
- Deep understanding of the ZIP64 specification
- Careful attention to edge cases
- Good integration with existing code
- Comprehensive testing approach
This will be a valuable addition to Kodi's filesystem capabilities, enabling support for large archive files while maintaining backward compatibility.
|
@claude review recent amendments in context of copilot/llama suggestions. Note - I'm not concerned about files >9EB |
|
Claude finished @78andyp's task in 1m 29s
Review of Recent Amendments
Summary
Excellent work addressing the automated review feedback! All critical issues from the copilot and llamapreview reviews have been properly fixed. The ZIP64 implementation is now robust and production-ready.
Issues Successfully Resolved ✅
1. Integer overflow in ZipFile.cpp:303 - ✅ FIXED
Previous issue:
Fix verified (ZipFile.cpp:303-306):
const uInt avail = (uiBufSize - iDecompressed) > UINT_MAX
                       ? UINT_MAX
                       : static_cast<uInt>(uiBufSize - iDecompressed);
m_ZStream.avail_out = avail;
This properly clamps the value to
2. Integer underflow in ZipFile.cpp:341 - ✅ FIXED
Previous issue: Unsigned subtraction
Fix verified (ZipFile.cpp:343-346):
if (m_iFilePos < 0 || static_cast<uint64_t>(m_iFilePos) >= mZipItem.csize)
  return 0;
if (uiBufSize + static_cast<uint64_t>(m_iFilePos) > mZipItem.csize)
  uiBufSize = mZipItem.csize - m_iFilePos;
This adds proper validation before the subtraction:
Perfect fix that prevents both underflow and invalid reads!
3. Buffer overflow in ZipManager.cpp:209 - ✅ FIXED
Previous issue: Using
Fix verified (ZipManager.cpp:211-214):
memset(ze.name, 0, 255);
const size_t copyLen = std::min(strName.size(), size_t{254});
std::memcpy(ze.name, strName.data(), copyLen);
ze.name[copyLen] = '\0';
You went with an even better solution - using
Excellent choice! This is actually superior to the bot's suggestion.
4. Narrowing conversion in ZipDirectory.cpp:35 - ✅ ACKNOWLEDGED
Issue:
Your response: "I think we can ignore this"
My assessment: I agree with you. While theoretically ZIP64 supports files up to 2^64 bytes (16 exabytes), in practice:
The code is fine as-is. If a file somehow did exceed
Code Quality Assessment
The recent changes demonstrate:
1. Proper bounds checking: All arithmetic operations now have appropriate overflow/underflow protection.
2. Modern C++ practices:
3. Defensive programming:
4. ZIP64 implementation remains solid:
Additional Observations
1. ZipFile.cpp type consistency
This is exactly right for ZIP64 support where sizes can be 64-bit but individual buffer operations are constrained by zlib's API.
2. Extra field parsing
if (offset + dataSize > length)
  break;
This properly prevents out-of-bounds reads in malformed ZIP files. Good security practice.
3. Test coverage
This gives good confidence in the implementation.
Final Verdict
All critical issues raised by the automated reviews have been properly addressed. The implementation is:
✅ Safe: No buffer overflows, integer overflows, or underflows
The code quality is excellent and demonstrates good understanding of both the ZIP64 format and defensive programming practices.
Recommendation: Approve and merge - all issues have been resolved and the implementation is solid. |
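For reference, the two arithmetic guards verified in this review (clamping a 64-bit count for zlib, and checking the position before an unsigned subtraction) can be isolated as in the sketch below. ClampToUInt and ReadableBytes are hypothetical helper names, not functions from the PR; the only external fact assumed is that zlib's avail_out is an unsigned int.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Clamps a 64-bit byte count to what an unsigned int counter (such as
// zlib's avail_out) can hold.
inline unsigned int ClampToUInt(uint64_t bytes)
{
  constexpr uint64_t kMax = std::numeric_limits<unsigned int>::max();
  return static_cast<unsigned int>(std::min(bytes, kMax));
}

// Hypothetical helper: number of bytes that may be read at `pos` from a
// stream of `total` bytes, never more than `requested`.
inline uint64_t ReadableBytes(int64_t pos, uint64_t total, uint64_t requested)
{
  // Reject invalid positions first so the subtraction below cannot wrap.
  if (pos < 0 || static_cast<uint64_t>(pos) >= total)
    return 0;
  return std::min(requested, total - static_cast<uint64_t>(pos));
}
```

Because the position is validated before `total - pos` is evaluated, the unsigned subtraction is always performed on a smaller value, which is the same ordering the verified fix relies on.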
neo1973
left a comment
I didn't give the code a thorough look. Runtime testing was fine, so +1 from me.
|
Thanks |
|
Looks like you're getting a test failure on FreeBSD CI - https://jenkins.kodi.tv/job/FreeBSD/30138/testReport/ |
|
Hi @fuzzard |
Description
Add support for ZIP64 - files > 4GB in size.
Motivation and context
As part of the testing for #27580 I discovered the native zip support in Kodi is only ZIP32. The Archive VFS extension adds ZIP64 support but there is an issue with library zip:// entries being duplicated with archive://.
How has this been tested?
Locally on a ZIP file that is > 4GB in size. It plays the contained MKV.
New tests
What is the effect on users?
Screenshots (if appropriate):
Types of change
Checklist: