gh-51067: Add remove() and repack() to ZipFile #134627

Open
wants to merge 60 commits into
base: main
60 commits
6aed859
Add `remove()` and `repack()` to `ZipFile`
danny0838 May 24, 2025
5453dbc
📜🤖 Added by blurb_it.
blurb-it[bot] May 24, 2025
80ab2e2
Fix and optimize test code
danny0838 May 24, 2025
72c2a66
Handle common setups with `setUpClass`
danny0838 May 24, 2025
a4b410b
Add tests for mode `w` and `x` for `remove()`
danny0838 May 24, 2025
a9e85c6
Introduce `_calc_initial_entry_offset` and refactor
danny0838 May 24, 2025
236cd06
Optimize `_calc_initial_entry_offset` by introducing cache
danny0838 May 24, 2025
bdc58c7
Introduce `_validate_local_file_entry` and refactor
danny0838 May 24, 2025
c3c8345
Introduce `_debug` and refactor
danny0838 May 24, 2025
1b7d75a
Introduce `_move_entry_data` and rework chunk_size passing
danny0838 May 25, 2025
51c9254
Refactor `_validate_local_file_entry`
danny0838 May 25, 2025
0d971d8
Add `strict_descriptor` option
danny0838 May 25, 2025
8f0a504
Fix and improve validation tests
danny0838 May 25, 2025
0cb8682
Remove obsolete NameToInfo updating
danny0838 May 25, 2025
a788a00
Use `zinfo` rather than `info`
danny0838 May 25, 2025
ae01b8c
Raise on overlapping file blocks
danny0838 May 25, 2025
edee203
Rework writing protection
danny0838 May 25, 2025
555ac78
Update doc
danny0838 May 25, 2025
95fde31
Fix typo
danny0838 May 26, 2025
8a448e4
Add test for bytes between file entries
danny0838 May 26, 2025
4c35eb2
Check `testzip()` after zip file closed
danny0838 May 26, 2025
926338c
Support `repack(removed)`
danny0838 May 26, 2025
e76f9a1
Fix bytes between entries be removed when `removed` is passed
danny0838 May 26, 2025
93f4c25
Fix bad test code
danny0838 May 26, 2025
9e94209
Revise docstring
danny0838 May 27, 2025
3ef72c6
Add `tearDown` for tests
danny0838 May 28, 2025
fbf7588
Rename methods and parameters
danny0838 May 28, 2025
81a419a
Adjust parameter order
danny0838 May 28, 2025
c62a455
Optimize code and revise comment
danny0838 May 28, 2025
a05353c
Improve debug for `_ZipRepacker.repack()`
danny0838 May 29, 2025
3d0240c
Rework `_validate_local_file_entry_sequence` to return size or None
danny0838 May 29, 2025
31c4c93
Rework `_validate_local_file_entry_sequence` to allow passing no `che…
danny0838 May 29, 2025
f8fade1
Introduce `_scan_data_descriptor_no_sig_by_decompression`
danny0838 May 30, 2025
c80d21b
Strip only entries immediately following a referenced entry
danny0838 May 29, 2025
e1caea9
Adjust method names
danny0838 May 30, 2025
2b23d46
Add memory usage test
danny0838 May 30, 2025
de4f15b
Fix rst
danny0838 May 30, 2025
ea3259f
Optimize code
danny0838 Jun 1, 2025
fef92c4
Fix and optimize `_iter_scan_signature`
danny0838 Jun 1, 2025
8067b0c
Fix `_scan_data_descriptor`
danny0838 Jun 1, 2025
92d3a9c
Fix and optimize `_scan_data_descriptor_no_sig`
danny0838 Jun 1, 2025
b5d7ae3
Rename `_trace_compressed_block_end`
danny0838 Jun 1, 2025
1d5ec61
Fix `_scan_data_descriptor_no_sig_by_decompression`
danny0838 Jun 1, 2025
db9d0d6
Add tests for `_ZipRepacker`
danny0838 Jun 1, 2025
aaa566c
Remove unneeded import
danny0838 Jun 1, 2025
578c7c8
Add requirements
danny0838 Jun 1, 2025
c470c33
Fix `_scan_data_descriptor_no_sig_by_decompression` when library not …
danny0838 Jun 1, 2025
b1dcb07
Test with pre-calculated CRC
danny0838 Jun 1, 2025
04cddef
Remove unneeded import
danny0838 Jun 1, 2025
797a62c
Fix and optimize `repack`
danny0838 Jun 1, 2025
3b2f232
Remove unneeded catch type
danny0838 Jun 14, 2025
cb549c9
Patch more explicitly
danny0838 Jun 14, 2025
0f50a6f
Remove unneeded variables
danny0838 Jun 14, 2025
c759b63
Improve dependency check for decompression tests
danny0838 Jun 14, 2025
1ece5b1
Refactor and optimize `RepackHelperMixin`
danny0838 Jun 14, 2025
ce88616
Update NEWS
danny0838 Jun 20, 2025
5f093e5
Sync with danny0838/zipremove@1691ca25bf971cf1e45d5ed7d22c512636f20cb8
danny0838 Jun 20, 2025
11c0937
Revise NEWS
danny0838 Jun 20, 2025
4b2176e
Sync with danny0838/zipremove@1843d87b70e6cb129fb55446eaf4486a87d2af4d
danny0838 Jun 21, 2025
d9824ce
Fix timezone related timestamp issue
danny0838 Jun 21, 2025
60 changes: 60 additions & 0 deletions Doc/library/zipfile.rst
@@ -518,6 +518,66 @@ ZipFile Objects
.. versionadded:: 3.11


.. method:: ZipFile.remove(zinfo_or_arcname)

Removes a member from the archive. *zinfo_or_arcname* may be the full path
of the member or a :class:`ZipInfo` instance.

If multiple members share the same full path, only one is removed when
a path is provided.

This does not physically remove the local file entry from the archive.
Call :meth:`repack` afterwards to reclaim space.

The archive must be opened with mode ``'w'``, ``'x'`` or ``'a'``.

Returns the removed :class:`ZipInfo` instance.

Calling :meth:`remove` on a closed ZipFile will raise a :exc:`ValueError`.

.. versionadded:: next
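
A minimal usage sketch of the remove-then-repack workflow described above, not taken from the PR itself. It assumes `remove()` and `repack()` behave as documented in this diff. Released Python versions do not ship them, so the hypothetical helper `drop_member` falls back to the long-standing workaround of copying every other member into a new archive.

```python
import io
import zipfile


def drop_member(archive_bytes: bytes, arcname: str) -> bytes:
    """Return a copy of the archive without *arcname* (hypothetical helper)."""
    has_new_api = (hasattr(zipfile.ZipFile, "remove")
                   and hasattr(zipfile.ZipFile, "repack"))
    if has_new_api:
        # Assumption: the API lands as documented in this PR.
        buf = io.BytesIO(archive_bytes)
        with zipfile.ZipFile(buf, "a") as zf:
            removed = zf.remove(arcname)   # drops the central directory record
            zf.repack([removed])           # reclaims the local file entry
        return buf.getvalue()

    # Fallback for interpreters without the new methods: rewrite the archive.
    # Unlike remove(), this drops every member whose name equals arcname.
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as src, \
            zipfile.ZipFile(out, "w") as dst:
        for zinfo in src.infolist():
            if zinfo.filename != arcname:
                dst.writestr(zinfo, src.read(zinfo))
    return out.getvalue()


# Build a small archive in memory and drop one member from it.
original = io.BytesIO()
with zipfile.ZipFile(original, "w") as zf:
    zf.writestr("a.txt", b"alpha")
    zf.writestr("b.txt", b"beta")

result = drop_member(original.getvalue(), "a.txt")
```

The fallback rewrites the whole archive, which is the cost the in-place `repack()` is meant to avoid.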


.. method:: ZipFile.repack(removed=None, *, \
strict_descriptor=False[, chunk_size])

Rewrites the archive to remove stale local file entries, shrinking its file
size.

If *removed* is provided, it must be a sequence of :class:`ZipInfo` objects
representing removed entries; only their corresponding local file entries
will be removed.

If *removed* is not provided, the archive is scanned to identify and remove
local file entries that are no longer referenced in the central directory.
The algorithm assumes that local file entries (and the central directory,
which is treated for most purposes as the last entry) are stored consecutively:

#. Data before the first referenced entry is removed only when it appears to
be a sequence of consecutive entries with no extra following bytes; extra
preceding bytes are preserved.
#. Data between referenced entries is removed only when it appears to
be a sequence of consecutive entries with no extra preceding bytes; extra
following bytes are preserved.
#. Entries must not overlap. If any entry's data overlaps with another, a
:exc:`BadZipFile` error is raised and no changes are made.
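
The rules above depend on knowing which byte ranges the central directory still references. The following is a rough sketch of that idea only, not the PR's `_ZipRepacker` implementation. The names `referenced_spans` and `find_gaps` are hypothetical, and entries with data descriptors are deliberately left unhandled to keep the example short.

```python
import io
import struct
import zipfile

# Fixed 30-byte part of a local file header (ZIP APPNOTE, section 4.3.7).
LOCAL_HEADER = struct.Struct("<4sHHHHHLLLHH")


def referenced_spans(data: bytes) -> list[tuple[int, int]]:
    """Return sorted (start, end) spans of entries listed in the central directory."""
    spans = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for zinfo in zf.infolist():
            if zinfo.flag_bits & 0x08:
                raise NotImplementedError("data descriptors not handled in this sketch")
            fields = LOCAL_HEADER.unpack_from(data, zinfo.header_offset)
            if fields[0] != b"PK\x03\x04":
                raise zipfile.BadZipFile("bad local file header signature")
            name_len, extra_len = fields[9], fields[10]
            end = (zinfo.header_offset + LOCAL_HEADER.size
                   + name_len + extra_len + zinfo.compress_size)
            spans.append((zinfo.header_offset, end))
    spans.sort()
    return spans


def find_gaps(spans):
    """Return unreferenced (start, end) ranges; raise if any spans overlap."""
    gaps = []
    pos = 0
    for start, end in spans:
        if start < pos:
            raise zipfile.BadZipFile("overlapping entries")
        if start > pos:
            gaps.append((pos, start))
        pos = end
    return gaps


buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", b"hello")
    zf.writestr("b.txt", b"world!")

spans = referenced_spans(buf.getvalue())
gaps = find_gaps(spans)
```

Any range reported by `find_gaps` is only a candidate for removal. The documented rules additionally require that the range look like a sequence of complete entries before it is stripped. This sketch also ignores the range between the last entry and the central directory.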

When scanning, setting ``strict_descriptor=True`` disables detection of
entries that use an unsigned data descriptor, that is, one written without a
signature (deprecated in the ZIP specification since version 6.3.0, released
on 2006-09-29, and used only by some legacy tools). This improves performance,
but may cause some stale entries to be preserved.
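
For context on what a data descriptor is, here is a small sketch using only the existing `zipfile` API. An entry uses a data descriptor when bit 3 of its general purpose flags is set, which `zipfile` does when writing to a non-seekable stream. The `WriteOnlyStream` class and `uses_data_descriptor` helper are hypothetical names for illustration. The sketch shows the signed form that `zipfile` writes, not the unsigned form that `strict_descriptor` is concerned with.

```python
import io
import zipfile

DATA_DESCRIPTOR_FLAG = 0x08                  # general purpose bit 3
DATA_DESCRIPTOR_SIGNATURE = b"PK\x07\x08"    # signature of a signed descriptor


class WriteOnlyStream:
    """Minimal non-seekable sink exposing only write() and flush()."""

    def __init__(self):
        self._chunks = []

    def write(self, data):
        self._chunks.append(bytes(data))
        return len(data)

    def flush(self):
        pass

    def getvalue(self):
        return b"".join(self._chunks)


def uses_data_descriptor(zinfo: zipfile.ZipInfo) -> bool:
    """Report whether the entry's sizes and CRC follow its data."""
    return bool(zinfo.flag_bits & DATA_DESCRIPTOR_FLAG)


# Writing to a non-seekable stream forces data descriptors.
stream = WriteOnlyStream()
with zipfile.ZipFile(stream, "w") as zf:
    zf.writestr("streamed.txt", b"hello")
streamed = stream.getvalue()

# Writing to a seekable buffer lets zipfile patch the header instead.
seekable = io.BytesIO()
with zipfile.ZipFile(seekable, "w") as zf:
    zf.writestr("plain.txt", b"hello")

with zipfile.ZipFile(io.BytesIO(streamed)) as zf:
    streamed_flags = [uses_data_descriptor(z) for z in zf.infolist()]
with zipfile.ZipFile(io.BytesIO(seekable.getvalue())) as zf:
    seekable_flags = [uses_data_descriptor(z) for z in zf.infolist()]
```

An entry whose descriptor lacks the signature gives a scanner no marker to search for, which is presumably why detecting such entries is the slower path that `strict_descriptor=True` skips.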

*chunk_size* may be specified to control the buffer size when moving
entry data (default is 1 MiB).

The archive must be opened with mode ``'a'``.

Calling :meth:`repack` on a closed ZipFile will raise a :exc:`ValueError`.

.. versionadded:: next


The following data attributes are also available:

.. attribute:: ZipFile.filename