micropython/utarfile: Support creating tar files. #659

Closed
wants to merge 6 commits into from

dpwe
Contributor

@dpwe dpwe commented May 14, 2023

Currently utarfile supports a small subset of the cpython tarfile functionality to read items from an existing tar archive.
This adds a mode='w' option to the constructor, and a tarfile.add('filename') method to support creating tar files.
Only regular files and directories are supported.
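
A minimal sketch of the intended usage, shown with CPython's tarfile module, whose API this change mirrors. On MicroPython the import would be utarfile instead; that the port accepts exactly the same arguments (for example arcname) is an assumption here, not something stated in this description. The file names are made up for illustration.

```python
import os
import tarfile
import tempfile


def make_archive():
    """Create a one-file tar archive and return the member names read back."""
    workdir = tempfile.mkdtemp()

    # A small regular file to archive (hypothetical name).
    src = os.path.join(workdir, "hello.txt")
    with open(src, "w") as f:
        f.write("hello\n")

    # mode="w" on the constructor creates a new archive; add() appends a file.
    archive = os.path.join(workdir, "out.tar")
    t = tarfile.TarFile(archive, mode="w")
    t.add(src, arcname="hello.txt")
    t.close()

    # Read the archive back to confirm what was written.
    t = tarfile.TarFile(archive, mode="r")
    names = t.getnames()
    t.close()
    return names


names = make_archive()
```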

@bwhitman
Contributor

thank you @dpwe! this is very helpful

@dpwe changed the title from "utarfile: Support creating tar files." to "micropython/utarfile: Support creating tar files." on May 14, 2023
@jimmo
Member

jimmo commented May 16, 2023

Thanks @dpwe this looks useful.

Something we need to consider is code size -- someone who just wants to read tar files shouldn't need to pay the extra flash cost. So to that end we've started splitting packages into base and extension packages. This might be a good candidate for that.

The approach is:

  • rename utarfile.py to utarfile/__init__.py (i.e. utarfile is now a package rather than a module).
  • update manifest.py to use package("utarfile") rather than module("utarfile.py")
  • add a new library package "utarfile-create" with utarfile/create.py and a manifest that has require("utarfile") and package("utarfile"). (Other suggestions welcome for the name.. maybe utarfile-add?).
  • update utarfile/__init__.py to try to do from . import create and handle the ImportError gracefully, and add the extra methods to the TarFile class (either with thunks or by conditionally making TarFile inherit from a class provided by create.py; lots of options here).
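
The last step can be sketched as follows, using the conditional base-class option. This is a single-file illustration, not the code from this PR: the module name utarfile_create_demo and the class name _CreateMixin are hypothetical, and inside the real package the import would be a relative one such as from .create import ...

```python
# Stand-in for utarfile/__init__.py, flattened into one file for illustration.
try:
    # Only succeeds if the optional extension has been installed.
    # "utarfile_create_demo" and "_CreateMixin" are hypothetical names.
    from utarfile_create_demo import _CreateMixin
except ImportError:
    # Extension not installed: fall back to an empty base class,
    # so read-only users pay no extra code-size cost.
    class _CreateMixin:
        pass


class TarFile(_CreateMixin):
    def __init__(self, name):
        self.name = name

    def can_write(self):
        # The write methods exist only when the extension supplied them.
        return hasattr(self, "add")


t = TarFile("example.tar")
```

With the extension absent, t.can_write() is False and the class behaves exactly like the read-only version.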

Conceptually the idea is that installing the utarfile-create library package adds an extra optional file to the utarfile package that extends the behavior. You can see this in use in collections, unittest, mip, aioble (they all actually do this in slightly different ways which suit each of the packages).

(Note: I'm using both meanings of "package" in the above... see https://github.com/micropython/micropython-lib/#notes-on-terminology)

Unrelated to this PR... just a note while we're looking at tarfile. I think we should rename utarfile to tarfile and move it to python-stdlib. Need to verify that it matches the CPython API though. See #540 for thoughts on renaming of urequests (and backwards compatibility).

@dpwe
Contributor Author

dpwe commented May 18, 2023

Thanks. I converted to a package and split the tarfile writing into a separate utarfile-write extension that layers on top of utarfile, with minimal additions to the original module.

Because the new methods are top-level members of the TarFile class, I had to monkeypatch them in from within utarfile/__init__.py; let me know if this looks OK.

I also added "append" mode support since it was a small change.

@jimmo
Member

jimmo commented May 19, 2023

> Because the new methods are top-level members of the TarFile class, I had to monkeypatch them in from within utarfile/__init__.py; let me know if this looks OK.

Thanks @dpwe -- yeah there's basically only three ways to do it -- some form of monkeypatch, stubs, or the base-class method I described.
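
For reference, the monkeypatch option looks roughly like this. It is a generic sketch of the technique, not the code in this branch: the members list and the bodies of add and addfile are placeholders invented for the example.

```python
class TarFile:
    """Stand-in for the read-only base class."""

    def __init__(self):
        self.members = []  # placeholder state, for illustration only


# These would live in the extension module (e.g. write.py).
def addfile(self, info):
    self.members.append(info)


def add(self, name):
    # Delegates to addfile, which is also attached to the class below.
    self.addfile(name)


# The monkeypatch itself: attach plain functions as methods after the
# class has been defined.
TarFile.addfile = addfile
TarFile.add = add

t = TarFile()
t.add("a.txt")
t.addfile("b.txt")
```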

Overall the implementation looks great, and thanks for adding the append and examples.

The monkey-patching can be done a bit more simply though (which mainly reduces code size), and I think there are some useful additions that you put in write.py that are worth moving into the base (e.g. the context manager). Also, although it's neater to have the __init__.py / utarfile.py split, it comes at a cost.

I had a quick go at implementing this (and a few other size optimisations along the way)...
f902c54

Overall this goes from 235+1316 for __init__.py+utarfile.py plus 1952 bytes for write.py (total 3503), to 1555 for __init__.py and 1543 for write.py (total 3098).
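
Spelling out the arithmetic behind those totals (figures copied from the comment above, in bytes; the dictionary names are just labels for this calculation):

```python
# Sizes in bytes, as quoted in the comment above.
before = {"__init__.py": 235, "utarfile.py": 1316, "write.py": 1952}
after = {"__init__.py": 1555, "write.py": 1543}

total_before = sum(before.values())
total_after = sum(after.values())
saved = total_before - total_after
percent_saved = round(100 * saved / total_before, 1)

# Base package (read-only users) versus the optional write extension.
base_change = after["__init__.py"] - (before["__init__.py"] + before["utarfile.py"])
write_change = after["write.py"] - before["write.py"]
```

So the base package is essentially unchanged in size (+4 bytes), while the optional write extension shrinks by 409 bytes, for a net saving of 405 bytes (about 11.6%).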

@dpwe
Contributor Author

dpwe commented May 19, 2023

Yes, looks great. That's a nice size reduction!

I was a bit uncomfortable with the ambiguity in how we represent whether a TarInfo is a directory (between .type and .mode). Your unification is great.

Do you want me to do anything more? Should I update my PR to use your version?

@dpwe
Contributor Author

dpwe commented May 21, 2023

I copied your changes into this branch and verified that everything still works.

@jimmo
Member

jimmo commented May 21, 2023

Thanks @dpwe, looks good.

One thing worth pointing out: because this goes from being a module to a package, if someone already has it installed and then updates to the latest version, mip will not remove the old utarfile.py from the filesystem. However, packages of the same name always take precedence over .py files (the importer checks dir, .py, .mpy in that order), so this should not be an issue.
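
The precedence rule can be illustrated on desktop Python, whose default file finder also prefers a package directory over a same-named .py file in the same location. This is only an analogy: MicroPython's importer is separate code, and the sketch below does not exercise it. The name shadow_demo_pkg is made up.

```python
import importlib
import os
import sys
import tempfile


def which_wins(name="shadow_demo_pkg"):
    """Create NAME.py and NAME/__init__.py side by side; report which is imported."""
    root = tempfile.mkdtemp()

    # The stale single-file module left over from an older install.
    with open(os.path.join(root, name + ".py"), "w") as f:
        f.write("KIND = 'module'\n")

    # The newer package with the same name.
    os.mkdir(os.path.join(root, name))
    with open(os.path.join(root, name, "__init__.py"), "w") as f:
        f.write("KIND = 'package'\n")

    sys.path.insert(0, root)
    try:
        importlib.invalidate_caches()
        sys.modules.pop(name, None)
        return importlib.import_module(name).KIND
    finally:
        sys.path.remove(root)


winner = which_wins()
```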

It will also need a version bump to manifest.py, but I will do that when I merge this (tomorrow).



# Inject extra functionality into TarFile.
from . import TarFile, TarInfo
Member

This is a circular import, but I think it works because the package (__init__) is already loaded by this point.

Would it be smaller code to do the following within the TarFile class:

try:
    from .write import _open_write, addfile, add, close
except:
    pass

??
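
To show why an import inside the class body works at all: names bound by a from ... import statement in a class body become class attributes, so imported functions that take self behave as methods. The sketch below is self-contained and uses a fake module registered under the hypothetical name demo_write_ext in place of the real .write submodule; it catches ImportError explicitly rather than using a bare except.

```python
import sys
import types

# Build a stand-in extension module so the example runs on its own.
# "demo_write_ext" is a hypothetical name, not part of this PR.
ext = types.ModuleType("demo_write_ext")


def add(self, name):
    self.names.append(name)


ext.add = add
sys.modules["demo_write_ext"] = ext


class TarFile:
    def __init__(self):
        self.names = []

    # Importing inside the class body binds `add` as a class attribute,
    # i.e. as a method. If the extension is missing, the class simply
    # has no add() method.
    try:
        from demo_write_ext import add
    except ImportError:
        pass


t = TarFile()
t.add("a.txt")
```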

"typeflag": (uctypes.ARRAY | 156, uctypes.UINT8 | 1),
}

# Following https://github.com/python/cpython/blob/3.11/Lib/tarfile.py
Member

Does the code here copy anything verbatim from that Python file? Just need to be careful of licensing.

@dpgeorge
Member

Thanks @dpwe , this is a really nice addition!

@jimmo
Member

jimmo commented May 22, 2023

Squashed and merged in 7128d42 (including suggestions from @dpgeorge). Thanks again @dpwe !

@jimmo jimmo closed this May 22, 2023

4 participants