extremely slow scan for "copyright" files #278
Comments
This can easily happen on Debian when the dpkg cache is on a slow drive and is very large. For instance, in (temporary) CI environments, Docker containers etc., the scan is really fast. Ever since I replaced my SSHD (HDD + SSD cache) with a real SSD, I haven't had any issues with this any more; before, linuxdeploy took 2-3 seconds per item.
I've had the cache in

    time dpkg-query -S '*/xx1' '*/xx2' '*/xx3' 2>/dev/null
    ## Executed in 958.89 millis

Doesn't matter if the patterns are found or not found. Debian 12, Ubuntu 22.04, Ubuntu 20.04: some slower than others. I have no idea what it's doing, since

Anyway, the first improvement was to use xargs to run multiple
But the real improvement (blazing fast) came by ditching
( I might turn this into a plugin. Problem is, I don't really know what LD deploys. I'm just searching for identical library names, and the setup is specific to my app. I'd still like to obtain a map of which files were automagically deployed to which location. Maybe I (shudder) could parse the
Nobody's ever suggested machine readable output from this application. I'm open to such features, but please open another issue.
I'm not sure I want to implement my own parser for those files.
linuxdeploy runs strictly sequentially. It primarily tries to speed things up by trying to avoid running the same operation twice on a file. Adding any kind of parallel operations would be a lot of work given the lack of proper C++ libraries for such a purpose (Python would make it quite easy, though).
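As a rough illustration of the "never run the same operation twice on a file" idea, here is a minimal Python sketch. It is not linuxdeploy's code (which is C++); the `OncePerFile` class and the sample paths are hypothetical, and the cache key is a purely lexical path normalisation, so symlinks to the same file would still be treated as distinct.

```python
import os
from typing import Callable, Dict, TypeVar

T = TypeVar("T")


class OncePerFile:
    """Wrap an operation so it runs at most once per normalised path.

    Hypothetical helper for illustration only; not taken from linuxdeploy.
    """

    def __init__(self, operation: Callable[[str], T]) -> None:
        self._operation = operation
        self._results: Dict[str, T] = {}

    def __call__(self, path: str) -> T:
        # abspath() collapses "." and ".." lexically; it does not resolve symlinks.
        key = os.path.abspath(path)
        if key not in self._results:
            self._results[key] = self._operation(key)
        return self._results[key]
```

Wrapping a slow per-file lookup in `OncePerFile` means a library that is reached through several dependency chains is only looked up the first time it is seen. This keeps the run sequential and only removes repeated work.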
Well, the code is free/open-source, so you can have a look for yourself. While it may be C++, you should be able to get a basic idea of what it does from there. You can grep for
Cool, I think that makes sense given linuxdeploy's strengths.
Yeah, not a good idea and I'm not proposing that. I'm thinking more along the lines of making more operations pluggable by external scripts.
Yeah, me neither, and it's ultimately a

In the case at hand, FWIW, I think
I did, obviously :) (and I've had a long stint with C++, though years pass and C++ mutates). "I don't know" meant there is no machine-readable output, which goes back to point 0. Thanks for clarifying that there is no current design
First of all, thank you for `DISABLE_COPYRIGHT_FILES_DEPLOYMENT`, it really saves the day!

Right now each library / executable is scanned running a separate `dpkg -S`. Then linuxdeploy checks for `/usr/share/doc/PKG/copyright`. This is extremely slow.

`dpkg -S` actually supports multiple query paths in a single call, in case you want to batch the calls.

Another solution would be to add an `--output-deployed-files FNAME` to output the list of files for which the copyright files need to be found. After all, linuxdeploy has valuable logic to include / exclude deployed files from lookup.

I currently extract the AppImage created by linuxdeploy, list all `.so` libraries, look them up all at once in a single `dpkg -S` operation, and copy the results as appropriate. But mine is a simple case.

I've noticed that there already is a framework for deferred operations, so maybe that can be co-opted.
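For reference, the batched lookup described above could look roughly like the following Python sketch. It assumes a Debian-based host with `dpkg` on the PATH, and it assumes `dpkg -S` prints one `package[:arch][, package...]: /path` line per match plus optional `diversion by ...` lines. The function names (`parse_dpkg_search`, `copyright_candidates`, `find_owners`) are made up for this example and are not part of linuxdeploy.

```python
import subprocess
from typing import Dict, List


def parse_dpkg_search(output: str) -> Dict[str, List[str]]:
    """Map each reported file path to the package names that own it."""
    owners: Dict[str, List[str]] = {}
    for line in output.splitlines():
        # Skip blank lines and diversion notices ("diversion by X from: ...").
        if not line.strip() or line.startswith("diversion by "):
            continue
        packages, sep, path = line.partition(": ")
        if not sep:
            continue
        # Drop the architecture qualifier, e.g. "libc6:amd64" -> "libc6".
        names = [pkg.strip().split(":")[0] for pkg in packages.split(",")]
        owners.setdefault(path.strip(), []).extend(names)
    return owners


def copyright_candidates(owners: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Derive the /usr/share/doc/PKG/copyright path for every owning package."""
    return {
        path: [f"/usr/share/doc/{pkg}/copyright" for pkg in pkgs]
        for path, pkgs in owners.items()
    }


def find_owners(paths: List[str]) -> Dict[str, List[str]]:
    """Look up all paths with a single `dpkg -S` call instead of one per file."""
    if not paths:
        return {}
    # dpkg exits non-zero when any path is unowned, so do not raise on failure;
    # unowned paths are simply missing from the result.
    result = subprocess.run(
        ["dpkg", "-S", *paths], capture_output=True, text=True, check=False
    )
    return parse_dpkg_search(result.stdout)
```

Calling `copyright_candidates(find_owners(list_of_deployed_libraries))` yields the candidate copyright files in one pass. Whether each candidate actually exists still needs to be checked before copying, and very long path lists may need to be split into chunks to stay under the command-line length limit.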