natmod: Allow linking with static libraries #15838
Codecov Report
All modified and coverable lines are covered by tests ✅
Coverage diff, master vs. #15838:
Coverage: 98.59% -> 98.58% (-0.01%)
Files: 167 -> 167
Lines: 21617 -> 21596 (-21)
Hits: 21313 -> 21291 (-22)
Misses: 304 -> 305 (+1)
Branch updated: faa9d08 to ceabe58
@vshymanskyy nice! On RV32 it's able to pull softfp symbols from libgcc when building
A couple of things worth mentioning...
Is a new pip dependency really needed? The archive format isn't that complex to write a reader for, as you already know, and the dependency in question isn't so much code as to require a separate installation step.
Regarding collecting all duplicate symbol messages and printing them all at once (warning: personal preference): it might be useful whilst developing the feature, but I'm not sure how much use an end user would get out of it.
@agatti Thanks for testing it, glad it worked out of the box.
No, I really don't want to write yet another AR parser. The format is not actually simple; it's tricky to get right, as seen from vidstige/ar#17 and similar pull requests on that repo (and the alternatives, which don't work for the same reasons).
Collecting unresolved symbol messages and printing them all at once provides much better insight into the scope of work and leads to much better decisions, which is why all "big" linkers do this (llvm, gcc, even tinycc afaik). And... it will definitely help with issue reports on GitHub.
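To make the "collect first, report once" idea concrete, here is a minimal sketch. It is not the code from this PR: `LinkError`, `check_symbols` and `link` are hypothetical names, and symbols are modelled as plain strings rather than ELF symbol table entries.

```python
class LinkError(Exception):
    """Raised once, carrying every problem found during a link pass."""


def check_symbols(required, defined):
    """Return a sorted list of required symbol names that have no definition."""
    return sorted(set(required) - set(defined))


def link(required, defined):
    """Hypothetical link step: succeed, or report all unresolved symbols at once."""
    unresolved = check_symbols(required, defined)
    if unresolved:
        # Build one message listing every missing symbol instead of
        # stopping at the first one.
        lines = ["{} unresolved symbol(s):".format(len(unresolved))]
        lines += ["  " + name for name in unresolved]
        raise LinkError("\n".join(lines))
    return True
```

With this shape the user sees the whole list of missing symbols in a single failure, which is the behaviour argued for above; whether the list should be truncated for very long outputs is a separate presentation choice.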
@vshymanskyy for the symbol messages, I meant showing them all to the user, my bad. Collecting is one thing, but displaying the whole lot may be a bit too much for the final user, especially if there are a lot of them - it's still an exception being raised so there's possibly quite a bit of text to read (traceback plus entries). Again, personal preference here.
I'm still a bit torn about having an extra dependency. I mean, it does work, don't get me wrong! It's just that it looks like a small dependency compared to pyelftools, that's it. But it's not up to me to decide :)
Still, this just unlocked one more natmod working on RV32, thanks!
Branch updated: ceabe58 to 275a1dd
It turns out that some
/* GNU ld script
*/
OUTPUT_FORMAT(elf64-x86-64)
GROUP ( /usr/lib/x86_64-linux-gnu/libm-2.38.a /usr/lib/x86_64-linux-gnu/libmvec.a )
I've also updated this PR to handle this.
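The script quoted above suggests that a file with a library name can be either a real ar archive or a short GNU ld script that points at other archives. The following is a minimal sketch of how a tool might tell the two apart; it is not taken from this PR. `classify_library` is a hypothetical helper, and the `GROUP` parsing is deliberately simplified (it ignores nested `AS_NEEDED ( ... )` groups and comma separators).

```python
import re

AR_MAGIC = b"!<arch>\n"  # 8-byte global header that starts every Unix ar archive


def classify_library(data):
    """Return ("archive", []) or ("script", [paths listed in GROUP(...)]).

    `data` is the raw content of the candidate library file.
    """
    if data.startswith(AR_MAGIC):
        return "archive", []
    text = data.decode("ascii", errors="replace")
    # Simplified: take everything up to the first closing parenthesis.
    match = re.search(r"GROUP\s*\(([^)]*)\)", text)
    if match is None:
        raise ValueError("neither an ar archive nor a GROUP linker script")
    return "script", match.group(1).split()
```

A caller would then recurse into each returned path and treat those files as the actual archives to search.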
@vshymanskyy This looks like a super contribution to the natmod functionality. In addition to the mentioned issue, I think this would also close #5629.
I am not a reviewer for the MicroPython project, but I took a quick look at the code, and the integration into mpy_ld looks very clean. I also like that it is enabled by default without needing special configuration, so that it will just-work for native module writers.
It might be useful to now use floating point in one of the existing modules (or a new one) - such that this functionality gets tested automatically by CI - and prevent breakage in the future.
Thanks for the comments, I think these are all very relevant. Regarding the size increase, it won't happen automatically; the developer will have to explicitly use this feature. But the docs need to be added for sure.
I noticed now that it was opt-in. That is probably a smart choice, since there is an added Python dependency (ar). And that actually depends on Python 3.11, which I believe is more recent than the minimum version currently required by the MicroPython tooling.
I tested this briefly in the emlfft module in https://github.com/emlearn/emlearn-micropython (where before I had a lot of manual .a files). It built fine on all platforms, and I could delete all the dirty hacks! Hope to test more soon :)
I don't think |
Branch updated: 546005f to 6e66a21
I do not know the details of ar. But on the default Python version on ubuntu-latest Docker image I got the following exception. The function file_digest was added to hashlib in Python 3.11, and updating to Python 3.11 resolved the issue.
|
@jonnor yes, that was used in my code. But it's already fixed in the latest version of this PR. |
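For readers hitting the same problem: `hashlib.file_digest` only exists from Python 3.11 onwards, so code that must also run on 3.10 needs a fallback. The sketch below is one common way to write that fallback and is not the fix applied in this PR; `file_digest_compat` is a hypothetical name and SHA-256 is an arbitrary choice of algorithm.

```python
import hashlib


def file_digest_compat(fileobj, algorithm="sha256", chunk_size=65536):
    """Hex digest of a binary file object, on Python versions before and after 3.11."""
    if hasattr(hashlib, "file_digest"):
        # Python 3.11+: let the standard library read the file.
        return hashlib.file_digest(fileobj, algorithm).hexdigest()
    # Older Pythons: feed the hash object in fixed-size chunks.
    digest = hashlib.new(algorithm)
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()
```

The file must be opened in binary mode, for example `open(path, "rb")`.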
@vshymanskyy thanks, I can confirm that the latest version works with Python 3.10. It also builds all the modules in emlearn-micropython (5 different DSP/ML algorithms) on armv7m/armv7emsp/xtensawin. However, one of the modules fails to find one symbol (out of many) on armv6m. See build log attached below
|
Branch updated: 6e66a21 to 02d5060
@jonnor please try the latest version. These symbols are weak and |
The documentation and CI tests are added. One test fails: |
@vshymanskyy I tested your latest commit now (git SHA 02d5060). The module above that failed to build now builds. However, at least one of the other modules now fails with a duplicate symbol:
|
Branch updated: 8677961 to b1b1294
Branch updated: e08c7a4 to b9293ca
Signed-off-by: Volodymyr Shymanskyy <[email protected]>
Branch updated: b9293ca to a22b653
This is really impressive and useful, @vshymanskyy!
I have some minor inline comments, but as far as I'm concerned none of them are blockers for merging this as-is.
git clone --depth 1 --branch $IDF_VER https://github.com/espressif/esp-idf.git
# doing a treeless clone isn't quite as good as --shallow-submodules, but it
# is smaller than full clones and works when the submodule commit isn't a head.
git -C esp-idf submodule update --init --recursive --filter=tree:0
./esp-idf/install.sh
# Install additional packages into the IDF env
Suggested change:
-# Install additional packages into the IDF env
+# Install additional packages for mpy_ld into the IDF env
You may also want to delete this fragment of Lines 496 to 499 in e44a2c6
|
@projectgus @dpgeorge I believe @vshymanskyy said on Discord that he is rather preoccupied at the moment (the situation in Ukraine remains difficult). So I am sure he would appreciate it if anyone can help out with the small things remaining to get this unblocked and merged :)
If nobody else has time for this I can put the finishing touch to this PR to let it go through, the hard part is already done anyway.
@@ -1465,6 +1490,7 @@ def main():
cmd_parser.add_argument("--arch", default="x64", help="architecture")
cmd_parser.add_argument("--preprocess", action="store_true", help="preprocess source files")
cmd_parser.add_argument("--qstrs", default=None, help="file defining additional qstrs")
cmd_parser.add_argument("-l", dest="libs", action="append", help="Static .a libraries to link")
Suggest adding a long form name --libs or --lib for this (as well as keeping -l).
And lowercase "static .a ..." to match the help of other commands.
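The suggestion can be sketched with plain `argparse`. This is an illustration only: `--lib` is one of the two long names proposed in the review, not necessarily the one that was merged, and `build_parser` is a hypothetical helper rather than a function in mpy_ld.

```python
import argparse


def build_parser():
    """Parser with a repeatable library option, modelled on the diff above."""
    parser = argparse.ArgumentParser(description="sketch of an mpy_ld-style option")
    parser.add_argument(
        "-l",
        "--lib",  # assumed long form; the review proposes --libs or --lib
        dest="libs",
        action="append",  # each occurrence adds one entry to the list
        help="static .a libraries to link",
    )
    return parser
```

Because of `action="append"`, passing `-l libgcc.a --lib libm.a` yields `["libgcc.a", "libm.a"]` in `args.libs`, while passing neither leaves `args.libs` as `None`.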
@agatti yes please, if you could pick this up that would be great! I guess you'll need to make a new PR because you won't have write access to this branch.
Right, I have a version that incorporates most of the changes requested in this PR - however the code changed a bit since the first time I tried and whilst this fully works on ARMv7m and normal soft-float stuff works on RV32 (adding On Picolibc there's no real libm file to speak of, as the floating point code is in So, I can see two options for this:
I'd go for the former option, |
I'm pretty sure more than just ARMv7m is supported by this PR. According to the top post, there are many architectures supported. But that's with newlib, of course.
I suggest getting this PR across the line first, then adding Picolibc support as a follow up. That could all still fit within the 1.25.0 release. (I don't think the amount of work would be that different if you added Picolibc support together with this PR, or as a follow up, right?) The release is at least one week away, so you have that much time, at least. Note that this PR is not critical for the release and can be left until after the release if needed.
Indeed they're all newlib-based. The only other platform that requires soft-float is xtensa as far as I can see, but I'm looking at CI targets right now and natmods aren't built for it.
The amount of work would be the same, but I do have a chunk of contiguous free time available now so I'd rather finish this off in one go if I can. For other things and follow up PRs, I may or may not have the time for them in the next few weeks or so (although w.r.t. inline LX6/LX7 assembler I might have a business case for it so it's different). If I happen to not have the time I don't want to keep you folks waiting :)
Fair enough, I'll see what I can do then. I'll clean up newlib support and submit a PR so there's at least something that can be merged in the meantime.
@agatti I'm pretty sure that |
Glad you're here :) Anyway, I'm quite sure this doesn't seem to happen on the RV32 compiler version for Ubuntu 22.04, the same one running in the CI jobs - unless I'm missing something, that is. Here's what's going on in my environment:
[/micropython/examples/natmod/features2] $ riscv64-unknown-elf-gcc --version
riscv64-unknown-elf-gcc () 10.2.0
[/micropython/examples/natmod/features2] $ riscv64-unknown-elf-gcc -march=rv32imac -mabi=ilp32 --print-file-name=picolibc.specs
/usr/lib/gcc/riscv64-unknown-elf/10.2.0/picolibc.specs
[/micropython/examples/natmod/features2] $ riscv64-unknown-elf-gcc -specs=/usr/lib/gcc/riscv64-unknown-elf/10.2.0/picolibc.specs -march=rv32imac -mabi=ilp32 --print-file-name=libgcc.a
/usr/lib/gcc/riscv64-unknown-elf/10.2.0/rv32imac/ilp32/libgcc.a
[/micropython/examples/natmod/features2] $ riscv64-unknown-elf-gcc -specs=/usr/lib/gcc/riscv64-unknown-elf/10.2.0/picolibc.specs -march=rv32imac -mabi=ilp32 --print-file-name=libm.a
libm.a
[/micropython/examples/natmod/features2] $ riscv64-unknown-elf-gcc -specs=/usr/lib/gcc/riscv64-unknown-elf/10.2.0/picolibc.specs -march=rv32imac -mabi=ilp32 --print-file-name=libc.a
libc.a
[/micropython/examples/natmod/features2] $ cat /usr/lib/gcc/riscv64-unknown-elf/10.2.0/picolibc.specs
[...]
*link:
[...] -L%{-picolibc-prefix=*:%*/lib/picolibc/riscv64-unknown-elf/lib/%M;:/usr/lib/picolibc/riscv64-unknown-elf/lib/%M} [...]
[/micropython/examples/natmod/features2] $ ll /usr/lib/picolibc/riscv64-unknown-elf/lib/rv32imac/ilp32
[...]
-rw-r--r-- 1 root root 9644250 Nov 19 2021 libc.a
[...]
-rw-r--r-- 1 root root 1444 Nov 19 2021 libm.a
I only have a picolibc-based environment on an Ubuntu 22.04 VM - I'll create a different environment to see if this is still the case elsewhere. My changes to
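The transcript above relies on a GCC convention: when `--print-file-name=NAME` cannot locate the file, the driver prints `NAME` back unchanged instead of a path. A minimal sketch of how a script could detect that case follows; `interpret_print_file_name` and `find_library` are hypothetical helpers, not part of mpy_ld or the build system.

```python
import subprocess


def interpret_print_file_name(output, name):
    """Map gcc's --print-file-name output to a path, or None if unresolved.

    gcc echoes the bare file name back when its library search fails.
    """
    result = output.strip()
    if not result or result == name:
        return None
    return result


def find_library(gcc, flags, name):
    """Ask the given gcc driver (e.g. a cross compiler) where it finds `name`."""
    out = subprocess.run(
        [gcc, *flags, "--print-file-name=" + name],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return interpret_print_file_name(out, name)
```

In the session shown, this logic would report `libgcc.a` as found and both `libm.a` and `libc.a` as unresolved, which is exactly the situation that needs a workaround.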
OK, unless I manage to get the specs-driven path resolution working in the CI environment (and if somebody can help...) I can at least get things to compile with this:
diff --git a/py/dynruntime.mk b/py/dynruntime.mk
index 5592db5fa..807befb46 100644
--- a/py/dynruntime.mk
+++ b/py/dynruntime.mk
@@ -110,9 +110,12 @@ CFLAGS_ARCH += -march=rv32imac -mabi=ilp32 -mno-relax
# bare metal RISC-V toolchain with Picolibc rather than Newlib, and the default
# is "nosys" so a value must be provided. To avoid having per-distro
# workarounds, always select Picolibc if available.
-PICOLIBC_SPECS = $(shell $(CROSS)gcc --print-file-name=picolibc.specs)
+PICOLIBC_SPECS := $(shell $(CROSS)gcc --print-file-name=picolibc.specs)
ifneq ($(PICOLIBC_SPECS),picolibc.specs)
-CFLAGS_ARCH += --specs=$(PICOLIBC_SPECS)
+CFLAGS_ARCH += -specs=$(PICOLIBC_SPECS)
+USE_PICOLIBC := 1
+PICOLIBC_ARCH := rv32imac
+PICOLIBC_ABI := ilp32
endif
MICROPY_FLOAT_IMPL ?= none
@@ -125,8 +128,42 @@ MICROPY_FLOAT_IMPL_UPPER = $(shell echo $(MICROPY_FLOAT_IMPL) | tr '[:lower:]' '
CFLAGS += $(CFLAGS_ARCH) -DMICROPY_FLOAT_IMPL=MICROPY_FLOAT_IMPL_$(MICROPY_FLOAT_IMPL_UPPER)
ifeq ($(LINK_RUNTIME),1)
+# All of these picolibc-specific directives are here to work around a
+# limitation of Ubuntu 22.04's RISC-V bare metal toolchain. In short, the
+# specific version of GCC in use (10.2.0) does not seem to take into account
+# extra paths provided by an explicitly passed specs file when performing name
+# resolution via `--print-file-name`.
+#
+# If Picolibc is used and libc.a fails to resolve, then said file's path will
+# be computed by searching the Picolibc libraries root for a libc.a file in a
+# subdirectory whose path is built using the current `-march` and `-mabi`
+# flags that are passed to GCC. The `PICOLIBC_ROOT` environment variable is
+# checked to override the starting point for the library file search, and if
+# it is not set then the default value is used, assuming that this is running
+# on an Ubuntu 22.04 machine.
+#
+# This should be revised when the CI base image is updated to a newer Ubuntu
+# version (that hopefully contains a newer RISC-V compiler) or to another Linux
+# distribution.
+ifeq ($(USE_PICOLIBC),1)
+LIBM_NAME := libc.a
+else
+LIBM_NAME := libm.a
+endif
LIBGCC_PATH := $(realpath $(shell $(CROSS)gcc $(CFLAGS) --print-libgcc-file-name))
-LIBM_PATH := $(realpath $(shell $(CROSS)gcc $(CFLAGS) --print-file-name=libm.a))
+LIBM_PATH := $(realpath $(shell $(CROSS)gcc $(CFLAGS) --print-file-name=$(LIBM_NAME)))
+ifeq ($(USE_PICOLIBC),1)
+ifeq ($(LIBM_PATH),)
+# The CROSS toolchain prefix usually ends with a dash, but that may not be
+# always the case. If the prefix ends with a dash it has to be taken out as
+# Picolibc's architecture directory won't have it in its name. GNU Make does
+# not have any facility to perform character-level text manipulation so we
+# shell out to sed.
+CROSS_PREFIX := $(shell echo $(CROSS) | sed -e 's/-$$//')
+PICOLIBC_ROOT ?= /usr/lib/picolibc/$(CROSS_PREFIX)/lib
+LIBM_PATH := $(PICOLIBC_ROOT)/$(PICOLIBC_ARCH)/$(PICOLIBC_ABI)/$(LIBM_NAME)
+endif
+endif
MPY_LD_FLAGS += $(addprefix -l, $(LIBGCC_PATH) $(LIBM_PATH))
endif
Some assumptions are made, but they are at least documented. Is this approach good enough for this situation? If the path resolution issues I'm seeing depend on the compiler version then my workaround won't be used, and if this happens to be run on a system that uses a "broken" compiler but with Picolibc installed elsewhere this should still be recoverable somewhat - I assume the user will look at the Makefile to at least edit the hardcoded path, see the workaround, and act accordingly in that case.
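For readers who find the Make logic hard to follow, the fallback path computation in the diff above can be restated in Python. This is only a paraphrase for illustration: `picolibc_fallback_path` is a hypothetical function, and the default root mirrors the Ubuntu 22.04 location assumed in the Makefile comments.

```python
import os
import posixpath


def picolibc_fallback_path(cross, arch, abi, lib_name="libc.a", env=None):
    """Build the library path used when gcc cannot resolve it by itself.

    `cross` is the toolchain prefix (e.g. "riscv64-unknown-elf-"); `env`
    stands in for the process environment so the function is easy to test.
    """
    if env is None:
        env = os.environ
    # Same effect as the sed call: drop a single trailing dash, if present.
    prefix = cross[:-1] if cross.endswith("-") else cross
    root = env.get("PICOLIBC_ROOT")
    if root is None:
        # Default location, assuming an Ubuntu 22.04 style installation.
        root = posixpath.join("/usr/lib/picolibc", prefix, "lib")
    return posixpath.join(root, arch, abi, lib_name)
```

With the values from the diff (`rv32imac`, `ilp32`) and the prefix from the earlier shell session, the default result points at the same directory that was listed there.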
Yes, that looks acceptable for a temporary solution. It won't affect newlib (right?) so that's good.
Yes, the workaround kicks in only if |
Merged via 16863, see commit 5197611.
Thanks @vshymanskyy for this great addition, and @agatti for making the final steps!
Cool, I was just coming to this forum to ask about implementing this functionality. Glad that someone's already done the work of building it, and that it's already been reviewed and committed. Before I start experimenting with it, can I just ask someone more familiar with the changes: is it known how this will work with musl? (I'm using Alpine Linux/armv7.) Here's the issue that led me to hankering for this functionality.
You may need to modify |
Summary
When building non-trivial native modules, the compiler runtime needs to be linked manually, which is a tedious process. This PR allows mpy_ld to automatically resolve dependencies and link with libgcc.a and libm.a (or other user-specified static libraries). It also improves the reporting format of multiple definitions and unresolved symbol errors.
Also, the automatic process ensures that only the required object files from archives are linked into the mpy file. When doing it manually, one can include unneeded object files, which will just inflate the binary.
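The "only the required object files" behaviour is the classic way linkers treat static archives: start from the undefined symbols, pull in the member that defines each one, and repeat for any new undefined symbols that member introduces. The sketch below illustrates that worklist under simplifying assumptions; it is not the implementation in this PR. `select_members` is a hypothetical name, and an archive is modelled as a plain dict instead of parsed ELF objects.

```python
def select_members(undefined, archive):
    """Pick the archive members needed to satisfy `undefined`.

    `archive` maps member name -> (defined symbols, undefined symbols).
    Returns (members to link, in pull order; symbols still unresolved).
    """
    # Index: symbol -> first member that defines it.
    provider = {}
    for member, (defs, _needs) in archive.items():
        for sym in defs:
            provider.setdefault(sym, member)

    selected = []
    resolved = set()
    unresolved = set()
    pending = list(undefined)
    while pending:
        sym = pending.pop()
        if sym in resolved or sym in unresolved:
            continue
        member = provider.get(sym)
        if member is None:
            unresolved.add(sym)
            continue
        # Pull the member in, then queue whatever it needs in turn.
        selected.append(member)
        defs, needs = archive[member]
        resolved.update(defs)
        pending.extend(needs)
    return selected, sorted(unresolved)
```

For example, with illustrative members `divsf3.o` (defines `__divsf3`, needs `__clzsi2`), `clzsi2.o` and `muldf3.o`, asking for `__divsf3` selects the first two and leaves `muldf3.o` out of the output.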
Loading large .a files takes quite some time, so the implementation caches the parsing results to minimize the impact on the developer experience (and the planet 🌱).
MICROPY_ARCH_CFLAGS was added so it can be conveniently used in the Makefile (i.e. it can be used to cross-build third-party libs).
Testing
This was developed as part of the wasm2mpy initiative.
As a result, I was able to produce builds without including the runtime object files in the wasm2mpy repository:
https://github.com/vshymanskyy/wasm2mpy/actions/runs/10834266784/job/30063030978
Note: armv6m/coremark build fails due to unrelated reason1, reason2
Also, examples/natmod/features2 was updated to use this functionality so it is included in the regular MicroPython CI tests.
Also, this was successfully used by @agatti and @jonnor. See comments below.
Motivation
emlearn
Trade-offs and Alternatives
.a, resolve dependencies, include required object files in the build process
mpy_ld gets a new (optional) dependency on the ar package. ar is not required unless you either set LINK_RUNTIME=1, or pass the -l option to mpy_ld.