Description
Prerequisites
- I am running the latest code. Mention the version if possible as well.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new and useful enhancement to share.
Feature Description
Background
I maintain a fork of llama.cpp for building cross-CPU, cross-platform binaries with Cosmopolitan. Think of it like llamafile, except that it tracks upstream llama.cpp closely (synced weekly or more often). I build with a modified copy of the old Makefile, as I was unable to come up with a CMake setup that works with cosmo.
https://github.com/BradHutchings/llama-server-one
At compile time, cosmo builds an ARM binary and an x86 binary on my build system and merges them into one. This precludes using any of the GPU optimizations, and it has become a little awkward with the CPU optimizations.
I work around this by making a generic CPU architecture for cosmo to compile:
mkdir -p ggml/src/ggml-cpu/arch/cosmo
cp ggml/src/ggml-cpu/repack.cpp ggml/src/ggml-cpu/arch/cosmo/
cp ggml/src/ggml-cpu/quants.c ggml/src/ggml-cpu/arch/cosmo/
sed -i -e "s/_generic//g" ggml/src/ggml-cpu/arch/cosmo/repack.cpp
sed -i -e "s/_generic//g" ggml/src/ggml-cpu/arch/cosmo/quants.c
This works great right now with how ggml-cpu has been reorganized. I'm hoping that a generic architecture that just calls back into the _generic functions in ggml/src/ggml-cpu/repack.cpp and ggml/src/ggml-cpu/quants.c can be added to ggml-cpu/arch. A subfolder, ggml-cpu/arch/generic, seems appropriate.
Motivation
Building llama.cpp with Cosmopolitan.
Possible Implementation
I'm hoping that a generic architecture that just calls back into the _generic functions in ggml/src/ggml-cpu/repack.cpp and ggml/src/ggml-cpu/quants.c can be added to ggml-cpu/arch. A subfolder, ggml-cpu/arch/generic, seems appropriate.