This example simply performs a matrix multiplication, as shown below:
- Simply put `add_subdirectory(ggml-simple)` at the end of `examples/CMakeLists.txt` in the llama.cpp project:
  ```cmake
  ...
  add_subdirectory(tokenize)
  add_subdirectory(train-text-from-scratch)
  add_subdirectory(ggml-simple) # <-- HERE!
  endif()
  ```
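  The new `examples/ggml-simple` directory also needs its own `CMakeLists.txt`. A minimal sketch, modeled on the other llama.cpp examples and assuming the example is a single hypothetical `ggml-simple.cpp` source file:

  ```cmake
  # examples/ggml-simple/CMakeLists.txt (sketch; file name is an assumption)
  set(TARGET ggml-simple)
  add_executable(${TARGET} ggml-simple.cpp)
  install(TARGETS ${TARGET} RUNTIME)
  # the example uses ggml directly, so link the ggml target built by llama.cpp
  target_link_libraries(${TARGET} PRIVATE ggml ${CMAKE_THREAD_LIBS_INIT})
  target_compile_features(${TARGET} PRIVATE cxx_std_11)
  ```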
- Build the llama.cpp project using CMake:

  ```console
  $ cmake -B bld
  ```

  If you have a CUDA-enabled device, you can build the CUDA version instead:

  ```console
  $ cmake -B bld -DGGML_CUDA=on
  ```
- Run `make ggml-simple` in the `bld` directory:

  ```console
  $ make ggml-simple
  ```

- Launch `./bin/ggml-simple`!
  ```console
  $ ./bin/ggml-simple
  main: using Metal backend
  ggml_metal_init: allocating
  ggml_metal_init: found device: Apple M2
  ggml_metal_init: picking default device: Apple M2
  ggml_metal_init: using embedded metal library
  ggml_metal_init: GPU name: Apple M2
  ggml_metal_init: GPU family: MTLGPUFamilyApple8 (1008)
  ggml_metal_init: GPU family: MTLGPUFamilyCommon3 (3003)
  ggml_metal_init: GPU family: MTLGPUFamilyMetal3 (5001)
  ggml_metal_init: simdgroup reduction support = true
  ggml_metal_init: simdgroup matrix mul. support = true
  ggml_metal_init: hasUnifiedMemory = true
  ggml_metal_init: recommendedMaxWorkingSetSize = 17179.89 MB
  main: compute buffer size: 0.0625 KB
  mul mat (4 x 3):
  [  60.00  90.00 42.00
     55.00  54.00 29.00
     50.00  54.00 28.00
    110.00 126.00 64.00 ]
  ggml_metal_free: deallocating
  ```
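For reference, the core of such an example boils down to a handful of ggml calls. Below is a minimal CPU-only sketch, not the example's literal source: it assumes ggml's context API (`ggml_init`, `ggml_mul_mat`, `ggml_graph_compute_with_ctx`) and omits the Metal/CUDA backend setup seen in the log, but its input matrices reproduce the 4 x 3 result printed above.

```c
// Minimal sketch: multiply a 4x2 matrix A against a 3x2 matrix B with ggml.
// Entry (i, j) of the result is dot(row i of A, row j of B), giving the
// same 4 x 3 values as in the log above.
#include "ggml.h"

#include <stdio.h>
#include <string.h>

int main(void) {
    const int rows_A = 4, cols_A = 2;
    const float matrix_A[4 * 2] = {
        2, 8,
        5, 1,
        4, 2,
        8, 6,
    };
    const int rows_B = 3, cols_B = 2;
    const float matrix_B[3 * 2] = {
        10, 5,
         9, 9,
         5, 4,
    };

    struct ggml_init_params params = {
        /*.mem_size   =*/ 16 * 1024 * 1024, // plenty for this tiny graph
        /*.mem_buffer =*/ NULL,
        /*.no_alloc   =*/ false,
    };
    struct ggml_context * ctx = ggml_init(params);

    // ggml stores the fastest-varying (column) dimension in ne[0],
    // so a "rows x cols" matrix is created as (cols, rows).
    struct ggml_tensor * a = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, cols_A, rows_A);
    struct ggml_tensor * b = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, cols_B, rows_B);
    memcpy(a->data, matrix_A, ggml_nbytes(a));
    memcpy(b->data, matrix_B, ggml_nbytes(b));

    // ggml_mul_mat(a, b) contracts over ne[0]; the result has
    // ne = (rows_A, rows_B), i.e. a 4 x 3 matrix here.
    struct ggml_tensor * result = ggml_mul_mat(ctx, a, b);

    // build and run the compute graph on the CPU
    struct ggml_cgraph * gf = ggml_new_graph(ctx);
    ggml_build_forward_expand(gf, result);
    ggml_graph_compute_with_ctx(ctx, gf, /*n_threads=*/1);

    // print entry (i, j) = dot(row i of A, row j of B);
    // memory index is j * ne[0] + i since ne[0] varies fastest
    const float * out = (const float *) result->data;
    printf("mul mat (%d x %d):\n", rows_A, rows_B);
    for (int i = 0; i < rows_A; i++) {
        for (int j = 0; j < rows_B; j++) {
            printf(" %6.2f", out[j * rows_A + i]);
        }
        printf("\n");
    }

    ggml_free(ctx);
    return 0;
}
```

Note that ggml contracts over ne[0] (the shared inner dimension of both inputs), which is why both A and B are stored row-major with their common dimension of 2 as ne[0].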