Issues: abetlen/llama-cpp-python
#2013: Can't install with GPU support with CUDA Toolkit 12.9 and CUDA 12.9 (opened May 5, 2025 by hunainahmedj)
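For reference, the project's README documents CUDA builds by passing CMake flags through pip, e.g. `CMAKE_ARGS="-DGGML_CUDA=on" pip install --force-reinstall --no-cache-dir llama-cpp-python`; whether CUDA Toolkit 12.9 is actually supported depends on the llama.cpp revision bundled with the release, so treat this as a starting point rather than a confirmed fix.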
#2002: TypeError: 'NoneType' object is not callable in __del__() when exiting (opened Apr 12, 2025 by gbutiri)
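A common workaround for teardown-order errors like this is to release the model explicitly instead of relying on __del__ running at interpreter exit, when module globals may already be None. A minimal sketch, assuming a recent release that exposes Llama.close() (the model path is a placeholder):

```python
from llama_cpp import Llama

# "./model.gguf" is a placeholder path, not taken from the issue.
llm = Llama(model_path="./model.gguf")
try:
    out = llm("Q: Name the planets in the solar system. A:", max_tokens=32)
    print(out["choices"][0]["text"])
finally:
    # Free the native llama.cpp resources explicitly so nothing is left
    # for __del__ to do during interpreter shutdown.
    llm.close()
```

On older releases without close(), deleting the object (`del llm`) before the program exits serves the same purpose.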
#2000: Safeerchalil:codespace-automatic-barnacle-5gv4qx4j775wf47g (opened Apr 11, 2025 by Safeerchalil)
#1999: How to use chat_template with .gguf models? (tokenizer_name not implemented) (opened Apr 11, 2025 by Bobchenyx)
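For context, the Llama constructor accepts a chat_format argument naming a built-in template, and recent releases can also pick up a chat template embedded in the GGUF metadata when no format is given. A minimal sketch (the model path is a placeholder):

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./model.gguf",   # placeholder path
    chat_format="llama-2",       # names a built-in template; omit to let
                                 # recent versions use the template stored
                                 # in the GGUF metadata, if present
)
resp = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello, who are you?"}],
)
print(resp["choices"][0]["message"]["content"])
```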
#1998: Running the basic example from the docs results in TypeError: 'NoneType' object is not callable (opened Apr 11, 2025 by nchammas)
#1988: Intel Mac (i9 - 5500M) (macOS 15.3.2) - ValueError: Failed to create llama_context - llama_init_from_model: failed to initialize Metal backend (opened Mar 29, 2025 by starkAhmed43)
#1984: Streaming: the last chunk does not return token usage info from create_chat_completion_openai_v1 or create_chat_completion, though the server does (opened Mar 27, 2025 by hh23485)
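Since the report says the server path does return usage, one way to get it today is to query the OpenAI-compatible endpoint with stream_options, as in this sketch (the base URL, port, API key, and model name are assumptions about a locally started llama_cpp.server, and rely on the server honoring stream_options as the issue suggests):

```python
from openai import OpenAI

# Assumes `python -m llama_cpp.server ...` is listening on port 8000;
# the key is unused by the local server but required by the client.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="sk-unused")

stream = client.chat.completions.create(
    model="local-model",  # assumed name; must match the server's model
    messages=[{"role": "user", "content": "Say hi"}],
    stream=True,
    stream_options={"include_usage": True},  # request a final usage chunk
)
for chunk in stream:
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="")
    if chunk.usage is not None:  # populated only on the final chunk
        print("\n", chunk.usage)
```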
#1983: Slow when logits_all=True, inconsistent logprobs, and possible solutions (opened Mar 26, 2025 by For-rest2005)
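For context on the trade-off: logits_all=True keeps logits for every position rather than just the last one, which is what token-level logprobs require in the affected versions, but also what makes inference slower and more memory-hungry. A minimal sketch (the model path is a placeholder):

```python
from llama_cpp import Llama

# logits_all=True stores logits for all positions; needed here for
# logprobs, at the cost of speed and memory.
llm = Llama(model_path="./model.gguf", logits_all=True)  # placeholder path

out = llm(
    "The capital of France is",
    max_tokens=1,
    logprobs=5,  # top-5 log probabilities per generated token
)
print(out["choices"][0]["logprobs"])
```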
#1982: Provide a HIP-enabled binary, or allow replacing the bundled llama binary with a custom build (opened Mar 25, 2025 by madprops)
#1979: Set GGML_BUILD_NUMBER to the correct version when building from the PyPI tarball (opened Mar 22, 2025 by booxter)