Tags: shards-lang/llama.cpp
server : fix format_infill (ggml-org#10724)
  * server : fix format_infill
  * fix
  * rename
  * update test
  * use another model
  * update test
  * update test
  * test_invalid_input_extra_req
server : bring back info of final chunk in stream mode (ggml-org#10722)
  * server : bring back into to final chunk in stream mode
  * clarify a bit
  * traling space
llama : use cmake for swift build (ggml-org#10525)
  * llama : use cmake for swift build
  * swift : <> -> ""
  * ci : remove make
  * ci : disable ios build
  * Revert "swift : <> -> """
    This reverts commit d39ffd9.
  * ci : try fix ios build
  * ci : cont
  * ci : cont
  ---------
  Co-authored-by: Georgi Gerganov
server : (refactor) no more json in server_task input (ggml-org#10691)
  * server : (refactor) no more json in server_task input
  * add test for slots endpoint
  * add tests for /props and /slots
  * remove task inf_type
  * fix CI by adding safe_json_to_str
  * add "model_path" to /props
  * update readme
ggml : disable iq4_nl interleave size 8 (ggml-org#10709)
  ggml-ci