Raymond Maarloeve LLMServer

Official language model (LLM) server for the narrator and NPCs in the Raymond Maarloeve game project.


A lightweight REST API for managing local language models used by NPCs and the narrator in the game. Supports loading multiple models at once, chat-style response generation, and dynamic resource management.

📚 Documentation

Full project documentation is available at:
🔗 https://raymondmaarloeve.github.io/LLMServer/
Main repo: 🔗 https://github.com/RaymondMaarloeve/RaymondMaarloeve

✨ Features

  • 🔁 Supports multiple LLMs loaded simultaneously, each addressed by its `model_id`
  • 🔌 Simple /chat endpoint with full conversation history handling
  • 🚦 Automatic response termination detection using special tags (<npc>, <human>, etc.)
  • 🧹 Ability to unload models from memory (/unload)
  • 📂 File browsing via API (/list-files)
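
The stop-tag detection mentioned above can be sketched as follows. This is an illustrative snippet, not LLMServer source code; the tag set is taken from the feature list, and the real server may handle more tags:

```python
# Illustrative sketch of stop-tag termination (not LLMServer source code).
# The tag set below comes from the feature list above; the actual server
# may recognize additional tags.
STOP_TAGS = ("<npc>", "<human>")

def trim_at_stop_tag(text: str) -> str:
    """Cut a raw completion at the first special speaker tag, if any."""
    cut = len(text)
    for tag in STOP_TAGS:
        idx = text.find(tag)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].rstrip()

print(trim_at_stop_tag("Hmph. What do you want? <human> Hello!"))
# prints: Hmph. What do you want?
```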

🧩 Technologies

  • 🐍 Python
  • 🦙 llama-cpp-python (optionally built with the Vulkan backend)
  • 📦 PyInstaller for standalone builds

🚀 Usage

1. Run the server:

   ```shell
   python main.py
   ```

2. Load a model:

   ```
   POST /load
   {
     "model_id": "npc_village",
     "model_path": "models/ggml-npc-q4.bin",
     "n_ctx": 2048,
     "n_gpu_layers": 16
   }
   ```

3. Send a chat request:

   ```
   POST /chat
   {
     "model_id": "npc_village",
     "messages": [
       {"role": "system", "content": "You are a grumpy blacksmith."},
       {"role": "user", "content": "Hello there!"},
       {"role": "assistant", "content": "Hmph. What do you want?"},
       {"role": "user", "content": "Got any gossip?"}
     ]
   }
   ```

4. Receive the response and display it in-game.
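
The request flow above can be sketched from the client side. `build_chat_payload` is a hypothetical helper, not part of LLMServer; the `model_id` and `messages` field names follow the examples in the steps above:

```python
import json

# Hypothetical client-side helper (not part of LLMServer): assembles the
# body for POST /chat using the field names shown in the usage steps.
def build_chat_payload(model_id, history, user_message):
    """Append the new user turn to the running history and wrap it
    in the request shape expected by the /chat endpoint."""
    messages = list(history) + [{"role": "user", "content": user_message}]
    return {"model_id": model_id, "messages": messages}

payload = build_chat_payload(
    "npc_village",
    [{"role": "system", "content": "You are a grumpy blacksmith."}],
    "Got any gossip?",
)
print(json.dumps(payload, indent=2))
```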

🛠 Building

To build a standalone version:

 CMAKE_ARGS="-DGGML_VULKAN=on" uv pip install llama-cpp-python --no-cache
 uv run pyinstaller --onefile --additional-hooks-dir hooks main.py

🔍 API Endpoints

| Endpoint      | Description                              |
| ------------- | ---------------------------------------- |
| `/load`       | Load a model into memory                 |
| `/chat`       | Generate a response in chat style        |
| `/unload`     | Release model resources                  |
| `/status`     | Check available models and GPU status    |
| `/list-files` | List files in a specified directory      |
| `/register`   | Register a model for lazy-loading        |

The LLMServer project is the foundation of narration and NPC behavior in the world of Raymond Maarloeve.
