A C++ terminal client for interacting with a local Llama-based AI language model server, featuring context-aware conversations with memory persistence and performance metrics.
memoraxx is a lightweight, terminal-based application designed to interact with a local Llama 3.2 model server (via Ollama). It sends user prompts to the server, maintains conversational context using a memory system, and reports CPU usage and response times for each interaction. Key features include:
- Context-Aware Responses: Stores up to 5 recent prompt-response pairs in memory, persisted to `memory.json` for cross-session continuity.
- Performance Metrics: Measures CPU usage (`getrusage`) and response duration for each query.
- Agent Capabilities: Supports tool calling for executing shell commands and extending functionality autonomously.
- User-Friendly Interface: Supports commands like `exit`, `quit`, and `clear`, with fuzzy matching for typos (e.g., `quite` → `quit`).
- Robust JSON Handling: Uses `nlohmann/json` for reliable API communication (see the request sketch below).
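The snippet below is a minimal, illustrative sketch of the request flow these features rely on: build the Ollama `/api/generate` body with nlohmann/json, POST it with libcurl, and read back the `response` field. It is not memoraxx's actual source; names and error handling are simplified.

```cpp
#include <curl/curl.h>
#include <nlohmann/json.hpp>
#include <iostream>
#include <string>

// Collects the HTTP response body into a std::string.
static size_t write_cb(char* data, size_t size, size_t nmemb, void* out) {
    static_cast<std::string*>(out)->append(data, size * nmemb);
    return size * nmemb;
}

int main() {
    nlohmann::json body = {
        {"model", "llama3.2"},
        {"prompt", "What is AI?"},
        {"stream", false}  // ask for one complete JSON object instead of a token stream
    };
    std::string payload = body.dump(), response;

    CURL* curl = curl_easy_init();
    if (!curl) return 1;
    curl_slist* headers = curl_slist_append(nullptr, "Content-Type: application/json");
    curl_easy_setopt(curl, CURLOPT_URL, "http://localhost:11434/api/generate");
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, payload.c_str());
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_cb);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &response);
    curl_easy_setopt(curl, CURLOPT_TIMEOUT, 120L);  // fail instead of hanging forever

    CURLcode rc = curl_easy_perform(curl);
    if (rc == CURLE_OK)
        std::cout << nlohmann::json::parse(response).value("response", "") << "\n";
    curl_slist_free_all(headers);
    curl_easy_cleanup(curl);
    return rc == CURLE_OK ? 0 : 1;
}
```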
- OS: macOS (tested on Sequoia with AppleClang 16.0.0), Linux (Ubuntu 20.04+), Windows (10+)
- Dependencies:
  - `libcurl` (e.g., `libcurl4-openssl-dev` on Ubuntu; included in the macOS SDK)
  - `nlohmann/json` (version ≥ 3.10)
  - CMake (version ≥ 3.10)
- C++20 compiler (e.g., AppleClang, GCC, MSVC)
- Ollama with Llama 3.2 model
- Optional: `libomp` for future parallel processing (not currently required)
- Clone the Repository:
  ```bash
  git clone https://github.com/bniladridas/memoraxx.git
  cd memoraxx
  ```
- Install Dependencies:

  macOS:
  ```bash
  brew install curl nlohmann-json
  ```

  Ubuntu:
  ```bash
  sudo apt-get update
  sudo apt-get install libcurl4-openssl-dev nlohmann-json3-dev cmake build-essential
  ```

  Windows (using vcpkg):
  ```bash
  vcpkg install curl nlohmann-json
  ```
  Ensure vcpkg is integrated with Visual Studio.
- Install Ollama: Follow the instructions at ollama.ai and pull the Llama 3.2 model:
  ```bash
  ollama pull llama3.2
  ```
- Build the Project: Use the build script:
  ```bash
  ./build.sh
  ```
  > [!NOTE]
  > The build script automates the CMake configuration and build process.

  Or manually:
  ```bash
  mkdir build && cd build
  cmake ..
  cmake --build .
  ```
To build and run using Docker:
```bash
docker build -t memoraxx .
docker run -it memoraxx
```
> [!NOTE]
> This builds the application inside a container. For full functionality, Ollama must be running on the host or in a linked container.
- Run Tests:
  ```bash
  ./e2e.sh
  ```
  This runs end-to-end tests, including build verification and integration tests (requires Ollama running).
- Start the Ollama Server:
  ```bash
  ollama serve
  ```
- Run memoraxx:
  ```bash
  ./build/memoraxx
  ```
- Interact:
  - Enter prompts at the `>` cursor.
  - Use commands:
    - `exit` or `quit`: Exit the application.
    - `clear`: Reset conversation memory.
    - Typos are handled (e.g., `quite` → `quit`).
  - Agent Mode: Ask the AI to use tools, e.g., "Run the command 'ls'" to execute shell commands (a minimal tool-execution sketch follows this list).
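How memoraxx executes tool calls is not detailed here; as a rough sketch only, a shell-command tool can be run with `popen` and its output captured so it can be returned to the model. The `run_shell_tool` helper is hypothetical, not the project's actual code.

```cpp
#include <array>
#include <cstdio>
#include <stdexcept>
#include <string>

// Runs a shell command and returns its stdout as a string.
// Illustrative only: a real agent should validate or sandbox commands first.
std::string run_shell_tool(const std::string& command) {
    std::array<char, 256> buffer{};
    std::string output;
    FILE* pipe = popen(command.c_str(), "r");   // POSIX; use _popen on Windows
    if (!pipe) throw std::runtime_error("failed to start command");
    while (fgets(buffer.data(), buffer.size(), pipe) != nullptr)
        output += buffer.data();
    pclose(pipe);
    return output;  // e.g. run_shell_tool("echo hello") -> "hello\n"
}
```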
Example Interaction:
```
Waking up....
Welcome to memoraxx!
Ask anything. Type 'exit', 'quit', or 'clear' to manage memory.
> What is AI?
memoraxx is thinking...
--- AI Response ---
Artificial Intelligence (AI) is the simulation of human intelligence in machines...
-------------------
[memoraxx: brain active...]
[Sat Jul 26 02:45:00 2025, took 2.41715s, CPU usage: 123.456 ms]
> What's its history?
memoraxx is thinking...
--- AI Response ---
Building on our discussion about AI, its history began in the 1950s...
-------------------
[memoraxx: brain active...]
[Sat Jul 26 02:45:15 2025, took 2.83422s, CPU usage: 134.789 ms]
> Run the command 'echo hello'
memoraxx is thinking...
--- AI Response ---
Command output:
hello
-------------------
[memoraxx: brain active...]
[Sat Jul 26 02:45:30 2025, took 1.5s, CPU usage: 100.0 ms]
> quite
[memoraxx: shutting down...]
Exiting. Goodbye!
```
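The `quite` → `quit` correction in the transcript is the fuzzy command matching mentioned above. One common way to implement it, shown purely as a sketch (the function names are hypothetical, not memoraxx's), is to accept an input when its edit distance to a known command is at most 1:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Classic Levenshtein edit distance between two short strings.
size_t edit_distance(const std::string& a, const std::string& b) {
    std::vector<size_t> prev(b.size() + 1), cur(b.size() + 1);
    for (size_t j = 0; j <= b.size(); ++j) prev[j] = j;
    for (size_t i = 1; i <= a.size(); ++i) {
        cur[0] = i;
        for (size_t j = 1; j <= b.size(); ++j) {
            size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            cur[j] = std::min({prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost});
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}

// Maps a near-miss input to a known command, e.g. "quite" -> "quit".
std::string match_command(const std::string& input) {
    for (const std::string& cmd : {"exit", "quit", "clear"})
        if (edit_distance(input, cmd) <= 1) return cmd;
    return input;  // not a command; treat as a normal prompt
}
```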
memoraxx can be configured via a `config.json` file in the project root. If the file is missing, default values are used.
Example `config.json`:
```json
{
  "base_url": "http://localhost:11434/api/generate",
  "model": "llama3.2",
  "max_tokens": 4096,
  "memory_file": "memory.json"
}
```

- `base_url`: URL of the Ollama API endpoint.
- `model`: The model name to use (e.g., "llama3.2").
- `max_tokens`: Maximum number of tokens to store in memory.
- `memory_file`: Path to the file for persisting conversation memory.
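Below is a sketch of how such a file could be loaded with nlohmann/json, falling back to the defaults above when `config.json` or an individual key is absent. The `Config` struct and `load_config` name are hypothetical, not the project's API.

```cpp
#include <fstream>
#include <nlohmann/json.hpp>
#include <string>

struct Config {
    std::string base_url = "http://localhost:11434/api/generate";
    std::string model = "llama3.2";
    int max_tokens = 4096;
    std::string memory_file = "memory.json";
};

// Loads config.json if present; any missing key keeps its default.
Config load_config(const std::string& path = "config.json") {
    Config cfg;
    std::ifstream in(path);
    if (!in) return cfg;  // no file: use defaults
    nlohmann::json j = nlohmann::json::parse(in, nullptr, /*allow_exceptions=*/false);
    if (j.is_discarded()) return cfg;  // malformed JSON: use defaults
    cfg.base_url = j.value("base_url", cfg.base_url);
    cfg.model = j.value("model", cfg.model);
    cfg.max_tokens = j.value("max_tokens", cfg.max_tokens);
    cfg.memory_file = j.value("memory_file", cfg.memory_file);
    return cfg;
}
```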
- Memory System: Stores up to 5 interactions in `memory.json` for context-aware responses across sessions.
- Performance Monitoring: Reports CPU usage (`getrusage`) and response time for each query (see the sketch after this list).
- Error Handling: Robust cURL and JSON parsing with timeouts and HTTP status checks.
- User Experience: Loading animations, command suggestions, and graceful shutdown (Ctrl+C).
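A minimal sketch of how the per-query wall time and CPU time in the example output can be measured with `std::chrono` and `getrusage` (POSIX). Variable names and the exact output format are illustrative, not taken from the project.

```cpp
#include <chrono>
#include <iostream>
#include <sys/resource.h>

// Returns process CPU time (user + system) in milliseconds.
static double cpu_time_ms() {
    rusage usage{};
    getrusage(RUSAGE_SELF, &usage);
    auto to_ms = [](const timeval& tv) {
        return tv.tv_sec * 1000.0 + tv.tv_usec / 1000.0;
    };
    return to_ms(usage.ru_utime) + to_ms(usage.ru_stime);
}

int main() {
    double cpu_before = cpu_time_ms();
    auto wall_before = std::chrono::steady_clock::now();

    // ... send the prompt and wait for the model's response here ...

    std::chrono::duration<double> took = std::chrono::steady_clock::now() - wall_before;
    std::cout << "took " << took.count() << "s, CPU usage: "
              << (cpu_time_ms() - cpu_before) << " ms\n";
}
```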
- main: Stable branch with memory persistence and CPU metrics.
- GPU usage measurement is not implemented (planned for future releases).
- `memory.json` is plain text; encryption is recommended for sensitive data.
Follow the instructions at ollama.ai to install Ollama, then run `ollama pull llama3.2` to get the model.
Use `./build.sh` or follow the manual steps in Installation.
The app stores up to 5 recent interactions in `memory.json` for context-aware responses.
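As an illustration of that mechanism, a rolling five-entry memory persisted to `memory.json` can be built from a `std::deque` and nlohmann/json; the `Memory` class below is hypothetical, not memoraxx's actual implementation.

```cpp
#include <deque>
#include <fstream>
#include <nlohmann/json.hpp>
#include <string>

struct Interaction { std::string prompt, response; };

class Memory {
public:
    // Keep only the most recent `limit` interactions (the README's limit is 5).
    void add(const std::string& prompt, const std::string& response, size_t limit = 5) {
        entries_.push_back({prompt, response});
        while (entries_.size() > limit) entries_.pop_front();
    }

    // Persist to memory.json so the context survives restarts.
    void save(const std::string& path = "memory.json") const {
        nlohmann::json j = nlohmann::json::array();
        for (const auto& e : entries_)
            j.push_back({{"prompt", e.prompt}, {"response", e.response}});
        std::ofstream(path) << j.dump(2);
    }

    // Reload previous interactions, if the file exists and parses.
    void load(const std::string& path = "memory.json") {
        std::ifstream in(path);
        if (!in) return;
        auto j = nlohmann::json::parse(in, nullptr, /*allow_exceptions=*/false);
        if (!j.is_array()) return;
        for (const auto& e : j)
            entries_.push_back({e.value("prompt", ""), e.value("response", "")});
    }

private:
    std::deque<Interaction> entries_;
};
```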
- Meta AI: For the Llama 3.2 model.
- Ollama: For the local model server.
- nlohmann/json: For robust JSON handling.
- libcurl: For HTTP communication.
Apache 2.0 License. See LICENSE for details.