Knot is my client-server TUI for running LLMs locally.
- Conversations are automatically saved to a SQLite database (see the sketch after this list).
- Install models to run locally.
- Markdown rendering, tables, syntax highlighting, etc.
- Load .md files into chat context.
- Generate and download summaries of a given conversation.
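Knot's actual schema isn't documented in this README, but as a rough sketch of what saving a conversation turn to SQLite can look like (the DB path, table, and column names below are illustrative assumptions, not Knot's real internals):

```python
import os
import sqlite3

# Illustrative sketch only: the path, table, and column names are
# assumptions, not Knot's actual schema.
os.makedirs("convo", exist_ok=True)
conn = sqlite3.connect("convo/knot.db")  # hypothetical DB location
conn.execute(
    """CREATE TABLE IF NOT EXISTS messages (
        conversation_id TEXT,
        role TEXT,      -- 'user' or 'assistant'
        content TEXT,
        created_at TEXT DEFAULT CURRENT_TIMESTAMP
    )"""
)
conn.execute(
    "INSERT INTO messages (conversation_id, role, content) VALUES (?, ?, ?)",
    ("abc123", "user", "Hello, Knot!"),
)
conn.commit()
conn.close()
```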
See the TODO section at the bottom of the README for known errors and future improvements.
Knot consists of two components that run simultaneously: server.py, the inference server built on llama-cpp-python, and knot.py, the TUI client that renders the streaming response. A rough sketch of the inference loop follows the stack list below.
- Engine: llama-cpp-python (Python bindings for llama.cpp)
- UI: Rich and prompt_toolkit
- Storage: SQLite database
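Neither file is reproduced in this README, but the core inference loop is roughly this shape; a minimal sketch using llama-cpp-python's streaming chat API (the model path is a placeholder, and the server/client IPC layer is omitted):

```python
from llama_cpp import Llama

# Sketch of the inference side only, not Knot's actual server.py.
# n_gpu_layers=-1 offloads all layers to the GPU (Metal on Apple Silicon).
llm = Llama(model_path="models/your-model.gguf", n_gpu_layers=-1, n_ctx=4096)

messages = [{"role": "user", "content": "Hello!"}]
for chunk in llm.create_chat_completion(messages=messages, stream=True):
    delta = chunk["choices"][0]["delta"]
    if "content" in delta:
        # knot.py would render these incremental tokens with Rich.
        print(delta["content"], end="", flush=True)
```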
To get started, follow the steps below.
Note: Currently optimized for Apple Silicon (M1/M2/M3) with Metal GPU acceleration.
Clone the repository and create your virtual environment:
```bash
# Create the project folder
mkdir knot
cd knot

# Create your venv
python3 -m venv knot
source knot/bin/activate
```

Note: If you don't have it already, this will create a folder titled `convo` in your project root, as well as a SQLite DB inside it for your conversation history.
Compile llama-cpp-python with Metal support to use the Mac's GPU:
```bash
CMAKE_ARGS="-DGGML_METAL=on" pip install llama-cpp-python
```

Note: If you are on Linux/Windows, you should be able to simply omit the `CMAKE_ARGS` part.
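To sanity-check that the Metal build is actually offloading to the GPU, one option is to load any GGUF model from Python with verbose logging and watch for Metal initialization lines (the model path is a placeholder):

```python
from llama_cpp import Llama

# verbose=True prints backend setup; on a working Metal build you should
# see Metal initialization messages and layers being offloaded to the GPU.
llm = Llama(model_path="models/your-model.gguf", n_gpu_layers=-1, verbose=True)
out = llm.create_completion("Say hi:", max_tokens=8)
print(out["choices"][0]["text"])
```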
Install the remaining UI and utility libraries:
```bash
pip install rich prompt_toolkit huggingface_hub
```

Knot is still a WIP, so there is plenty for me to fix, but it can be played around with now.
Note: By default, Phi 3 mini will be ready to download the first time you run Knot. It is under 2.5 GB in size, but you can enter details to load a different model instead.
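Knot drives this download itself; as an illustration of what huggingface_hub does under the hood, fetching a Phi 3 mini GGUF looks something like this (the repo and filename are examples, not necessarily the exact ones Knot uses):

```python
from huggingface_hub import hf_hub_download

# Example repo/quantization; Knot may pull a different one.
path = hf_hub_download(
    repo_id="microsoft/Phi-3-mini-4k-instruct-gguf",
    filename="Phi-3-mini-4k-instruct-q4.gguf",
    local_dir="models",
)
print("Saved to", path)
```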
Run the server in the background and the client in the foreground:
```bash
# In one terminal tab
python3 server.py

# In your second terminal tab
python3 knot.py
```

Note: It's cumbersome to cd into your directory, activate your venv, and run two Python files whenever you want to use Knot. I've created a custom command by adding an alias to my shell config file so that I can run `knot` from anywhere in my terminal to automatically activate the venv and launch the application. Example:
```bash
#!/bin/bash
cd /{YOUR_FILE_PATH}/knot || exit
source {YOUR_VENV_NAME}/bin/activate || exit

# Kill the background server when the client exits
cleanup() {
    kill $SERVER_PID 2>/dev/null
}
trap cleanup EXIT INT TERM

# Start the server in the background, logging output to server.log
python3 server.py > server.log 2>&1 &
SERVER_PID=$!
sleep 1

# Run the client in the foreground
python3 knot.py
```

Make this executable and alias it to `knot` in your shell config.
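For example, if the script is saved as `knot.sh` (the name is up to you), adding `alias knot="/{YOUR_FILE_PATH}/knot.sh"` to your `~/.zshrc` or `~/.bashrc` lets you launch everything with a single `knot` command.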
Type normally to chat, or start a line with `:` to enter a command. Quick overview:
| Command | Action |
|---|---|
| `:new` | Start a new conversation and clear the current context |
| `:history` | List past conversations |
| `:open <id>` | Open a conversation by its partial ID |
| `:delete <id>` | Delete a conversation permanently |
| `:load <file>` | Load a text/md file as context |
| `:summary` | Save a summary of this chat to Downloads |
| `:search <h/d/w> <term>` | Search conversation history (h), device (d), or web URLs (w) |
| `:ask <question>` | Web RAG search |
| `:job <cmd>` | Assign tasks to models (list, set summary, set title, set ask) |
| `:model <cmd>` | Manage active/downloaded models (add, select, list) |
| `:quit` | Exit Knot |
| `:cot <on/off>` | Toggle display of reasoning/thoughts |
| `:help` | View possible commands |
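As a rough illustration of the `:` convention described above (a hypothetical sketch, not Knot's actual source):

```python
# Hypothetical sketch of ':'-prefixed command dispatch; none of these
# handlers are Knot's real internals.
def handle_input(line: str) -> str:
    if not line.startswith(":"):
        return f"(sent to the model) {line}"
    cmd, _, args = line[1:].partition(" ")
    commands = {
        "new": lambda a: "started a new conversation",
        "open": lambda a: f"opening conversation matching id '{a}'",
        "quit": lambda a: "exiting Knot",
    }
    handler = commands.get(cmd)
    return handler(args) if handler else f"unknown command :{cmd} (try :help)"

print(handle_input("hello there"))  # plain text goes to the LLM
print(handle_input(":open 1a2b"))   # ':' lines run commands
```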
To set a model's job using the `:job` command, use `:job set <task> <model_ID>`. Currently, the two tasks a model can be assigned to are summary (i.e. the `:summary` command) and title (i.e. generating a title for the conversation). For example:

- `:job set title 1` ensures all titles are generated using the model with ID `1`.
- `:job set summary 2` ensures all conversations are summarized using the model with ID `2`.
Note: I would currently recommend using a non-CoT model for these jobs (see known errors).
- The `:summary` command sometimes doesn't work well for GPT OSS conversations due to CoT.
- Height gets fixed / standard terminal scrolling gets locked on some long answers. I think this is a limitation of Rich; need to look into it.
- Add ability to "branch" a new conversation from any previous message.
- Need to explore most expedient way to display maths/proofs, etc.
- Explore possibility of web search and/or search over local documents.
- Allow setting paths for models, the DB, summary exports, etc. in-app.
