Video Point Tracker

Local-first video point tracking with a React frontend, an Express/BullMQ backend, ffmpeg-based video processing, and OpenAI-compatible multimodal model inference through Ollama, LM Studio, or llama.cpp.

What it does

Upload a local video file.
Choose a local provider and vision-capable model.
Set a tracking target such as ball, hand, or player.
Sample frames at a configurable FPS and send them to a local multimodal endpoint.
Receive normalized 2D points per frame.
Preview the track over the original video in the browser.
Download a rendered tracked MP4 and the raw JSON result.
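
The sampling and coordinate steps above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names, the `NormalizedPoint` shape, and the convention that coordinates run from 0 to 1 with the origin at the top-left corner are all assumptions.

```typescript
// Hypothetical shape for a normalized point; field names are assumptions.
interface NormalizedPoint {
  x: number; // assumed 0..1, measured from the left edge
  y: number; // assumed 0..1, measured from the top edge
}

// Timestamps (in seconds) at which frames would be sampled for a given FPS.
function sampleTimestamps(durationSecs: number, fps: number): number[] {
  if (durationSecs <= 0 || fps <= 0) return [];
  const count = Math.floor(durationSecs * fps);
  const times: number[] = [];
  for (let i = 0; i < count; i++) {
    times.push(i / fps);
  }
  return times;
}

// Convert a normalized point to pixel coordinates for a given frame size.
// Out-of-range values are clamped so the overlay never leaves the frame.
function toPixels(
  p: NormalizedPoint,
  width: number,
  height: number
): { px: number; py: number } {
  const clamp = (v: number): number => Math.min(1, Math.max(0, v));
  return {
    px: Math.round(clamp(p.x) * width),
    py: Math.round(clamp(p.y) * height),
  };
}
```

Normalized coordinates keep the track independent of resolution, so the same points can drive both the browser overlay and the rendered MP4.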

UI Preview

[Screenshot: Person Tracking]

[Screenshot: Robot Arm Pick Tracking]

Stack

Frontend: React 18, TypeScript, Vite, Zustand, Axios
Backend: Node 20, Express 4, BullMQ, Redis, OpenAI SDK, fluent-ffmpeg, Zod, Winston
Infra: Docker, Docker Compose, nginx, Redis

Repository layout

.
├── backend
├── frontend
├── nginx
├── docker-compose.yml
├── docker-compose.dev.yml
└── .env.example

Quickstart

Ollama (recommended)

ollama pull llava

cp .env.example .env
docker compose up --build

Open http://localhost:3000.

LM Studio

Open LM Studio.
Load a vision-capable model and keep it loaded.
Start the local OpenAI-compatible server on port 1234.

cp .env.example .env

Edit .env and set:

LLM_PROVIDER=lmstudio

Then run:

docker compose up --build

Notes:

The backend uses LM Studio JSON-schema output when the loaded model supports it, which improves coordinate parsing reliability.
If a loaded LM Studio vision model has a custom ID that does not match the usual vision-name heuristics, the app falls back to listing all loaded LM Studio models rather than hiding them.
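
To illustrate why schema-constrained output helps, here is a sketch of the defensive parsing that is needed when a model returns free-form text. The `{ x, y, visible }` response shape and the `parsePoint` name are assumptions for this example, not the backend's confirmed contract.

```typescript
// Hypothetical response shape; the real backend contract may differ.
interface ParsedPoint {
  x: number;
  y: number;
  visible: boolean;
}

// Parse a model reply into a point, returning null for anything malformed.
function parsePoint(raw: string): ParsedPoint | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // reply was not valid JSON
  }
  if (typeof data !== "object" || data === null) return null;
  const rec = data as Record<string, unknown>;
  const x = rec["x"];
  const y = rec["y"];
  const visible = rec["visible"];
  if (typeof x !== "number" || typeof y !== "number") return null;
  if (!isFinite(x) || !isFinite(y)) return null;
  // Normalized coordinates are assumed to lie in the range 0..1.
  if (x < 0 || x > 1 || y < 0 || y > 1) return null;
  return { x, y, visible: typeof visible === "boolean" ? visible : true };
}
```

With JSON-schema output enabled, most of these failure branches should rarely trigger, but a check like this remains a reasonable fallback for models that do not support it.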

llama.cpp

Start the multimodal server first:

./llama-server \
  -m llava-v1.6-mistral-7b.gguf \
  --mmproj mmproj-model-f16.gguf \
  --port 8080 \
  --host 0.0.0.0

Then:

cp .env.example .env

Edit .env and set:

LLM_PROVIDER=llamacpp

Run:

docker compose up --build

Development mode

Start the Redis-backed backend and Vite frontend with hot reload:

cp .env.example .env
docker compose -f docker-compose.yml -f docker-compose.dev.yml up --build

Frontend dev server: http://localhost:5173
Backend API: http://localhost:4000
Full proxied app: http://localhost:3000

API surface

POST /api/track
GET /api/track/progress/:jobId
GET /api/track/result/:jobId
GET /api/track/download/:jobId/:filename
GET /api/models?provider=ollama
GET /api/health
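
As a rough guide to how a client might address these routes, the helper below builds the URLs from a base address. The `trackApi` name and the returned object shape are hypothetical; only the paths come from the list above. Request bodies and response formats are not documented here, so they are left out.

```typescript
// Build URLs for the documented routes. Only the paths are taken from the
// README; the helper name and return shape are illustrative assumptions.
function trackApi(baseUrl: string) {
  const root = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const enc = encodeURIComponent;
  return {
    submit: `${root}/api/track`,
    progress: (jobId: string): string =>
      `${root}/api/track/progress/${enc(jobId)}`,
    result: (jobId: string): string =>
      `${root}/api/track/result/${enc(jobId)}`,
    download: (jobId: string, filename: string): string =>
      `${root}/api/track/download/${enc(jobId)}/${enc(filename)}`,
    models: (provider: string): string =>
      `${root}/api/models?provider=${enc(provider)}`,
    health: `${root}/api/health`,
  };
}
```

For example, `trackApi("http://localhost:3000").models("lmstudio")` yields the model-listing URL for LM Studio through the proxied app.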

Environment variables

See .env.example for the full list. The key variables are:

LLM_PROVIDER
OLLAMA_BASE_URL
LMSTUDIO_BASE_URL
LLAMACPP_BASE_URL
MAX_UPLOAD_MB
MAX_VIDEO_SECS
QUEUE_CONCURRENCY
FRAME_TMP_DIR
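
The sketch below shows one way the provider variables could be resolved into a single base URL. It is an assumption-heavy illustration: the fallback to `ollama` when `LLM_PROVIDER` is unset and the default URLs (each tool's usual local port with a `/v1` suffix) are guesses, and the values in .env.example take precedence. Inside Docker, the host part would likely need to point at the host machine rather than `localhost`.

```typescript
type Provider = "ollama" | "lmstudio" | "llamacpp";

// Assumed defaults based on each tool's usual local port; not taken from
// the repository's .env.example.
const DEFAULT_BASE_URLS: Record<Provider, string> = {
  ollama: "http://localhost:11434/v1",
  lmstudio: "http://localhost:1234/v1",
  llamacpp: "http://localhost:8080/v1",
};

function isProvider(value: string): value is Provider {
  return value === "ollama" || value === "lmstudio" || value === "llamacpp";
}

// Pick the active provider and its base URL from an environment-like map.
function resolveProvider(
  env: Record<string, string | undefined>
): { provider: Provider; baseUrl: string } {
  const raw = env["LLM_PROVIDER"] ?? "ollama"; // assumed default
  if (!isProvider(raw)) {
    throw new Error(`Unsupported LLM_PROVIDER: ${raw}`);
  }
  const overrides: Record<Provider, string | undefined> = {
    ollama: env["OLLAMA_BASE_URL"],
    lmstudio: env["LMSTUDIO_BASE_URL"],
    llamacpp: env["LLAMACPP_BASE_URL"],
  };
  return { provider: raw, baseUrl: overrides[raw] ?? DEFAULT_BASE_URLS[raw] };
}
```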

Notes

The in-browser result player overlays points on the original uploaded clip for immediate inspection.
The backend also renders a downloadable tracked MP4 using ffmpeg drawbox filters.
In Docker, the backend prefers system ffmpeg and ffprobe binaries; FFMPEG_PATH and FFPROBE_PATH can override detection if needed.
When the SSE client disconnects, the backend aborts the active tracking job and cleans up runtime artifacts.
Completed job artifacts are scheduled for deletion 30 minutes after completion.
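
The drawbox rendering mentioned above could be approximated with a filter string like the one built below. This is a sketch under stated assumptions: the `TimedPoint` shape, the marker size and color, and the use of ffmpeg's `enable='between(t,...)'` timeline expression to show each box only during its frame interval are illustrative choices, not the backend's confirmed approach.

```typescript
// Hypothetical input: one normalized point plus the time window it covers.
interface TimedPoint {
  x: number; // normalized 0..1
  y: number; // normalized 0..1
  startSecs: number;
  endSecs: number;
}

// Build a comma-separated chain of ffmpeg drawbox filters, one per point.
// Each box is centered on the point and enabled only within its time window.
function drawboxFilter(
  points: TimedPoint[],
  width: number,
  height: number,
  boxSize: number = 12
): string {
  return points
    .map((p) => {
      const left = Math.max(0, Math.round(p.x * width - boxSize / 2));
      const top = Math.max(0, Math.round(p.y * height - boxSize / 2));
      return (
        `drawbox=x=${left}:y=${top}:w=${boxSize}:h=${boxSize}` +
        `:color=red@0.9:t=fill` +
        `:enable='between(t,${p.startSecs},${p.endSecs})'`
      );
    })
    .join(",");
}
```

The resulting string would be passed to ffmpeg as a video filter (`-vf`). For long clips the chain grows with the number of sampled frames, so a lower sampling FPS keeps it manageable.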

Validation

The frontend production build and the backend TypeScript build are run as part of the implementation workflow. End-to-end validation of the Compose stack still requires a locally running multimodal provider and a real sample video.
