Real-time Shinkai-style neural webcam: point your camera at yourself, stream anime.
Powered by AnimeGANv3 Shinkai · ONNX Runtime · OpenVINO · Python
Your webcam feed → scaled down → AnimeGANv3 Shinkai model → scaled back up → you, but animated.
Camera 1280×720
  │
  ▼ scale to 256×144
InferenceThread ─── AnimeGANv3 Shinkai ─── upscale to 1280×720
  │
  ▼
Display (always 30 fps, style updates ~14 fps)
The display thread and the inference thread run independently, so the window stays smooth even while the model is still working on a frame.
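One way to decouple the two threads is a single-item slot that the producer overwrites and the consumer polls. The sketch below is illustrative only: `LatestSlot` is a hypothetical name, and `main.py` may organise this differently.

```python
import threading


class LatestSlot:
    """Single-item mailbox: writers overwrite, readers always see the newest item."""

    def __init__(self):
        self._lock = threading.Lock()
        self._item = None

    def put(self, item):
        # Overwrite whatever is there; stale frames are dropped, never queued.
        with self._lock:
            self._item = item

    def get(self):
        # Non-blocking read; returns None until the first put().
        with self._lock:
            return self._item
```

With one slot for camera frames and one for styled frames, the display loop can redraw at camera rate and simply reuse the last styled frame until the inference thread publishes a new one.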
# 1. Create environment
python -m venv .venv && source .venv/bin/activate
# 2. Install deps
pip install opencv-python onnxruntime onnxruntime-openvino numpy
# 3. Run
python main.py

Controls
| Key | Action |
|---|---|
| M | Mirror flip |
| ESC | Quit |
All knobs are at the top of main.py:
| Variable | Default | Effect |
|---|---|---|
| `INFER_LONG` | 256 | Long-edge inference resolution. Raise to 384/512 for sharper style, costs FPS. |
| `BLEND_ALPHA` | 0.95 | 1.0 = pure Shinkai · 0.0 = original feed |
| `CAMERA_ID` | 0 | Camera device index |
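To make the two image knobs concrete, here is a rough sketch of what they plausibly control. `infer_size` and `blend` are hypothetical helper names, and the linear blend is an assumption based on the `BLEND_ALPHA` description, not a copy of `main.py`.

```python
import numpy as np


def infer_size(width: int, height: int, long_edge: int) -> tuple:
    """Scale (width, height) so the longer side equals long_edge, keeping aspect ratio."""
    scale = long_edge / max(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))


def blend(styled: np.ndarray, original: np.ndarray, alpha: float) -> np.ndarray:
    """Linear mix of two uint8 frames of equal shape; alpha=1.0 keeps only `styled`."""
    mixed = alpha * styled.astype(np.float32) + (1.0 - alpha) * original.astype(np.float32)
    return np.clip(np.rint(mixed), 0, 255).astype(np.uint8)
```

For a 1280×720 camera, `infer_size(1280, 720, 256)` gives `(256, 144)`, which matches the inference resolution quoted in this README.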
AnimeGANv3 Shinkai: trained to reproduce the painterly look of Makoto Shinkai's films (Your Name, Weathering With You).
- Format: ONNX, NHWC layout
- Input: `[1, H, W, 3]` RGB normalised to `[-1, 1]`
- Output: `[1, H, W, 3]` same range, same dims
- Dynamic spatial dims: accepts any resolution
- Size: 4.1 MB
- Source: TachibanaYoshino/AnimeGANv3
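The I/O contract above implies a small amount of conversion around the model call. The following is a minimal sketch, assuming OpenCV-style BGR `uint8` frames and the common `x / 127.5 - 1` mapping to `[-1, 1]`; the function names are hypothetical and the exact scaling used in `main.py` is not stated here.

```python
import numpy as np


def preprocess(frame_bgr: np.ndarray) -> np.ndarray:
    """uint8 BGR (H, W, 3) -> float32 RGB (1, H, W, 3) in [-1, 1]."""
    rgb = frame_bgr[..., ::-1].astype(np.float32)  # BGR -> RGB
    batch = (rgb / 127.5 - 1.0)[np.newaxis, ...]   # add the batch dimension (NHWC)
    return np.ascontiguousarray(batch)


def postprocess(output: np.ndarray) -> np.ndarray:
    """float32 RGB (1, H, W, 3) in [-1, 1] -> uint8 BGR (H, W, 3)."""
    rgb = (output[0] + 1.0) * 127.5
    bgr = np.clip(np.rint(rgb), 0, 255).astype(np.uint8)[..., ::-1]
    return np.ascontiguousarray(bgr)
```

Because the layout is NHWC, no transpose is needed: the frame only gains a leading batch dimension.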
| Backend | Inference @ 256×144 |
|---|---|
| CPU (ORT) | ~8 fps |
| OpenVINO CPU | ~14 fps |
| Display thread | always 30 fps |
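Which row of the table you land on depends on the execution provider ONNX Runtime is given. A hedged sketch of preferring OpenVINO and falling back to plain CPU is shown below; `pick_providers` and `create_session` are hypothetical helpers, while the provider names and `get_available_providers()` are standard ONNX Runtime.

```python
def pick_providers(available):
    """Prefer OpenVINO when the installed ONNX Runtime build offers it, else plain CPU."""
    preferred = ["OpenVINOExecutionProvider", "CPUExecutionProvider"]
    return [p for p in preferred if p in available]


def create_session(model_path):
    # Imported lazily so pick_providers() can be used without ONNX Runtime installed.
    import onnxruntime as ort

    providers = pick_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)
```

Usage would look like `create_session("models/AnimeGANv3_Shinkai.onnx")`; ONNX Runtime tries the providers in list order.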
Stream the stylised feed directly into OBS, Discord, Zoom, or any app that accepts a webcam source via v4l2loopback.
sudo modprobe v4l2loopback devices=1 video_nr=10 \
card_label="AnimeCam" exclusive_caps=1
# Persist across reboots
echo "v4l2loopback" | sudo tee /etc/modules-load.d/v4l2loopback.conf
echo 'options v4l2loopback devices=1 video_nr=10 card_label="AnimeCam" exclusive_caps=1' \
  | sudo tee /etc/modprobe.d/v4l2loopback.conf

# Arch
yay -S v4l2loopback-dkms
# Ubuntu / Debian
sudo apt install v4l2loopback-dkms

import ctypes
import fcntl

import numpy as np

V4L2_BUF_TYPE_VIDEO_OUTPUT = 2
V4L2_FIELD_NONE = 1
V4L2_PIX_FMT_BGR24 = 0x33524742  # fourcc 'BGR3'
# _IOWR('V', 5, struct v4l2_format); encodes the 208-byte struct size of 64-bit Linux
VIDIOC_S_FMT = 0xC0D05605


class v4l2_pix_format(ctypes.Structure):
    _fields_ = [
        ("width", ctypes.c_uint32),
        ("height", ctypes.c_uint32),
        ("pixelformat", ctypes.c_uint32),
        ("field", ctypes.c_uint32),
        ("bytesperline", ctypes.c_uint32),
        ("sizeimage", ctypes.c_uint32),
        ("colorspace", ctypes.c_uint32),
        ("priv", ctypes.c_uint32),
        # Pad to the 200-byte size of the kernel's fmt union.
        ("_reserved", ctypes.c_uint8 * 168),
    ]


class v4l2_format(ctypes.Structure):
    # On 64-bit Linux the fmt union is 8-byte aligned, so it starts at offset 8.
    _fields_ = [
        ("type", ctypes.c_uint32),
        ("_pad", ctypes.c_uint32),
        ("fmt", v4l2_pix_format),
    ]


def open_vcam(device="/dev/video10", width=1280, height=720):
    """Open the loopback device and declare the frame format (packed BGR24)."""
    fd = open(device, "wb", buffering=0)
    fmt = v4l2_format()
    fmt.type = V4L2_BUF_TYPE_VIDEO_OUTPUT
    fmt.fmt.width = width
    fmt.fmt.height = height
    fmt.fmt.pixelformat = V4L2_PIX_FMT_BGR24
    fmt.fmt.field = V4L2_FIELD_NONE
    fmt.fmt.bytesperline = width * 3
    fmt.fmt.sizeimage = width * height * 3
    fcntl.ioctl(fd, VIDIOC_S_FMT, fmt)
    return fd


def write_vcam(fd, frame_bgr: np.ndarray) -> None:
    """Write one frame; it must match the width and height passed to open_vcam()."""
    buf = frame_bgr if frame_bgr.flags["C_CONTIGUOUS"] else np.ascontiguousarray(frame_bgr)
    fd.write(buf.tobytes())

Call write_vcam(fd, result) inside InferenceThread.run() after updating self._out_frame, or spin up a dedicated third thread that reads get_result() and writes to the device at display FPS.
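The dedicated-thread variant could be sketched roughly as follows. `VcamWriter` is a hypothetical class; it takes the frame source and the write function as callables so that it does not assume anything about how `InferenceThread` is implemented.

```python
import threading
import time


class VcamWriter(threading.Thread):
    """Polls get_result() at a fixed rate and forwards each frame to write_frame()."""

    def __init__(self, get_result, write_frame, fps=30.0):
        super().__init__(daemon=True)
        self._get_result = get_result
        self._write_frame = write_frame
        self._period = 1.0 / fps
        self._stop_event = threading.Event()

    def run(self):
        next_tick = time.monotonic()
        while not self._stop_event.is_set():
            frame = self._get_result()
            if frame is not None:  # nothing to send until the first styled frame exists
                self._write_frame(frame)
            next_tick += self._period
            # Event.wait doubles as an interruptible sleep until the next tick.
            self._stop_event.wait(max(0.0, next_tick - time.monotonic()))

    def stop(self):
        self._stop_event.set()
        self.join()
```

Assuming the inference thread exposes `get_result()`, wiring it up would be along the lines of `VcamWriter(inference.get_result, lambda f: write_vcam(fd, f)).start()`. The same styled frame is re-sent until a newer one arrives, which keeps the virtual camera at a steady rate.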
The long-term goal is a self-contained C++ application that replaces the Python prototype entirely: no interpreter, no venv, one binary you drop anywhere and run.
Why C++
- Single statically-linked executable: copy it to any Linux machine with a compatible glibc and run it
- No Python runtime, no pip, no virtual environment management
- Lower latency: direct memory path from camera → model → v4l2 device
- Easier to package as an OBS plugin or systemd service
- Full control over threading, memory layout, and buffer lifetimes
Planned architecture
┌─────────────────────────────────────────────────────────┐
│ animecam (single binary)                                │
│                                                         │
│ Thread 1: CameraCapture                                 │
│   V4L2 → mmap capture → raw BGR frames                  │
│   → LatestFrame<cv::Mat> (lock-free slot, drop-old)     │
│                                                         │
│ Thread 2: Inference (ncnn-vulkan or ORT)                │
│   BGR frame → resize → Shinkai model                    │
│   → stylised BGR frame                                  │
│   → LatestFrame<cv::Mat>                                │
│                                                         │
│ Thread 3: VirtualCam output                             │
│   stylised frame → VIDIOC_S_FMT → write() → /dev/videoX │
│   OBS / Discord / Zoom sees it as a regular webcam      │
└─────────────────────────────────────────────────────────┘
Key components to implement
| Component | Technology |
|---|---|
| Camera capture | V4L2 mmap buffers (zero-copy) |
| Inference backend | ncnn + Vulkan backend (AMD iGPU, target 30+ fps) |
| Model format | ONNX → ncnn param/bin (FP16) |
| Virtual camera | v4l2loopback · VIDIOC_S_FMT · raw write() |
| Build system | CMake · static linking where possible |
| CLI | `--device`, `--vcam`, `--infer-size`, `--blend` flags |
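The planned flags can be tried out in the Python prototype before the C++ port. The sketch below is an assumption about how they might map onto the existing knobs: the defaults are taken from the configuration table and the `/dev/video10` device used in this README, and `build_parser` is a hypothetical helper.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="animecam")
    parser.add_argument("--device", type=int, default=0,
                        help="camera device index")
    parser.add_argument("--vcam", default="/dev/video10",
                        help="v4l2loopback output device")
    parser.add_argument("--infer-size", type=int, default=256,
                        help="long-edge inference resolution")
    parser.add_argument("--blend", type=float, default=0.95,
                        help="1.0 = pure Shinkai, 0.0 = original feed")
    return parser
```

`build_parser().parse_args()` then yields `args.device`, `args.vcam`, `args.infer_size`, and `args.blend`.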
Static binary checklist
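# Link ncnn statically (NCNN_SHARED_LIB is ncnn's shared-library switch)
set(NCNN_SHARED_LIB OFF)
# Link OpenCV statically (or use minimal subset)
set(BUILD_SHARED_LIBS OFF)
# Link libgcc/libstdc++ statically and strip symbols (-s); compressing the binary is a separate post-build step
set(CMAKE_EXE_LINKER_FLAGS "-static-libgcc -static-libstdc++ -s")

OBS integration path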
Option A: Virtual webcam (simplest)
animecam writes to /dev/video10 → OBS adds it as "Video Capture Device"
Option B: OBS plugin (advanced)
Register an obs_source_info (async video source) and push frames with obs_source_output_video()
animecam becomes a native OBS source plugin (.so)
Users install it by copying the .so into the OBS plugin folder
Estimated milestone order
- Port Python pipeline to C++ with OpenCV + ncnn-vulkan (no virtual cam yet)
- Add v4l2loopback write, then verify OBS sees the feed
- Replace Haar face detection with YuNet (OpenCV DNN, no extra deps)
- Static link and strip binary, then verify it runs without any system libs beyond glibc
- Package as single `.tar.gz` with install script for modprobe persistence
- (optional) OBS native plugin wrapper
anime-cam/
├── main.py # entire pipeline
├── requirements.txt
└── models/
    └── AnimeGANv3_Shinkai.onnx # 4.1 MB
opencv-python
onnxruntime
onnxruntime-openvino # optional, ~30 % faster on CPU
numpy
Built for streamers. Runs on CPU. No GPU required.