Moonbility - London, UK
https://kiranbaby14.github.io/Kiran-Portfolio/
in/kiranbaby14
https://dev.to/kiranbaby14
https://devpost.com/kiranbaby14
Stars
Production-ready implementation of InvisPose - a revolutionary WiFi-based dense human pose estimation system that enables real-time full-body tracking through walls using commodity mesh routers
A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.
A real-time streaming conversational video system that transforms text interactions into continuous, high-fidelity video responses using autoregressive diffusion.
Open Vision Agents by Stream. Build Vision Agents quickly with any model or video provider. Uses Stream's edge network for ultra-low latency.
Speech-to-speech AI assistant with natural conversation flow, mid-speech interruption, vision capabilities and AI-initiated follow-ups. Features low-latency audio streaming, dynamic visual feedback…
Implementation of "Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length"
very good whiteboard SDK / infinite canvas SDK
👨‍🎨 The ergonomic way to storyboard. Turns sketches and annotations into videos by drawing on a canvas.
StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation
MotionStream: Real-Time Video Generation with Interactive Motion Controls
🎒 Token-Oriented Object Notation (TOON) – Compact, human-readable, schema-aware JSON for LLM prompts. Spec, benchmarks, TypeScript SDK.
"ViMax: Agentic Video Generation (Director, Screenwriter, Producer, and Video Generator All-in-One)"
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
StreamingVLM: Real-Time Understanding for Infinite Video Streams
A comprehensive toolkit for reliably locking, packing and deploying environments for ComfyUI workflows.
deepbeepmeep / Wan2GP
Forked from Wan-Video/Wan2.1. A fast AI Video Generator for the GPU Poor. Supports Wan 2.1/2.2, Qwen Image, Hunyuan Video, LTX Video and Flux.
Reached #13 on Stanford's Terminal Bench leaderboard. Orchestrator, explorer & coder agents working together with intelligent context sharing.
AG-UI: the Agent-User Interaction Protocol. Bring Agents into Frontend Applications.
Kortix – build, manage and train AI Agents.
[NeurIPS 2025] Image editing is worth a single LoRA! 0.1% training data for fantastic image editing! Surpasses GPT-4o in ID persistence~ MoE ckpt released! Only 4GB VRAM is enough to run!
[CVPR 2025] Official PyTorch implementation of "EdgeTAM: On-Device Track Anything Model"
Whisper-Flow is a framework designed to enable real-time transcription of audio content using OpenAI’s Whisper model. Rather than processing entire files after upload (“batch mode”), Whisper-Flow a… (a rough streaming sketch follows after this list)
Let's make video diffusion practical!
💡 VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning
A real-time, CPU-friendly VLM with 500M parameters. Surpasses Moondream2 and SmolVLM. Trains from scratch with ease.
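
The Whisper-Flow entry above contrasts batch transcription (processing whole files after upload) with streaming transcription. The sketch below illustrates only that streaming idea using the open-source openai-whisper package; it is not Whisper-Flow's API, and the model size, window length, and chunk handling are assumptions.

import numpy as np
import whisper  # openai-whisper; transcribe() accepts 16 kHz mono float32 arrays

# Minimal streaming-style sketch (NOT Whisper-Flow's API): instead of
# transcribing a whole file after upload, keep a rolling buffer of recent
# audio and re-transcribe it whenever a new chunk arrives.
SAMPLE_RATE = 16_000      # sample rate Whisper expects
WINDOW_SECONDS = 10       # rolling context length; an assumption

model = whisper.load_model("base")     # model size is an assumption
buffer = np.zeros(0, dtype=np.float32)

def on_audio_chunk(chunk: np.ndarray) -> str:
    """Append a 16 kHz float32 chunk and return the latest window's text."""
    global buffer
    buffer = np.concatenate([buffer, chunk])[-SAMPLE_RATE * WINDOW_SECONDS:]
    result = model.transcribe(buffer, fp16=False)  # fp16=False for CPU runs
    return result["text"]

A real streaming pipeline would also deduplicate overlapping text between windows and use voice-activity detection to decide when a segment is final; the sketch omits both.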