- Bangalore
- https://videosdk.live/
- @Arjun_Kava
- in/arjun-kava
Stars
This project is the official implementation of 'DreamOmni2: Multimodal Instruction-based Editing and Generation'
[Preprint 2025] Ditto: Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset
Official implementation of OpenWBT.
[CoRL 2024] Open-TeleVision: Teleoperation with Immersive Active Visual Feedback
Real-Time and Accurate Full-Body Multi-Person Pose Estimation & Tracking System
A PyTorch native platform for training generative AI models
High-performance, semantic turn detection for conversational AI
FROM $f(x)$ AND $g(x)$ TO $f(g(x))$: LLMs Learn New Skills in RL by Composing Old Ones
The official PyTorch implementation of VM-ASR, a model designed for high-fidelity audio super-resolution.
The best way to get AI coding agents to solve hard problems in complex codebases.
A single Gradio + React WebUI with extensions for ACE-Step, Kimi Audio, Piper TTS, GPT-SoVITS, CosyVoice, XTTSv2, DIA, Kokoro, OpenVoice, ParlerTTS, Stable Audio, MMS, StyleTTS2, MAGNet, AudioGen, …
⚡ Python-free Rust inference server — OpenAI-API compatible. GGUF + SafeTensors, hot model swap, auto-discovery, single binary. FREE now, FREE forever.
A simple yet powerful agent framework that delivers with open-source models
NiceWebRL is a Python library for quickly making human subject experiments that leverage machine reinforcement learning environments.
Mobile-Agent: The Powerful GUI Agent Family
A compilation of the best multi-agent papers
A Swift framework for real-time audio and video communication for iOS applications.
Fast and local neural text-to-speech engine
Official implementation of "WhisperNER: Unified Open Named Entity and Speech Recognition"
[ICCV'2025 Highlight] MEMFOF: High-Resolution Training for Memory-Efficient Multi-Frame Optical Flow Estimation
Build an AI Telephony Agent for Inbound and Outbound Calls
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
ncnn is a high-performance neural network inference framework optimized for the mobile platform