arjun-kava
🎯 Focusing
Python 48 3 Updated Oct 17, 2025

This project is the official implementation of 'DreamOmni2: Multimodal Instruction-based Editing and Generation'

Python 2,044 182 Updated Oct 20, 2025

[Preprint 2025] Ditto: Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset

Python 334 29 Updated Oct 22, 2025

Official implementation of OpenWBT.

Python 760 83 Updated Jul 30, 2025

[CoRL 2024] Open-TeleVision: Teleoperation with Immersive Active Visual Feedback

Python 1,091 119 Updated Sep 27, 2024

Real-Time and Accurate Full-Body Multi-Person Pose Estimation & Tracking System

Python 8,433 2,020 Updated May 13, 2024

A PyTorch native platform for training generative AI models

Python 4,596 571 Updated Oct 25, 2025

High-performance, semantic turn detection for conversational AI

Python 10 1 Updated Oct 1, 2025

FROM $f(x)$ AND $g(x)$ TO $f(g(x))$: LLMs Learn New Skills in RL by Composing Old Ones

Python 30 3 Updated Oct 17, 2025

The official PyTorch implementation of VM-ASR, a model designed for high-fidelity audio super-resolution.

Python 15 Updated Sep 8, 2025

The best way to get AI coding agents to solve hard problems in complex codebases.

TypeScript 6,462 498 Updated Oct 25, 2025

A single Gradio + React WebUI with extensions for ACE-Step, Kimi Audio, Piper TTS, GPT-SoVITS, CosyVoice, XTTSv2, DIA, Kokoro, OpenVoice, ParlerTTS, Stable Audio, MMS, StyleTTS2, MAGNet, AudioGen, …

TypeScript 2,676 282 Updated Oct 23, 2025

Audio Large Language Models

Python 762 38 Updated Jul 5, 2025

⚡ Python-free Rust inference server — OpenAI-API compatible. GGUF + SafeTensors, hot model swap, auto-discovery, single binary. FREE now, FREE forever.

Rust 3,080 218 Updated Oct 23, 2025

A simple yet powerful agent framework that delivers with open-source models

Python 3,643 353 Updated Oct 24, 2025

NiceWebRL is a Python library for quickly making human subject experiments that leverage machine reinforcement learning environments.

Python 70 7 Updated Oct 9, 2025

Mobile-Agent: The Powerful GUI Agent Family

Python 6,108 609 Updated Oct 17, 2025

A compilation of the best multi-agent papers

TeX 968 79 Updated Oct 20, 2025

A Swift framework for real-time audio and video communication for iOS applications.

Objective-C 2 2 Updated Aug 18, 2025

Fast and local neural text-to-speech engine

C++ 1,322 148 Updated Sep 10, 2025

Official implementation of "WhisperNER: Unified Open Named Entity and Speech Recognition"

Python 196 8 Updated Feb 25, 2025
JavaScript 8 Updated Sep 3, 2025

[ICCV'2025 Highlight] MEMFOF: High-Resolution Training for Memory-Efficient Multi-Frame Optical Flow Estimation

Python 68 2 Updated Sep 29, 2025

A high-performance inference engine for AI models

Rust 1,345 34 Updated Oct 24, 2025

Build an AI Telephony Agent for Inbound and Outbound Calls

Python 224 22 Updated Sep 22, 2025

Updates ASR papers every day

Python 346 18 Updated Oct 25, 2025

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 18,940 1,867 Updated Oct 23, 2025

ncnn is a high-performance neural network inference framework optimized for the mobile platform

C++ 22,192 4,339 Updated Oct 23, 2025