Highlights
ai
Collection of leaked system prompts
List of Korean speech recognition (STT) APIs, with performance benchmarks for each.
The most powerful and modular diffusion model GUI, API and backend with a graph/nodes interface.
Stable Diffusion web UI
Generative Models by Stability AI
Foundational model for human-like, expressive TTS
A multi-voice TTS system trained with an emphasis on quality
Robust Speech Recognition via Large-Scale Weak Supervision
Visualize Different Text Splitting Methods
Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
The fastai book, published as Jupyter Notebooks
This repository contains the ML For Games Course
A curated list of awesome AI tools for game developers
Refine high-quality datasets and visual AI models
BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.
Deep Learning from Scratch ❺ (『ゼロから作る Deep Learning ❺』, O'Reilly Japan, 2024)
Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"
[CAAI AIR'24] Bilateral Reference for High-Resolution Dichotomous Image Segmentation
Production-ready platform for agentic workflow development.
aider is AI pair programming in your terminal
🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23
Langflow is a powerful tool for building and deploying AI-powered agents and workflows.
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data…
Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal is…
An open-source RAG-based tool for chatting with your documents.
1 minute of voice data can be enough to train a good TTS model! (few-shot voice cloning)