Zhejiang University
- [email protected]
Stars
The official GitHub repo for the survey paper "A Survey on Diffusion Language Models".
微舆 (Weiyu): a multi-agent public-opinion analysis assistant anyone can use. It breaks information cocoons, reconstructs the full picture of public sentiment, predicts future trends, and supports decision-making. Implemented from scratch, without relying on any framework.
GPU-optimized framework for training diffusion language models at any scale. The backend of Quokka, Super Data Learners, and OpenMoE 2 training.
Official Implementation of "UniFlow: A Unified Pixel Flow Tokenizer for Visual Understanding and Generation"
A Curated List of Awesome Works in World Modeling, Aiming to Serve as a One-stop Resource for Researchers, Practitioners, and Enthusiasts Interested in World Modeling.
This is the official repo for the paper "LongCat-Flash-Omni Technical Report"
NOFX: Defining the Next-Generation AI Trading Operating System. A multi-exchange AI trading platform (Binance/Hyperliquid/Aster) with multi-AI competition (DeepSeek/Qwen/Claude), self-evolution, and re…
myscius / Awesome-Multimodal-Large-Language-Models
Forked from BradyFU/Awesome-Multimodal-Large-Language-Models. ✨✨ Latest Advances on Multimodal Large Language Models
Native Multimodal Models are World Learners
"AI-Trader: Can AI Beat the Market?" Live Trading Bench: https://ai4trade.ai
🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation
🛠️ DeepAgent: A General Reasoning Agent with Scalable Toolsets
OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and language.
A simple, unified multimodal models training engine. Lean, flexible, and built for hacking at scale.
EvoSyn: Generalizable Evolutionary Data Synthesis for Verifiable Learning
Thinking with Videos from Open-Source Priors. We reproduce chain-of-frames visual reasoning by fine-tuning open-source video models. Give it a star 🌟 if you find it useful.
Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"
A simple yet powerful agent framework that delivers with open-source models
A SOTA open-source image editing model that aims to deliver performance comparable to closed-source models such as GPT-4o and Gemini 2 Flash.
Official repo of the paper "SRUM: Fine-Grained Self-Rewarding for Unified Multimodal Models". A post-training framework that creates a cost-effective, self-iterative optimization loop.
Use PEFT or full-parameter training to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…
Post-training with Tinker
This is the official Python version of Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play.