Lists (31)
3D
Agent
AGI
Artificial General Intelligence
AIGC
Architecture
ChatGPT
CLIP
Data
DiffusionModel
Docs
Face
Face related tasks
General Tasks
Generative AI
Human
Image Editing
LLM+LMM
Low Level Vision
MultiModality
NeuralRender
NeuralStyle
NLP
Others
Paper Collection
RemoteSense
RL
Robots
StyleGAN
StyleGAN related works
Tools
Video
Visual Quality Assessment
VQGAN
Stars
[AAAI 2026] Beyond Cosine Similarity: Magnitude-Aware CLIP for No-Reference Image Quality Assessment
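Hands-on code for the book "Machine Learning" (the "Watermelon Book")
Slides and assignments for Hung-yi Lee's Spring 2021/2022/2023 machine learning courses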
Specification and documentation for Agent Skills
Overcoming False Illusions in Real-World Face Restoration with Multi-Modal Guided Diffusion Model [ICLR 2025]
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.
The paper list of "Memory in the Age of AI Agents: A Survey"
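An AI-native PPT generator based on nano banana pro🍌, moving toward a true "Vibe PPT": upload any template image; upload any material with intelligent parsing; generate a PPT automatically from a one-line prompt, an outline, or page descriptions; revise specified regions by spoken instruction; export in one click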
<Foundations of Computer Vision> Book
TurboDiffusion: 100–200× Acceleration for Video Diffusion Models
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
HY-World 1.5: A Systematic Framework for Interactive World Modeling with Real-Time Latency and Geometric Consistency
What are the principles we can use to build LLM-powered software that is actually good enough to put in the hands of production customers?
An AI agent that automates the creation of CVPR/NeurIPS standard academic diagrams. Implements a strict "Logic (Architect) -> Vision (Renderer)" workflow to transform paper abstracts into high-fide…
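"Introduction to Deep Learning: Theory and Implementation with Python", including source code and a high-resolution PDF (with bookmarks); the imooc course "Neural Networks in Deep Learning (CNN-RNN-GAN): Algorithm Principles and Practice"; and "Caicai's Machine Learning with sklearn"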
Up-to-date version of labs for ISLP
"DeepCode: Open Agentic Coding (Paper2Code & Text2Web & Text2Backend)"
A next.js web application that integrates AI capabilities with draw.io diagrams. This app allows you to create, modify, and enhance diagrams through natural language commands and AI-assisted visual…
Official implementation for What matters for Representation Alignment: Global Information or Spatial Structure?
Towards Real-Time Diffusion-Based Streaming Video Super-Resolution — An efficient one-step diffusion framework for streaming VSR with locality-constrained sparse attention and a tiny conditional de…
A list of works on evaluation of visual generation models, including evaluation metrics, models, and systems
Code of π^3: Permutation-Equivariant Visual Geometry Learning
Paper Debugger is the best Overleaf companion
Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation
LLM Council works together to answer your hardest questions