whmzfc

1 follower · 1 following

Stars

VectifyAI / PageIndex

📑 PageIndex: Document Index for Vectorless, Reasoning-based RAG

Python 10,498 763 Updated Jan 25, 2026

moltbot / moltbot

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 94,768 13,023 Updated Jan 29, 2026

stepfun-ai / gelab-zero

STEP-GUI: The top GUI agent solution in the galaxy. Developed by the StepFun-GELab team and powered by StepFun’s cutting-edge research capabilities.

Python 1,941 164 Updated Jan 23, 2026

openinterpreter / open-interpreter

A natural language interface for computers

Python 61,894 5,319 Updated Dec 5, 2025

bytedance / UI-TARS-desktop

The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra

TypeScript 24,987 2,410 Updated Jan 14, 2026

verl-project / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 18,792 3,128 Updated Jan 29, 2026

Tongyi-MAI / MAI-UI

MAI-UI: Real-World Centric Foundation GUI Agents ranging from 2B to 235B

Jupyter Notebook 1,582 163 Updated Jan 27, 2026

sail-sg / D-TRAK

Intriguing Properties of Data Attribution on Diffusion Models (ICLR 2024)

Jupyter Notebook 37 3 Updated Jan 23, 2024

shanface33 / AutoSplice_Dataset

AutoSplice: A Text-prompt Manipulated Image Dataset for Media Forensics, WMF@CVPR2023

50 Updated Jan 31, 2025

lucidrains / DALLE2-pytorch

Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch

Python 11,334 1,088 Updated May 11, 2024

Leon1207 / Video-RAG-master

✨✨[NeurIPS 2025] This is the official implementation of our paper "Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension"

Python 391 39 Updated Jan 14, 2026

woshidandan / Assessing-Image-Aesthetics-via-Multimodal-Large-Language-Models

🔥[AAAI 2026, Official Code] Regression Over Classification: Assessing Image Aesthetics via Multimodal Large Language Models. Addresses the insensitivity of large models to scores during aesthetic assessment.

Python 21 2 Updated Jan 27, 2026

chaofengc / Awesome-Image-Quality-Assessment

A comprehensive collection of IQA papers

TeX 1,450 84 Updated Dec 30, 2025

thunderbolt215 / UniPercept

UniPercept: Towards Unified Perceptual-Level Image Understanding across Aesthetics, Quality, Structure, and Texture

Python 77 Updated Jan 21, 2026

LMMMEng / OverLoCK

[CVPR 2025 Oral] OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels

Python 503 51 Updated Dec 25, 2025

PKU-YuanGroup / MoE-LLaVA

【TMM 2025🔥】 Mixture-of-Experts for Large Vision-Language Models

Python 2,298 141 Updated Jul 15, 2025

wkentaro / labelme

Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).

Python 15,521 3,638 Updated Jan 28, 2026

cocodataset / cocoapi

COCO API - Dataset @ http://cocodataset.org/

Jupyter Notebook 6,356 3,761 Updated Apr 17, 2024

NVIDIA / Megatron-LM

Ongoing research training transformer models at scale

Python 15,060 3,543 Updated Jan 29, 2026

openai / glide-text2im

GLIDE: a diffusion-based text-conditional image synthesis model

Python 3,683 500 Updated Mar 8, 2024

scu-zjz / Mesorch

[AAAI 2025] Official repository of paper “Mesoscopic Insights: Orchestrating Multi-scale & Hybrid Architecture for Image Manipulation Localization”

Python 92 4 Updated May 13, 2025

grip-unina / TruFor

TruFor

Python 231 30 Updated May 29, 2025

mjkwon2021 / CAT-Net

Official code for CAT-Net: Compression Artifact Tracing Network. Image manipulation detection and localization.

Python 292 32 Updated Jul 29, 2025

JingbiaoMei / RGCL

The official repo for RGCL: Improving Hateful Meme Detection through Retrieval-Guided Contrastive Learning and RA-HMD: Robust Adaptation of Large Multimodal Models for Retrieval Augmented Hateful Me…

Python 30 5 Updated Dec 29, 2025

huggingface / accelerate

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support

Python 9,474 1,273 Updated Jan 27, 2026

Visual-Agent / DeepEyes

Python 1,099 68 Updated Nov 20, 2025

2kxx / Q-Scorer

Optimizing MLLM-based Scoring via a Score-Token + Decoder Paradigm. This paper proposes a unified scoring paradigm for Multimodal Large Language Models (MLLMs).

Python 9 Updated Jan 15, 2026

ZiChao111 / FTI4CIR

Codes of the Fine-grained Textual Inversion network for Zero-Shot Composed Image Retrieval

Python 27 Updated Apr 7, 2025

thu-pacman / chitu

High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.

Python 1,395 95 Updated Jan 29, 2026

zhiyuanyou / DeQA-Score

[CVPR 2025] Teaching Large Language Models to Regress Accurate Image Quality Scores using Score Distribution

Python 216 4 Updated Dec 16, 2025