📑 PageIndex: Document Index for Vectorless, Reasoning-based RAG

Python 10,498 763 Updated Jan 25, 2026

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 94,768 13,023 Updated Jan 29, 2026

STEP-GUI: The top GUI agent solution in the galaxy. Developed by the StepFun-GELab team and powered by StepFun’s cutting-edge research capabilities.

Python 1,941 164 Updated Jan 23, 2026

A natural language interface for computers

Python 61,894 5,319 Updated Dec 5, 2025

The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra

TypeScript 24,987 2,410 Updated Jan 14, 2026

verl: Volcano Engine Reinforcement Learning for LLMs

Python 18,792 3,128 Updated Jan 29, 2026

MAI-UI: Real-World Centric Foundation GUI Agents ranging from 2B to 235B

Jupyter Notebook 1,582 163 Updated Jan 27, 2026

Intriguing Properties of Data Attribution on Diffusion Models (ICLR 2024)

Jupyter Notebook 37 3 Updated Jan 23, 2024

AutoSplice: A Text-prompt Manipulated Image Dataset for Media Forensics, WMF@CVPR2023

50 Updated Jan 31, 2025

Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch

Python 11,334 1,088 Updated May 11, 2024

✨✨[NeurIPS 2025] This is the official implementation of our paper "Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension"

Python 391 39 Updated Jan 14, 2026

🔥[AAAI 2026, Official Code] Regression Over Classification: Assessing Image Aesthetics via Multimodal Large Language Models. Overcoming the insensitivity of large models to scores in aesthetic assessment.

Python 21 2 Updated Jan 27, 2026

A comprehensive collection of IQA papers

TeX 1,450 84 Updated Dec 30, 2025

UniPercept: Towards Unified Perceptual-Level Image Understanding across Aesthetics, Quality, Structure, and Texture

Python 77 Updated Jan 21, 2026

[CVPR 2025 Oral] OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels

Python 503 51 Updated Dec 25, 2025

【TMM 2025🔥】 Mixture-of-Experts for Large Vision-Language Models

Python 2,298 141 Updated Jul 15, 2025

Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).

Python 15,521 3,638 Updated Jan 28, 2026

COCO API - Dataset @ http://cocodataset.org/

Jupyter Notebook 6,356 3,761 Updated Apr 17, 2024

Ongoing research training transformer models at scale

Python 15,060 3,543 Updated Jan 29, 2026

GLIDE: a diffusion-based text-conditional image synthesis model

Python 3,683 500 Updated Mar 8, 2024

[AAAI 2025] Official repository of paper “Mesoscopic Insights: Orchestrating Multi-scale & Hybrid Architecture for Image Manipulation Localization”

Python 92 4 Updated May 13, 2025

TruFor

Python 231 30 Updated May 29, 2025

Official code for CAT-Net: Compression Artifact Tracing Network. Image manipulation detection and localization.

Python 292 32 Updated Jul 29, 2025

The official repo for RGCL:Improving Hateful Meme Detection through Retrieval-Guided Contrastive Learning and RA-HMD: Robust Adaptation of Large Multimodal Models for Retrieval Augmented Hateful Me…

Python 30 5 Updated Dec 29, 2025

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support

Python 9,474 1,273 Updated Jan 27, 2026
Python 1,099 68 Updated Nov 20, 2025

Optimizing MLLM-based Scoring via a Score-Token + Decoder Paradigm. This paper proposes a unified scoring paradigm for Multimodal Large Language Models (MLLMs).

Python 9 Updated Jan 15, 2026

Codes of the Fine-grained Textual Inversion network for Zero-Shot Composed Image Retrieval

Python 27 Updated Apr 7, 2025

High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.

Python 1,395 95 Updated Jan 29, 2026

[CVPR 2025] Teaching Large Language Models to Regress Accurate Image Quality Scores using Score Distribution

Python 216 4 Updated Dec 16, 2025