cyysc1998
  • Zhejiang University
  • Hang Zhou


Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.

Python 28,862 3,532 Updated Dec 5, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 18,803 3,131 Updated Jan 29, 2026

Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.

Python 13,091 1,233 Updated Jan 27, 2026

Provides pre-built flash-attention package wheels for Linux and Windows platforms using GitHub Actions

Python 826 55 Updated Jan 28, 2026

Official implementation of UnifiedReward & [NeurIPS 2025] UnifiedReward-Think

Python 680 38 Updated Jan 24, 2026

A powerful tool for creating datasets for LLM fine-tuning, RAG, and Eval

JavaScript 12,988 1,280 Updated Jan 24, 2026
Python 9,604 602 Updated Jan 28, 2026

This project is the official implementation of 'DreamOmni2: Multimodal Instruction-based Editing and Generation'

Python 2,321 194 Updated Oct 20, 2025

[ICLR2026] This is the first paper to explore how to effectively use R1-like RL for MLLMs and introduce Vision-R1, a reasoning MLLM that leverages cold-start initialization and RL training to incen…

Python 753 20 Updated Jan 26, 2026

Implementation and study of RLHF-related algorithms for LLMs

Python 12 3 Updated Apr 13, 2025

Visual Spatial Tuning

Jupyter Notebook 170 7 Updated Jan 9, 2026

Implementing DeepSeek R1's GRPO algorithm from scratch

Python 1,754 87 Updated Apr 18, 2025

The official repository of SEED-GRPO: Semantic Entropy Enhanced GRPO for Uncertainty-Aware Policy Optimization

Python 155 11 Updated Jan 29, 2026

HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation

Python 2,744 134 Updated Jan 28, 2026

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 17,691 2,238 Updated Feb 1, 2025

Scaling Vision Pre-Training to 4K Resolution

Python 221 11 Updated Jan 4, 2026

A prompt optimizer that helps you write high-quality prompts

TypeScript 18,955 2,352 Updated Jan 29, 2026

[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL

Python 1,935 122 Updated Nov 4, 2025

Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis

Jupyter Notebook 642 27 Updated May 24, 2024

Reference PyTorch implementation and models for DINOv3

Jupyter Notebook 9,442 710 Updated Nov 20, 2025
Python 20 Updated Nov 21, 2025

An open-source AI agent that lives in your terminal.

TypeScript 17,901 1,563 Updated Jan 29, 2026

📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).

967 41 Updated Sep 27, 2025

[ICLR 2025] MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation

Python 133 11 Updated Sep 11, 2025

OmniGen2: Exploration to Advanced Multimodal Generation. https://arxiv.org/abs/2506.18871

Jupyter Notebook 4,014 17 Updated Dec 2, 2025

Enjoy the magic of Diffusion models!

Python 11,627 1,110 Updated Jan 27, 2026
Python 227 18 Updated Jul 17, 2025

Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.

Python 12,573 1,692 Updated Apr 7, 2025

[NeurIPS 2022] Official PyTorch implementation of Optimizing Relevance Maps of Vision Transformers Improves Robustness. This code allows to finetune the explainability maps of Vision Transformers t…

Jupyter Notebook 133 14 Updated Nov 22, 2022

A SOTA open-source image editing model that aims to provide performance comparable to closed-source models like GPT-4o and Gemini 2 Flash.

Python 2,129 89 Updated Dec 29, 2025