One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks

Python 3,545 477 Updated Jan 10, 2026

My personal research experience

9,748 523 Updated Jan 10, 2026

Official Repo for CVPR 2024 Paper "FACT: Frame-Action Cross-Attention Temporal Modeling for Efficient Fully-Supervised Action Segmentation"

Python 80 11 Updated Jan 10, 2026

A paper list of Awesome Latent Space.

285 9 Updated Jan 10, 2026

Holistic Evaluation of Language Models (HELM) is an open source Python framework created by the Center for Research on Foundation Models (CRFM) at Stanford for holistic, reproducible and transparen…

Python 2,617 347 Updated Jan 9, 2026

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 154,886 31,690 Updated Jan 9, 2026

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 65,394 7,946 Updated Jan 9, 2026

Scenic: A Jax Library for Computer Vision Research and Beyond

Python 3,746 468 Updated Jan 9, 2026

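Provides a practical interactive interface for LLMs such as GPT/GLM, specially optimized for reading, polishing, and writing papers. Modular design; supports custom shortcut buttons & function plugins; project analysis & self-explanation for Python, C++, and other codebases; PDF/LaTeX paper translation & summarization; parallel querying of multiple LLMs; and local models such as chatglm3. Integrates Tongyi Qianwen (通义千问), deepseekcoder, iFlytek Spark (讯飞星火), ERNIE Bot (文心一言), llama2, rwkv, claude2, m…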

Python 69,941 8,403 Updated Jan 9, 2026

The official repository for the paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning"

Jupyter Notebook 135 3 Updated Jan 9, 2026

Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks

Python 3,673 607 Updated Jan 9, 2026

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Python 26,106 1,836 Updated Jan 9, 2026

This is a collection of recent papers on reasoning in video generation models.

91 2 Updated Jan 8, 2026

Fixes the following messages that Cursor shows during the free subscription period: Your request has been blocked as our system has detected suspicious activity / You've reached your trial request limit. / Too many free trial accounts used on this machine.

Shell 25,551 3,108 Updated Jan 8, 2026

PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI

Python 1,308 69 Updated Jan 8, 2026

A framework for few-shot evaluation of language models.

Python 11,145 2,951 Updated Jan 7, 2026

[Lumina Embodied AI] Embodied AI technical guide (Embodied-AI-Guide)

10,544 724 Updated Jan 7, 2026

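An AI-native PPT generator based on nano banana pro🍌, moving toward a true "Vibe PPT". Supports uploading any template image; uploading any source material with intelligent parsing; automatic PPT generation from a single sentence, an outline, or page descriptions; natural-language edits to a specified region; and one-click export to an editable PPT.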

Python 9,232 996 Updated Jan 7, 2026

Official repo of "Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens"

Python 253 15 Updated Jan 6, 2026

Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline (CVPR 2023)

Python 70 6 Updated Jan 4, 2026

📰 Must-read papers and blogs on LLM based Long Context Modeling 🔥

1,871 78 Updated Jan 4, 2026

Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 17,681 1,504 Updated Jan 4, 2026

Deep learning for image processing, including classification, object detection, etc.

Python 25,981 8,257 Updated Jan 1, 2026

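This project aims to share the technical principles and hands-on experience behind large models (large-model engineering and real-world deployment of large-model applications)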

HTML 22,732 2,645 Updated Dec 30, 2025

Glance: Accelerating Diffusion Models with 1 Sample

Python 147 2 Updated Dec 24, 2025

Temporal Action Detection & Weakly Supervised Temporal Action Detection & Temporal Action Proposal Generation

566 41 Updated Dec 15, 2025

Python 5 Updated Dec 9, 2025

This repository provides a valuable reference for researchers in the field of multimodality. Start your exploration of RL-based Reasoning MLLMs here!

1,326 59 Updated Dec 7, 2025

Code for the paper "Towards Better & Faster Autoregressive Image Generation: From the Perspective of Entropy" [NeurIPS 2025].

Python 12 Updated Dec 6, 2025