Yuxuan Wang patrick-tssn

No pride and no prejudice

94 followers · 395 following

Peking University
https://patrick-tssn.github.io

patrick-tssn/README.md

Hey There 🎸

🌱 I am Yuxuan Wang (汪宇轩), a research engineer on the Qwen team at Alibaba. I completed my Master's degree at Peking University (PKU).
🔭 I am keen to explore “o” for “omni”.
🤝 I am always open to collaboration of any kind.

Pinned

Qwen3-Omni Public

Forked from QwenLM/Qwen3-Omni

Qwen3-Omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud. It understands text, audio, images, and video, and generates speech in real time.

Jupyter Notebook

OmniMMI/M4 Public

[CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts

Python · 12 stars

bigai-nlco/VideoLLaMB Public

[ICCV 2025] Official Repository of VideoLLaMB: Long Video Understanding with Recurrent Memory Bridges

Python · 77 stars · 2 forks

VideoHallucer Public

VideoHallucer: the first comprehensive benchmark for hallucination detection in large video-language models (LVLMs)

Python · 38 stars

bigai-nlco/VideoTGB Public

[EMNLP 2024] A Video Chat Agent with Temporal Prior

Python · 32 stars · 3 forks

VSTAR Public

[ACL 2023] VSTAR is a multimodal dialogue dataset with scene and topic transition information

Python · 15 stars · 2 forks