- 🌱 I am Yuxuan Wang (汪宇轩), a research engineer on the Qwen team at Alibaba. I completed my Master's degree at Peking University (PKU).
- 🔭 I am keen to explore “o” for “omni”.
- 🤝 I am always open to collaboration of any kind.
- Peking University
- https://patrick-tssn.github.io
Pinned
- Qwen3-Omni (Public, forked from QwenLM/Qwen3-Omni)
  Qwen3-Omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.
  Jupyter Notebook
- OmniMMI/M4 (Public)
  [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts
  Python 12
- bigai-nlco/VideoLLaMB (Public)
  [ICCV 2025] Official Repository of VideoLLaMB: Long Video Understanding with Recurrent Memory Bridges
- VideoHallucer (Public)
  VideoHallucer, the first comprehensive benchmark for hallucination detection in large video-language models (LVLMs)
  Python 38
- bigai-nlco/VideoTGB (Public)
  [EMNLP 2024] A Video Chat Agent with Temporal Prior