- Northwestern Polytechnical University
- Xi'an, Shaanxi, China
Stars
This is the code repo for the paper VERA: Explainable Video Anomaly Detection via Verbalized Learning of Vision-Language Models (CVPR 2025).
A curated list of works related to Misinformation Video Detection, as companion material for an ACM Multimedia 2023 survey
Official repository for "FakeSV: A Multimodal Benchmark with Rich Social Context for Fake News Detection on Short Video Platforms", AAAI 2023.
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
"AutoAgent: Fully-Automated and Zero-Code LLM Agent Framework"
A curated list of papers & resources on anomaly detection foundation models using large language models, vision-language models, graph foundation models, time series foundation models, etc.
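Crawlers for Xiaohongshu notes and comments, Douyin videos and comments, Kuaishou videos and comments, Bilibili videos and comments, Weibo posts and comments, Baidu Tieba posts and comment replies, and Zhihu Q&A articles and comments
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model with performance approaching GPT-4o.
⚡️HivisionIDPhotos: a lightweight and efficient AI ID photo tool, built on a lightweight algorithm for making AI ID photos.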
✨✨[NeurIPS 2025] This is the official implementation of our paper "Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension"
⚡️SwanLab - an open-source, modern-design AI training tracking and visualization tool. Supports Cloud / Self-hosted use. Integrated with PyTorch / Transformers / LLaMA Factory / veRL/ Swift / Ultra…
Official implementation of "Harnessing Large Language Models for Training-free Video Anomaly Detection", CVPR 2024
A Python module to scrape arxiv.org for a given date range and category
[CVPR 2025] Crab: A Unified Audio-Visual Scene Understanding Model with Explicit Cooperation
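"Deep Learning from Scratch" (『ゼロから作る Deep Learning』, O'Reilly Japan, 2016)
E-book edition and companion code for "Introduction to Deep Learning: Theory and Implementation with Python" (《深度学习入门:基于Python的理论与实现》).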
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…
[NeurIPS 2025 D&B] Open-source Multi-agent Poster Generation from Papers
Deep Learning Book Chinese Translation
Public facing notes page
A free, intelligent formula recognition tool