jasongief

APL Public
Forked from zhangbin-ai/APL

[2024 AAAI] Object-Aware Adaptive-Positivity Learning for Audio-Visual Question Answering

Python 1 Updated Mar 19, 2024
AVSBench Public
Forked from OpenNLPLab/AVSBench

[2022 ECCV] Audio-Visual Segmentation

Python 2 Apache License 2.0 Updated Sep 6, 2024
awesome-audiovisual-learning Public
Forked from GeWu-Lab/awesome-audiovisual-learning

A curated list of audio-visual learning methods and datasets.

Updated Jul 3, 2024
CPSP Public

[2023 TPAMI] Contrastive Positive Sample Propagation along the Audio-Visual Event Line

audio-visual-learning audio-visual-events audio-visual-parsing

Python 31 5 Updated Mar 6, 2023
FAVDBench Public
Forked from OpenNLPLab/FAVDBench

[CVPR 2023] Official implementation of the paper: Fine-grained Audible Video Description

Python Apache License 2.0 Updated Dec 4, 2023
LEAP Public

[2024 ECCV] Label-anticipated Event Disentanglement for Audio-Visual Video Parsing

Python 13 1 Updated Nov 17, 2024
Mettle Public

[2025 arXiv] Mettle: Meta-Token Learning for Memory-Efficient Audio-Visual Adaptation

1 Updated Aug 5, 2025
OV-AVEL Public

[2025 CVPR] Towards Open-Vocabulary Audio-Visual Event Localization

Python 31 1 Updated Mar 7, 2025
PSP_CVPR_2021 Public

[2021 CVPR] Positive Sample Propagation along the Audio-Visual Event Line

audio-visual-learning audio-visual-events

Python 42 12 Updated Jul 5, 2022
TGS-Agent Public

[2025 arXiv] Think Before You Segment: An Object-aware Reasoning Agent for Referring Audio-Visual Segmentation

audio-visual referring-video-object-segmentation

8 Updated Aug 13, 2025
video_features Public
Forked from v-iashin/video_features

Extract video features from raw videos using multiple GPUs. We support RAFT and PWC flow frames as well as I3D, R(2+1)D, VGGish, ResNet features.

Python GNU General Public License v3.0 Updated May 27, 2022