    Repositories list

    • MokA

      Public
      MokA: Multimodal Low-Rank Adaptation for MLLMs
      Python
      460110 Updated Dec 30, 2025
    • Crab

      Public
      [CVPR 2025] Crab: A Unified Audio-Visual Scene Understanding Model with Explicit Cooperation
      Python
      28040 Updated Dec 24, 2025
    • A curated list of balanced multimodal learning methods.
      514710 Updated Dec 22, 2025
    • This is the repo for "Adaptive Unimodal Regulation for Balanced Multimodal Information Acquisition", CVPR2025.
      Python
      51940 Updated Dec 22, 2025
    • JavaScript
      0000 Updated Oct 26, 2025
    • Python
      2720 Updated Oct 26, 2025
    • HTML
      1000 Updated Oct 15, 2025
    • Ref-AVS

      Public
      The official repo for "Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes", ECCV 2024
      Python
      24900 Updated Oct 12, 2025
    • JavaScript
      0000 Updated Sep 25, 2025
    • The repo for "Balanced Multimodal Learning via On-the-fly Gradient Modulation", CVPR 2022 (ORAL)
      Python
      23302360 Updated Sep 22, 2025
    • MGIPF

      Public
      The repo for "MGIPF: Multi-Granularity Interest Prediction Framework for Personalized Recommendation", SIGIR 2025
      Python
      1200 Updated Jul 26, 2025
    • WCAE

      Public
      Python
      0000 Updated Jul 1, 2025
    • MS-Bot

      Public
      The official repo for "Play to the Score: Stage-Guided Dynamic Multi-Sensory Fusion for Robotic Manipulation", CoRL 2024 (ORAL)
      Python
      31910 Updated Jun 25, 2025
    • AnyTouch

      Public
      The repo for "AnyTouch: Learning Unified Static-Dynamic Representation across Multiple Visuo-tactile Sensors", ICLR 2025
      Python
      77720 Updated Jun 25, 2025
    • Official repo for ICML 2025 paper "RollingQ: Reviving the Cooperation Dynamics in Multimodal Transformer"
      Python
      21330 Updated Jun 21, 2025
    • A Python implementation of Certifiable Robust Multi-modal Training
      Python
      01900 Updated Jun 21, 2025
    • [CVPR2025] Code Release of Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception
      Python
      01920 Updated Jun 17, 2025
    • The official repo for "Efficient Quantification of Multimodal Interaction at Sample Level", ICML 2025
      Python
      1710 Updated Jun 5, 2025
    • Python
      01210 Updated Apr 30, 2025
    • LFAV

      Public
      Towards Long Form Audio-visual Video Understanding
      Python
      01410 Updated Apr 27, 2025
    • The official repo for "Can Textual Semantics Mitigate Sounding Object Segmentation Preference?", ECCV 2024
      Python
      0610 Updated Mar 1, 2025
    • Python
      03640 Updated Feb 23, 2025
    • A curated list of audio-visual learning methods and datasets.
      2028110 Updated Dec 3, 2024
    • The repo for "Enhancing Multi-modal Cooperation via Sample-level Modality Valuation", CVPR 2024
      Python
      45970 Updated Nov 5, 2024
    • TSPM

      Public
      Official repository for "Boosting Audio Visual Question Answering via Key Semantic-Aware Cues" in ACM MM 2024.
      Python
      11640 Updated Oct 25, 2024
    • The repo for "KOI: Accelerating Online Imitation Learning via Hybrid Key-state Guidance", CoRL 2024
      Python
      1900 Updated Oct 17, 2024
    • The official repo for "Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation", ECCV 2024
      Python
      21810 Updated Oct 11, 2024
    • Official repository for "Unveiling and Mitigating Bias in Audio Visual Segmentation" in ACM MM 2024
      Python
      0600 Updated Oct 10, 2024
    • The repo for "On-the-fly Modulation for Balanced Multimodal Learning", T-PAMI 2024
      Python
      11830 Updated Sep 29, 2024
    • Python
      11830 Updated Aug 21, 2024