duyngtr16061999
  • Nanyang Technological University
  • Ho Chi Minh city

[ECCV’24 Main] MAMA: A Meta-optimized Angular Margin Contrastive Framework for Video-Language Representation Learning

Python 8 Updated Oct 16, 2024

[EMNLP’24 Main] Encoding and Controlling Global Semantics for Long-form Video Question Answering

Python 18 Updated Oct 9, 2024

[ACL’24 Findings] Video-Language Understanding: A Survey from Model Architecture, Model Training, and Data Perspectives

46 Updated Jul 9, 2025

[AAAI 2024] MotionMix: Weakly-Supervised Diffusion for Controllable Motion Generation

Python 34 1 Updated Mar 1, 2024

[NAACL 2024] ToXCL: A Unified Framework for Toxic Speech Detection and Explanation

Python 11 3 Updated Aug 29, 2024

[AAAI’24 Main] READ: Recurrent Adapter with Partial Video-Language Alignment for Parameter-Efficient Transfer Learning in Low-Resource Video-Language Modeling

Python 10 1 Updated Jan 24, 2025

✨✨Latest Advances on Multimodal Large Language Models

17,056 1,098 Updated Dec 23, 2025

Large Language Models Are Reasoning Teachers (ACL 2023)

Jupyter Notebook 343 22 Updated Mar 7, 2025

Inference Llama 2 in one file of pure 🔥

Mojo 2,115 136 Updated Nov 30, 2025

A Topic Modeling System Toolkit (ACL 2024 Demo)

Jupyter Notebook 279 26 Updated Oct 14, 2025

Official codebase for ICLR oral paper Unsupervised Vision-Language Grammar Induction with Shared Structure Modeling

Python 36 3 Updated Apr 14, 2022

Collection of AWESOME vision-language models for vision tasks

3,040 229 Updated Oct 14, 2025
7 Updated Jun 23, 2023
Jupyter Notebook 49 11 Updated Oct 17, 2023
Python 7 Updated Jun 2, 2023

Benchmarking Panoptic Scene Graph Generation (PSG), ECCV'22

Python 465 71 Updated Apr 10, 2023

Official implementation of POODLE: Improving Few-shot Learning via Penalizing Out-of-Distribution Samples (NeurIPS 2021)

Python 14 1 Updated Aug 6, 2022

Network Pruning That Matters: A Case Study on Retraining Variants (ICLR 2021)

Python 17 Updated Sep 19, 2021
Python 31 4 Updated Jul 10, 2023

Code for the ICML 2021 (long talk) paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision"

Python 1,516 227 Updated Apr 3, 2024

A curated list of Visual Question Answering (VQA) (Image/Video Question Answering), Visual Question Generation, Visual Dialog, Visual Commonsense Reasoning, and related areas.

671 94 Updated Jul 6, 2023

Code for ICML 2020 "Graph Optimal Transport for Cross-Domain Alignment"

Python 159 26 Updated Jul 24, 2020

project page for VinVL

359 25 Updated Jul 26, 2023

A python toolkit for parsing captions (in natural language) into scene graphs (as symbolic representations).

Python 593 55 Updated Jan 23, 2024

Research code for ECCV 2020 paper "UNITER: UNiversal Image-TExt Representation Learning"

Python 799 114 Updated Jun 30, 2021

Vision-Language Pre-training for Image Captioning and Question Answering

Python 424 59 Updated Jan 18, 2022

Code for the paper "VisualBERT: A Simple and Performant Baseline for Vision and Language"

Python 539 102 Updated May 1, 2023

Code for ICLR 2020 paper "VL-BERT: Pre-training of Generic Visual-Linguistic Representations".

Jupyter Notebook 746 112 Updated May 22, 2023