lvlm
Here are 29 public repositories matching this topic...
🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).
Updated Apr 4, 2025 - HTML
OpenThinkIMG is an end-to-end open-source framework that empowers LVLMs to think with images.
Updated Jun 1, 2025 - Jupyter Notebook
Up-to-date curated list of state-of-the-art research work, papers, and resources on hallucinations in large vision-language models
Updated Oct 3, 2025
[NeurIPS 2024] This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models?"
Updated Sep 26, 2024 - Python
📜 Paper list on decoding methods for LLMs and LVLMs
Updated Nov 7, 2025
[ICCV'25] The official code for the paper "Combining Similarity and Importance for Video Token Reduction on Large Visual Language Models"
Updated Nov 24, 2025 - Python
[ICCV 2025] HQ-CLIP: Leveraging Large Vision-Language Models to Create High-Quality Image-Text Datasets
Updated Aug 6, 2025
CLIP-MoE: Mixture of Experts for CLIP
Updated Oct 10, 2024 - Python
Latest Advances on (RL based) Multimodal Reasoning and Generation in Multimodal Large Language Models
Updated Oct 30, 2025
[AAAI 2025] HiRED strategically drops visual tokens in the image encoding stage to improve inference efficiency for High-Resolution Vision-Language Models (e.g., LLaVA-Next) under a fixed token budget.
Updated Apr 18, 2025 - Python
The official repository of "SmartAgent: Chain-of-User-Thought for Embodied Personalized Agent in Cyber World".
Updated Aug 20, 2025
LEMMA: An effective and explainable way to detect multimodal misinformation with LVLMs and external knowledge augmentation, drawing on the intuition and reasoning capability of the LVLM.
Updated Jun 4, 2025 - Jupyter Notebook
Code for ICLR 2025 Paper: Visual Description Grounding Reduces Hallucinations and Boosts Reasoning in LVLMs
Updated May 7, 2025 - Python
A benchmark dataset and simple code examples for measuring the perception and reasoning of multi-sensor Vision Language models.
Updated Dec 27, 2024 - Python
[CVPR'25] Antidote: A Unified Framework for Mitigating LVLM Hallucinations in Counterfactual Presupposition and Object Perception
Updated Oct 11, 2025 - Python
📖 Curated list on the reasoning ability of MLLMs, including OpenAI o1, OpenAI o3-mini, and Slow-Thinking.
Updated Feb 7, 2025
[NeurIPS 2025] Intervene-All-Paths: Unified Mitigation of LVLM Hallucinations across Alignment Formats
Updated Dec 5, 2025 - Python