sarasarto
  • AImageLab - University of Modena and Reggio Emilia
  • Modena

Recurrence Meets Transformers for Universal Multimodal Retrieval

Python 8 Updated Sep 12, 2025

[BMVC 2025] Mitigating Hallucinations in Multimodal LLMs via Object-aware Preference Optimization

Python 5 Updated Sep 2, 2025

arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv

Python 6,519 374 Updated Jun 2, 2025

[CVPR 2025] Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval

Python 27 1 Updated Sep 12, 2025

This repository contains a curated list of research papers and resources on saliency and scanpath prediction, human attention, and human visual search.

58 3 Updated May 9, 2025

[ECCV 2024] BRIDGE: Bridging Gaps in Image Captioning Evaluation with Stronger Visual Cues.

13 Updated Jul 17, 2024

[CVPR 2025] Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering

Python 51 Updated Jul 14, 2025

[BMVC 2024 Oral ✨] Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization

Python 18 1 Updated Sep 11, 2024

[ICCVW 25] LLaVA-MORE: A Comparative Study of LLMs and Visual Backbones for Enhanced Visual Instruction Tuning

Python 154 8 Updated Aug 8, 2025

PyTorch code for the ECCVW 2022 paper "Consistency-based Self-supervised Learning for Temporal Anomaly Localization"

Python 14 2 Updated Jul 9, 2024