LIULINKAI

KAYLK LIULINKAI

1 follower · 1 following

Stars

XMUDeepLIT / LLaVE

LLaVE: Large Language and Vision Embedding Models with Hardness-Weighted Contrastive Learning

Python 70 2 Updated May 23, 2025

raghavlite / B3

Python 30 Updated Oct 28, 2025

zhengxuJosh / Awesome-RAG-Vision

Awesome-RAG-Vision: a curated list of advanced retrieval augmented generation (RAG) for Computer Vision

250 7 Updated Oct 11, 2025

seilk / VisAttnSink

[ICLR 2025] See What You Are Told: Visual Attention Sink in Large Multimodal Models

Python 65 7 Updated Feb 16, 2025

TIGER-AI-Lab / VLM2Vec

This repo contains the code for "VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks" [ICLR 2025]

Python 450 42 Updated Oct 24, 2025

deepglint / UniME

[ACM MM25] The official code of "Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs"

Python 94 5 Updated Aug 8, 2025

louisnino / RLcode

Python 1,027 304 Updated Jan 29, 2023

haokunwen / Awesome-Composed-Image-Retrieval

Collection of Composed Image Retrieval (CIR) papers.

269 18 Updated Aug 18, 2025

yu-rp / VisualPerceptionToken

Python 125 2 Updated Mar 22, 2025

zai-org / GLM-V

GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Python 1,720 102 Updated Oct 28, 2025

yuchen2199 / Explainable-Driver-Attention-Prediction

[ICCV2025] Where, What, Why: Towards Explainable Driver Attention Prediction

Python 36 1 Updated Oct 27, 2025

EvolvingLMMs-Lab / multimodal-search-r1

MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search tools.

Python 342 17 Updated Aug 26, 2025

hyp1231 / awesome-generative-recommendation

Awesome things about generative recommendation models.

98 3 Updated Apr 28, 2025

JiuhaiChen / BLIP3o

Official implementation of BLIP3o-Series

Python 1,563 69 Updated Oct 27, 2025

Theia-4869 / CDPruner

[NeurIPS 2025] Official code for paper: Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs.

Python 65 3 Updated Sep 20, 2025

TIGER-AI-Lab / UniIR

Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024)

Python 166 16 Updated Oct 1, 2024

LightChen233 / M3CoT

Python 84 3 Updated Jun 7, 2024

PKU-ICST-MIPL / DyFo_CVPR2025

Python 93 4 Updated Aug 14, 2025

jingyi0000 / VLM_survey

Collection of AWESOME vision-language models for vision tasks

2,983 221 Updated Oct 14, 2025

MA-Wenhui / skyrover

SkyRover, a modular and extensible simulator tailored for cross-domain pathfinding research.

Python 9 2 Updated Mar 4, 2025

jungao1106 / ICoT

[CVPR'25] Interleaved-Modal Chain-of-Thought

Python 90 4 Updated Oct 23, 2025

robotics-upo / marsupial_simulator_ros2

Physical simulation of Marsupial UAV-UGV Systems Connected by a Variable-Length Hanging Tether

Python 32 7 Updated Aug 3, 2025

apple / ml-aim

This repository provides the code and model checkpoints for AIMv1 and AIMv2 research projects.

Python 1,379 65 Updated Aug 4, 2025

Hon-Wong / VoRA

[Fully open] [Encoder-free MLLM] Vision as LoRA

Python 341 29 Updated Jun 12, 2025

BradyFU / Awesome-Multimodal-Large-Language-Models

✨✨Latest Advances on Multimodal Large Language Models

16,575 1,069 Updated Oct 30, 2025

leftthomas / ACRNet

A PyTorch implementation of ACRNet based on ICME 2023 paper "Weakly-supervised Temporal Action Localization with Adaptive Clustering and Refining Network"

Python 15 1 Updated Aug 29, 2023

wangjiangshan0725 / RF-Solver-Edit

[🚀ICML 2025] "Taming Rectified Flow for Inversion and Editing" Using FLUX and HunyuanVideo for image and video editing!

Python 592 15 Updated May 1, 2025

alibaba / Tora

[CVPR'25] Tora: Trajectory-oriented Diffusion Transformer for Video Generation

Python 1,208 56 Updated Jul 9, 2025

Wan-Video / Wan2.1

Wan: Open and Advanced Large-Scale Video Generative Models

Python 14,583 2,094 Updated Jul 17, 2025

JD-GenX / CAIG

[WWW 2025] Official PyTorch Code for "CTR-Driven Advertising Image Generation with Multimodal Large Language Models"

Python 58 4 Updated Aug 3, 2025
