- University of Tsukuba
- Tokyo
- https://kuxry.github.io/
- https://www.kaggle.com/kaikai557
- in/kuxry
Starred repositories
Embedding model geared toward multimodal RAG, ranked top-1 on both the overall and VisDoc leaderboards of the MMEB benchmark
This repo contains the code for "VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks" [ICLR 2025]
Workout Video Classifier using a CNN-LSTM Model
(CVPR 2022 Oral) Official implementation: TransRAC
[CVPR 2025] LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant
[SIGIR-AP'25] A Combination-based Framework for Generative Text–image Retrieval: Dual Identifiers and Hybrid Retrieval Strategies, SIGIR-AP 2025
Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024)
A beautiful, simple, clean, and responsive Jekyll theme for academics
Residual Quantization with Implicit Neural Codebooks
Order-agnostic Identifier for Large Language Model-based Generative Recommendation (SIGIR'25)
XR-Objects is an open-source prototype that anchors contextual interactions onto analog objects to not only convey information but also to initiate digital actions, such as querying LLMs for detail…
NeurIPS 2025 Spotlight; ICLR 2024 Spotlight; CVPR 2024; EMNLP 2024
💡 Awesome RAG: A resource of Retrieval-Augmented Generation (RAG) for LLMs, focusing on the development of technology.
Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL
Official repository for CiteEval: Principle-Driven Citation Evaluation for Source Attribution
Here is the open-source code repository for the paper "Ground Every Sentence: Improving Retrieval-Augmented LLMs with Interleaved Reference-Claim Generation."
ZeroSearch: Incentivize the Search Capability of LLMs without Searching
Original implementation of SmartRAG: Jointly Learn RAG-Related Tasks From the Environment Feedback (ICLR 2025)
A modern Wine wrapper for macOS built with SwiftUI
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Official Implementation of GENIUS: A Generative Framework for Universal Multimodal Search, CVPR 2025
Official implementation of SEED-LLaMA (ICLR 2024).
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model with performance approaching GPT-4o.
⚡️HivisionIDPhotos: a lightweight and efficient AI ID photo tool. A lightweight algorithm for creating AI ID photos.