YiguoHe/RSM-ITD

Rethinking Remote Sensing CLIP: Leveraging Multimodal Large Language Models for High-Quality Vision-Language Dataset

Welcome to the official repository of our paper "Rethinking Remote Sensing CLIP: Leveraging Multimodal Large Language Models for High-Quality Vision-Language Dataset"! The paper has been accepted at ICONIP 2024, and we will upload the arXiv version soon.

RSM-ITD (Remote Sensing Multisource Image-Text Dataset) is a high-quality paired dataset of remote sensing images and captions. RSM-CLIP is a CLIP model fully fine-tuned on RSM-ITD. RSM-CLIP surpasses previous models on various downstream tasks while using much less training data than comparable models, which demonstrates the high quality of RSM-ITD.

We are providing our training data and state-of-the-art models here to promote research progress in the community.

1. RSM-ITD Dataset

The file RSMITD.csv provides the captions for the dataset.
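Since the column layout of RSMITD.csv is not described here, a quick way to get oriented is to print the header and the first few rows before writing any loader. The snippet below is a minimal sketch using only the Python standard library; the function name `preview_captions` is hypothetical, and UTF-8 encoding is an assumption about the file.

```python
import csv
from itertools import islice
from pathlib import Path


def preview_captions(csv_path, n_rows=5):
    """Return (header, first n_rows data rows) of a caption CSV.

    No column names are assumed: the header is whatever the first row
    of the file contains. UTF-8 encoding is an assumption.
    """
    with Path(csv_path).open(newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])          # empty list if the file is empty
        rows = list(islice(reader, n_rows))
    return header, rows
```

Calling `preview_captions("RSMITD.csv")` from the repository root should show which column holds the image identifier and which holds the caption, which is what a pairing script would need to know.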

You can download our dataset images via Baidu Netdisk. Link: https://pan.baidu.com/s/1lJ7EDerxeNxlTb9FyiPQnA?pwd=hkne Code: hkne

Figure: A demo of our Dataset

2. Usage of the RSM-CLIP Model

Installation Instructions

Our model is based on openCLIP, so please install the necessary dependencies for openCLIP before using the model. You can find the instructions here: openCLIP GitHub repository.
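Once openCLIP is installed, loading a checkpoint might look like the sketch below. It relies on openCLIP's documented ability to accept a local file path as the `pretrained` argument. Whether the released RSM-CLIP `.pt` file loads this way is an assumption, so treat this as a starting point rather than the official loading procedure. The helper names `load_rsm_clip` and `image_text_similarity` are hypothetical.

```python
def load_rsm_clip(checkpoint_path, model_name="ViT-B-32", device="cpu"):
    """Load a CLIP checkpoint through openCLIP (assumes a compatible .pt file)."""
    # Imported inside the function so the heavy dependency is only
    # required when a model is actually loaded.
    import open_clip

    # `pretrained` accepts either a tag name or a local checkpoint path.
    model, _, preprocess = open_clip.create_model_and_transforms(
        model_name, pretrained=checkpoint_path
    )
    tokenizer = open_clip.get_tokenizer(model_name)
    model = model.to(device).eval()
    return model, preprocess, tokenizer


def image_text_similarity(model, preprocess, tokenizer, image_path, captions, device="cpu"):
    """Return the cosine similarity between one image and each caption."""
    import torch
    from PIL import Image

    image = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0).to(device)
    text = tokenizer(captions).to(device)
    with torch.no_grad():
        image_features = model.encode_image(image)
        text_features = model.encode_text(text)
    # L2-normalize so the dot product is a cosine similarity.
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)
    return (image_features @ text_features.T).squeeze(0).tolist()
```

A typical call would pass the path to `RSM-CLIP-ViT-B-32.pt` as `checkpoint_path` and a list of candidate captions. The highest score indicates the caption the model considers the best match for the image.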

Additionally, if you want to use our cross-modal retrieval testing script (RET3_RetrieveTest.py) for benchmarking or reproducing state-of-the-art (SOTA) results, please install the required dependencies mentioned in the script file. Specifically, you need to install clip_benchmark via:

pip install clip_benchmark

You will also need to download the following datasets for testing:

Testing Commands

To test the model, you can use the following command:

/path/to/python /path/to/your_project/your_script.py \
--model-name "ViT-B-32" \
--retrieval-images-dir "/path/to/images" \
--retrieval-json-dir "/path/to/dataset.json" \
--RSM_CLIP_path "/path/to/RS-CLIP-models/RSM-CLIP-ViT-B-32.pt"
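If you prefer to launch the test from Python (for example, to loop over several datasets), the same command can be assembled as an argument list. This is a minimal sketch: the flag names are taken from the command above, while the helper names `build_test_command` and `run_test` are hypothetical, and the script path is whatever you pass in.

```python
import subprocess
import sys


def build_test_command(script, images_dir, json_path, checkpoint, model_name="ViT-B-32"):
    """Assemble the retrieval test command as an argument list (no shell quoting needed)."""
    return [
        sys.executable, str(script),           # run with the current interpreter
        "--model-name", model_name,
        "--retrieval-images-dir", str(images_dir),
        "--retrieval-json-dir", str(json_path),
        "--RSM_CLIP_path", str(checkpoint),
    ]


def run_test(script, images_dir, json_path, checkpoint, model_name="ViT-B-32"):
    """Run the test script and raise CalledProcessError if it exits with an error."""
    cmd = build_test_command(script, images_dir, json_path, checkpoint, model_name)
    return subprocess.run(cmd, check=True)
```

Passing a list instead of a single shell string avoids quoting problems when paths contain spaces.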

3. License

This project is licensed under the Apache 2.0 license. See the LICENSE file for more details.
