Novel view synthesis (NVS) makes it possible to generate new images of a scene or to convert a set of 2D images into a comprehensive 3D model. In the context of Space Domain Awareness, where space is becoming increasingly congested, NVS can accurately map space objects and debris, improving the safety and efficiency of space operations. Similarly, in Rendezvous and Proximity Operations missions, 3D models can provide details about a target object's shape, size, and orientation, allowing for better planning and prediction of the target's behavior.
In this work, we explore the generalization abilities of these reconstruction techniques, aiming to avoid the need to retrain for each new scene. We present DreamSat, a novel approach to 3D spacecraft reconstruction from single-view images, built by fine-tuning Zero123-XL, a state-of-the-art single-view reconstruction model, on a dataset of 190 high-quality spacecraft models and integrating it into the DreamGaussian framework.
We demonstrate consistent improvements in reconstruction quality across multiple metrics, including Contrastive Language-Image Pretraining (CLIP) score (+0.33%), Peak Signal-to-Noise Ratio (PSNR) (+2.53%), Structural Similarity Index (SSIM) (+2.38%), and Learned Perceptual Image Patch Similarity (LPIPS) (+0.16%) on a test set of 30 previously unseen spacecraft images. Our method addresses the lack of domain-specific 3D reconstruction tools in the space industry by leveraging state-of-the-art diffusion models and 3D Gaussian splatting techniques. This approach maintains the efficiency of the DreamGaussian framework while enhancing the accuracy and detail of spacecraft reconstructions.
Example results: input image and generated novel views for Explorer 1, the Apollo Lunar Module, and Space Launch System Block 1 (images not reproduced here).
The data consists of 190 spacecraft 3D models drawn from the National Aeronautics and Space Administration (NASA), European Space Agency (ESA), and Synthetic Dataset for Satellites (SPE3R) datasets.
To process the data, change the data directories in `dataset_toolkit/zero123_subprocess.py` and run the file. This can be done in a separate terminal window and left running in the background without disrupting other processes. Once the data has been processed and added to the new data directory, run `dataset_toolkit/json_to_npy.py` to convert the JSON file of camera transform matrices into one NumPy file per image. The final dataset should contain one folder per 3D object, each holding 48 views with the corresponding NumPy pose matrix for each view. This matches the format required by Zero123-XL (https://github.com/cvlab-columbia/zero123).
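As a rough illustration, the conversion step does something along these lines (a minimal sketch assuming each object folder holds a NeRF-style `transforms.json` with one 4x4 `transform_matrix` per frame; the actual script and pose convention may differ):

```python
# Sketch of the json_to_npy step: write one .npy pose file per rendered view.
# Assumes a NeRF-style transforms.json per object folder (an assumption, not
# necessarily the repository's exact format).
import json
import os

import numpy as np

def json_to_npy(object_dir: str) -> None:
    with open(os.path.join(object_dir, "transforms.json")) as f:
        meta = json.load(f)
    for frame in meta["frames"]:
        # e.g. "000.png" -> "000.npy"
        stem = os.path.splitext(os.path.basename(frame["file_path"]))[0]
        pose = np.array(frame["transform_matrix"], dtype=np.float32)  # 4x4 cam-to-world
        np.save(os.path.join(object_dir, stem + ".npy"), pose[:3, :])  # 3x4 [R|t]

if __name__ == "__main__":
    data_root = "processed_data"  # hypothetical output directory from the previous step
    for obj in sorted(os.listdir(data_root)):
        json_to_npy(os.path.join(data_root, obj))
```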
Copy the data over to the GPU server. Clone the zero123 repository and upload the Zero123-XL checkpoint. Make sure to update the `simple.py` file to adjust the train/validation split (see the sketch below). Detailed instructions on the changes made to the repository are included in `changes.md`.
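For illustration, the split adjustment amounts to partitioning the object folders, roughly as follows (variable names are hypothetical; the real logic lives inside zero123's data module in `simple.py`):

```python
# Hypothetical sketch of a train/validation split over object folders.
# zero123's ldm/data/simple.py implements this inside its data module,
# with different names and structure.
import os
import random

data_root = "processed_data"           # hypothetical dataset root
paths = sorted(os.listdir(data_root))  # one folder of 48 views per object
random.seed(42)
random.shuffle(paths)

n_val = max(1, len(paths) // 10)       # e.g. hold out ~10% of objects for validation
val_paths, train_paths = paths[:n_val], paths[n_val:]
print(f"{len(train_paths)} train / {len(val_paths)} val objects")
```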
Run the finetuning (download any missing packages if prompted) and save the finetuned model:

```bash
python main.py \
    -t \
    --base configs/sd-objaverse-finetune-c_concat-256.yaml \
    --gpus 0,1,2,3,4 \
    --scale_lr False \
    --num_nodes 1 \
    --seed 42 \
    --check_val_every_n_epoch 10 \
    --finetune_from zero123-xl.ckpt
```
If there are GPU memory issues during finetuning, one workaround is to locate `fit_loop.py` in the installed pytorch-lightning package and add a `torch.cuda.empty_cache()` call inside its `current_epoch` function.
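A minimal sketch of that patch (the exact file location and surrounding code depend on your pytorch-lightning version):

```python
# Inside pytorch-lightning's fit_loop.py, at the top of the current_epoch
# function (exact location varies by version): flush the CUDA cache so
# fragmented GPU memory is released back to the allocator between epochs.
import torch

if torch.cuda.is_available():
    torch.cuda.empty_cache()  # free unused cached blocks held by the CUDA allocator
```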
Clone the DreamGaussian repository (https://github.com/dreamgaussian/dreamgaussian). Then upload the checkpoint and config files, and adjust the model paths in `main.py`, `main2.py`, and any other files needed so that the finetuned model is used instead of the original Zero123-XL (see the sketch below).
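One way to organize that change (names and paths here are hypothetical; adapt them to wherever DreamGaussian actually loads the Zero123 weights):

```python
# Hypothetical helper for main.py / main2.py: resolve the diffusion-prior
# checkpoint so both entry points pick up the finetuned weights. The path
# and environment variable are illustrative, not part of DreamGaussian.
import os

FINETUNED_CKPT = os.environ.get("DREAMSAT_CKPT", "checkpoints/dreamsat_zero123xl.ckpt")

def resolve_zero123_checkpoint() -> str:
    """Return the checkpoint path the Zero123 guidance should load."""
    if not os.path.exists(FINETUNED_CKPT):
        raise FileNotFoundError(f"Finetuned checkpoint not found: {FINETUNED_CKPT}")
    return FINETUNED_CKPT
```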
CLIP similarity can be calculated by running `python -m kiui.cli.clip_sim example_rgba.png example.obj`. The script to calculate LPIPS, PSNR, and SSIM is provided in this repository.
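For reference, those three metrics can be computed along these lines (a minimal sketch using `scikit-image` and the `lpips` package, assuming paired, same-size rendered and ground-truth RGB images; the LPIPS backbone choice is an assumption, and the repository's own script may differ):

```python
# Sketch: PSNR/SSIM via scikit-image, LPIPS via the lpips package.
# Filenames are placeholders for a rendered view and its ground truth.
import lpips
import numpy as np
import torch
from PIL import Image
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

pred = np.array(Image.open("rendered.png").convert("RGB"))
gt = np.array(Image.open("ground_truth.png").convert("RGB"))

psnr = peak_signal_noise_ratio(gt, pred, data_range=255)
ssim = structural_similarity(gt, pred, channel_axis=-1, data_range=255)

# LPIPS expects NCHW float tensors scaled to [-1, 1]
def to_tensor(im: np.ndarray) -> torch.Tensor:
    return torch.from_numpy(im).permute(2, 0, 1)[None].float() / 127.5 - 1.0

loss_fn = lpips.LPIPS(net="vgg")  # backbone choice is an assumption
lpips_val = loss_fn(to_tensor(pred), to_tensor(gt)).item()

print(f"PSNR: {psnr:.2f} dB  SSIM: {ssim:.4f}  LPIPS: {lpips_val:.4f}")
```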
If you find this research useful, please cite our work:
```bibtex
@inproceedings{dreamsat,
  author    = {Mathihalli, Nidhi and Wei, Audrey and Lavezzi, Giovanni and Mun Siew, Peng and Rodriguez-Fernandez, Victor and Urrutxua, Hodei and Linares, Richard},
  year      = {2024},
  month     = {10},
  booktitle = {75th International Astronautical Congress 2024},
  publisher = {International Astronautical Federation},
  address   = {Milan, Italy},
  title     = {DreamSat: Towards a General 3D Model for Novel View Synthesis of Space Objects}
}
```
Link to the paper: arXiv | ResearchGate
Research was sponsored by the Department of the Air Force Artificial Intelligence Accelerator and was accomplished under Cooperative Agreement Number FA8750-19-2-1000. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Department of the Air Force or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein.
The authors acknowledge the MIT SuperCloud for providing HPC resources that have contributed to the research results reported within this paper.
H.U. wishes to acknowledge support through the research grant TED2021-132099B-C32 funded by MCIN/AEI/10.13039/501100011033 and the "European Union NextGenerationEU/PRTR".