Hi, thank you for this excellent work!
When attempting to reproduce the results in Table 1 using the official AnySplat weights from HuggingFace on the VRNerf dataset, I encountered significantly higher LPIPS values than those reported in the paper.
| Scene | 32-view PSNR | 32-view SSIM | 32-view LPIPS | 48-view PSNR | 48-view SSIM | 48-view LPIPS | 64-view PSNR | 64-view SSIM | 64-view LPIPS |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| apartment | 23.62 | 0.790 | 0.231 | 26.42 | 0.847 | 0.208 | 24.17 | 0.789 | 0.241 |
| kitchen | 20.00 | 0.810 | 0.378 | 21.80 | 0.827 | 0.346 | 20.99 | 0.816 | 0.364 |
| raf_furnished room | 26.87 | 0.849 | 0.248 | 23.49 | 0.819 | 0.292 | 25.45 | 0.827 | 0.295 |
| workshop | 21.98 | 0.725 | 0.298 | 24.40 | 0.753 | 0.285 | 22.90 | 0.723 | 0.299 |
| average | 23.1175 | 0.7935 | 0.28875 | 24.0275 | 0.8115 | 0.28275 | 23.3775 | 0.78875 | 0.29975 |
I ran the evaluation on an A100 GPU, and my environment (including the dependencies listed in requirements.txt) matched the project specifications. The command used was:
```bash
python src/eval_nvs.py --data_dir ./data/vrnerf/64/workshop --output_path ./output/official
```
Could you clarify whether there is an issue with my evaluation process, or whether specific parameters or preprocessing steps could account for this discrepancy in the LPIPS metric?
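As a sanity check on my side, I also recomputed LPIPS directly on a rendered/ground-truth pair. Below is a minimal sketch using the standard `lpips` package; the file paths are placeholders, and I do not know which backbone or input normalization `eval_nvs.py` uses internally, so those choices are assumptions on my part:

```python
# Sanity check: recompute LPIPS on one rendered/GT pair with an explicit
# backbone and input range. The file names are placeholders, and the 'vgg'
# backbone is an assumption -- I don't know what eval_nvs.py uses internally.
import numpy as np
import torch
import lpips
from PIL import Image

def load_as_tensor(path: str) -> torch.Tensor:
    """Load an RGB image and scale it to [-1, 1], the range lpips expects."""
    img = np.asarray(Image.open(path).convert("RGB"), dtype=np.float32) / 255.0
    t = torch.from_numpy(img).permute(2, 0, 1).unsqueeze(0)  # (1, 3, H, W)
    return t * 2.0 - 1.0

# 'vgg' and 'alex' backbones produce noticeably different absolute values.
loss_fn = lpips.LPIPS(net="vgg")

pred = load_as_tensor("render_000.png")  # placeholder paths
gt = load_as_tensor("gt_000.png")
with torch.no_grad():
    print("LPIPS:", loss_fn(pred, gt).item())
```

If the paper's evaluation used a different backbone (e.g. `alex` instead of `vgg`) or a [0, 1] input range, that alone could plausibly account for a systematic LPIPS offset, which is why I am asking about the exact settings.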