Fix: Models always loaded on "cuda:0" when working inside subprocesses on multi-GPU setups #1230
Problem
When torch and SAHI are imported inside a multiprocessing subprocess on a multi-GPU machine, sometimes only a single GPU is visible to that subprocess, so torch.cuda.device_count() returns 1. In that situation I had problems loading the model onto the correct device: SAHI performs an extra verification, checking that the requested CUDA device index is smaller than the number of visible CUDA devices. Since the subprocess only sees one GPU (even though the system has several), requesting "cuda:1" or higher makes the verification fail and SAHI silently falls back to cuda:0 (the default).
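A minimal sketch of the failure mode (the worker function and the CUDA_VISIBLE_DEVICES pinning are illustrative, not SAHI code): the subprocess only sees one device, so a guard that compares the requested index against torch.cuda.device_count() rejects anything above index 0.

```python
import multiprocessing as mp
import os

import torch


def worker(requested_device: str) -> None:
    # The parent pinned this process to a single GPU, so only one
    # device is visible here even though the machine has several.
    visible = torch.cuda.device_count()  # 1 inside the pinned subprocess
    index = int(requested_device.split(":")[-1])
    # A guard of the form "index < device_count()" rejects cuda:1 and
    # above, which is what forced the silent fallback to cuda:0.
    if index >= visible:
        print(f"{requested_device} rejected ({visible} visible device(s)), falling back to cuda:0")
    else:
        print(f"loading on {requested_device}")


if __name__ == "__main__":
    os.environ["CUDA_VISIBLE_DEVICES"] = "1"  # pin the subprocess to physical GPU 1
    ctx = mp.get_context("spawn")
    p = ctx.Process(target=worker, args=("cuda:1",))
    p.start()
    p.join()
```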
In line 88 of utils/torch_utils.py, I removed the extra verification that prevented loading the model onto the right GPU (see the sketch after the key-change list below).
Key change:
Skip the global check against torch.cuda.device_count().
Trust the device string/int provided by the user ("cuda:0", 0, "cuda:1", etc.).
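Roughly, the change amounts to the following (a hedged sketch of the device-selection logic, not the literal diff in utils/torch_utils.py):

```python
import torch


def select_device(device="cuda:0"):
    """Resolve a user-supplied device ("cuda:0", 0, "cuda:1", "cpu", ...) to a torch.device."""
    if isinstance(device, int):  # allow bare integer indices
        device = f"cuda:{device}"

    # Before the change, a guard along these lines rejected any CUDA index
    # >= torch.cuda.device_count(); inside a subprocess that only sees one
    # GPU, that meant "cuda:1" silently became "cuda:0":
    #
    #   if index >= torch.cuda.device_count():
    #       device = "cuda:0"
    #
    # After the change, the device string/int supplied by the user is
    # trusted and handed straight to torch.
    return torch.device(device)
```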
This makes SAHI compatible with the standard multi-GPU pattern where each worker process is pinned to a GPU using CUDA_VISIBLE_DEVICES.
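For reference, a sketch of that pattern with this fix in place. AutoDetectionModel.from_pretrained is SAHI's public loader; the model_type and model_path values are placeholders to adapt to your own setup.

```python
import multiprocessing as mp
import os


def worker(gpu_id: int, model_path: str) -> None:
    # Pin this worker to one physical GPU before torch initializes CUDA;
    # inside the worker that GPU is then addressed as cuda:0.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)

    from sahi import AutoDetectionModel

    # With this fix, SAHI trusts the device we pass instead of re-checking
    # it against torch.cuda.device_count() inside the subprocess.
    model = AutoDetectionModel.from_pretrained(
        model_type="yolov8",    # placeholder, use your detector type
        model_path=model_path,  # placeholder path to local weights
        device="cuda:0",        # the only device visible to this worker
    )
    # ... run sliced prediction with this worker's model here ...


if __name__ == "__main__":
    ctx = mp.get_context("spawn")
    workers = [
        ctx.Process(target=worker, args=(gpu_id, "weights/model.pt"))
        for gpu_id in range(2)  # one worker per GPU
    ]
    for p in workers:
        p.start()
    for p in workers:
        p.join()
```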