
Conversation

@jmdelahanty
Contributor

Description

Imports subprocess to invoke an nvidia-smi query that asks for GPU indices, free/available memory, and total memory on each card. Returns a dictionary keyed by GPU index, with the fraction of GPU memory available as the value.
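For reference, a minimal sketch of what such a helper could look like. The nvidia-smi query flags are real, but the function and helper names here (`get_gpu_memory`, `parse_gpu_memory`) and the exact parsing are an illustration, not necessarily the merged implementation:

```python
import subprocess


def parse_gpu_memory(csv_text: str) -> dict:
    """Parse `nvidia-smi --query-gpu` CSV output into
    {gpu_index: fraction of memory available}."""
    memory_dict = {}
    for line in csv_text.strip().splitlines():
        gpu_id, free_mib, total_mib = [field.strip() for field in line.split(",")]
        # Fraction of the card's memory currently available.
        memory_dict[int(gpu_id)] = round(int(free_mib) / int(total_mib), 4)
    return memory_dict


def get_gpu_memory() -> dict:
    """Poll nvidia-smi for per-GPU free and total memory."""
    output = subprocess.check_output(
        [
            "nvidia-smi",
            "--query-gpu=index,memory.free,memory.total",
            "--format=csv,noheader,nounits",
        ],
        encoding="utf-8",
    )
    return parse_gpu_memory(output)
```

Splitting out the parsing step also makes the function testable on machines without a GPU, since the CSV text can be faked.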

Types of changes

  • Bugfix
  • New feature
  • Refactor / Code style update (no logical changes)
  • Build / CI changes
  • Documentation Update
  • Other (explain)

Does this address any currently open issues?

Nope!

Outside contributors checklist

  • Review the guidelines for contributing to this repository
  • Read and sign the CLA and add yourself to the authors list
  • Make sure you are making a pull request against the develop branch (not main), and that you started your branch off of develop
  • Add tests that prove your fix is effective or that your feature works
    Are tests required for this function and, if so, what should they look like?
  • Add necessary documentation (if appropriate)
    Added documentation in the function itself; down to write docs elsewhere if you'd like!

Thank you for contributing to SLEAP!

❤️

Add get_gpu_memory function for polling GPUs on a machine and their available vRAM.
Add newline to end of file
Add my name to Authors markdown per SLEAP outside contributor guidelines.
@codecov

codecov bot commented Aug 13, 2022

Codecov Report

Merging #911 (0c7c955) into develop (4de5213) will decrease coverage by 0.04%.
The diff coverage is 10.00%.

@@             Coverage Diff             @@
##           develop     #911      +/-   ##
===========================================
- Coverage    67.63%   67.58%   -0.05%     
===========================================
  Files          130      130              
  Lines        22209    22226      +17     
===========================================
+ Hits         15020    15022       +2     
- Misses        7189     7204      +15     
Impacted Files Coverage Δ
sleap/nn/inference.py 79.35% <0.00%> (-0.19%) ⬇️
sleap/nn/training.py 59.95% <0.00%> (-0.23%) ⬇️
sleap/nn/system.py 43.05% <18.18%> (-4.49%) ⬇️


@roomrys
Contributor

roomrys commented Aug 18, 2022

TODO to make this usable:

Add a --gpu auto option to sleap-train that uses this function to select the GPU with the lowest memory usage. It would then be called here:

sleap/sleap/nn/training.py

Lines 1932 to 1941 in 2688d56

    else:
        if args.first_gpu:
            sleap.nn.system.use_first_gpu()
            logger.info("Using the first GPU for acceleration.")
        elif args.last_gpu:
            sleap.nn.system.use_last_gpu()
            logger.info("Using the last GPU for acceleration.")
        else:
            sleap.nn.system.use_gpu(args.gpu)
            logger.info(f"Using GPU {args.gpu} for acceleration.")

Something like:

        if args.first_gpu:
            sleap.nn.system.use_first_gpu()
            logger.info("Using the first GPU for acceleration.")
        elif args.last_gpu:
            sleap.nn.system.use_last_gpu()
            logger.info("Using the last GPU for acceleration.")
        else:
            if args.gpu == "auto":
                # Values are fractions of memory *available*, so the
                # least-used GPU has the largest value (argmax, not argmin).
                free = sleap.nn.system.get_gpu_memory()
                gpu_ind = max(free, key=free.get)
            else:
                gpu_ind = int(args.gpu)
            sleap.nn.system.use_gpu(gpu_ind)
            logger.info(f"Using GPU {gpu_ind} for acceleration.")
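The selection step on its own, as a hedged sketch: given a get_gpu_memory-style dict of {index: fraction available}, the least-used GPU is the one with the *largest* available fraction. Taking the max over the dict keys (rather than argmin over a list of values) also handles non-contiguous GPU indices. The helper name here is hypothetical:

```python
def pick_least_used_gpu(memory_dict: dict) -> int:
    """Return the GPU index with the largest fraction of memory
    available, i.e. the lowest memory usage."""
    # max() over the keys, ranked by each key's available fraction,
    # returns the actual GPU index even if indices are not 0..N.
    return max(memory_dict, key=memory_dict.get)


# Example: GPU 1 is fully free, GPU 0 is three-quarters used.
print(pick_least_used_gpu({0: 0.25, 1: 1.0}))  # → 1
```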

And then a similar setup for inference here:

sleap/sleap/nn/inference.py

Lines 4270 to 4279 in 2688d56

    # Setup devices.
    if args.cpu or not sleap.nn.system.is_gpu_system():
        sleap.nn.system.use_cpu_only()
    else:
        if args.first_gpu:
            sleap.nn.system.use_first_gpu()
        elif args.last_gpu:
            sleap.nn.system.use_last_gpu()
        else:
            sleap.nn.system.use_gpu(args.gpu)

Comment on lines 226 to 227
# Append percent of GPU available to GPU ID
memory_dict[gpu_id] = round(int(available_memory) / int(total_memory), 4)
Contributor

@roomrys roomrys Aug 24, 2022


Do we want this to be a percentage?

EDIT: Do we want this to be a fraction instead of just the available_memory?

Contributor Author


I don't know! I wasn't sure if having it as a percentage would be helpful/more readable/understandable so I just left it that way.

Contributor Author


I also realize now that the comment says percentage but it's not a percent value that's given, so we can change the comment or the value. I have no preference and also don't know what best practice is lol

Contributor Author


I think as it is currently written it is a fraction, no? The ratio:

Available Memory / Total Memory
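For concreteness, the two representations under discussion, with made-up numbers (2000 MiB free of 8000 MiB total):

```python
available_memory, total_memory = 2000, 8000  # MiB, illustrative values

# What the code currently stores: a fraction in [0, 1].
fraction = round(available_memory / total_memory, 4)

# What the comment says: a percentage in [0, 100].
percentage = round(100 * available_memory / total_memory, 2)

print(fraction, percentage)  # → 0.25 25.0
```

Either works for picking the least-used GPU; the comment and the value just need to agree.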

@roomrys roomrys requested a review from talmo August 24, 2022 20:50
@roomrys
Contributor

roomrys commented Aug 24, 2022

Although we cannot test GPU features through GitHub Actions yet, I tested the new additions locally and they passed.

@jmdelahanty
Contributor Author

Neato!

@jmdelahanty
Contributor Author

This is so much cleaner! I don't know why I didn't think of using the list indices as the index of the card. That makes a lot more sense. Nice one Liezl!
