
@joaander
Member

@joaander joaander commented Apr 14, 2025

Description

Allow users to request memory (in MB) specific to each action. Such a request is redundant when the partition configuration already includes it. Row warns the user when they request less than they could and errors when they request more. On partitions with no explicit memory request, the action-specific request will be passed through.
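The rule described above can be sketched roughly as follows. This is an illustrative Python sketch, not Row's implementation: the function name `resolve_memory_per_cpu_mb` and the choice to pass the partition's value to the scheduler when the action requests less are assumptions made for this example.

```python
import warnings
from typing import Optional


def resolve_memory_per_cpu_mb(
    action_mb: Optional[int], partition_mb: Optional[int]
) -> Optional[int]:
    """Return the per-CPU memory request (in MB) to hand to the scheduler.

    Hypothetical helper: names and return convention are assumptions.
    """
    if action_mb is None:
        # The action made no request; use the partition's value, if any.
        return partition_mb
    if partition_mb is None:
        # No explicit memory request on the partition: pass through.
        return action_mb
    if action_mb > partition_mb:
        # Requesting more than the partition provides is an error.
        raise ValueError(
            f"action requests {action_mb} MB per CPU, "
            f"but the partition provides {partition_mb} MB"
        )
    if action_mb < partition_mb:
        # Requesting less than is available only earns a warning.
        warnings.warn(
            f"action requests {action_mb} MB per CPU, "
            f"less than the {partition_mb} MB the partition provides"
        )
    # Assumption: the partition's own value is what gets submitted.
    return partition_mb
```

For example, `resolve_memory_per_cpu_mb(2048, None)` passes the action's 2048 MB through unchanged, while `resolve_memory_per_cpu_mb(4096, 1970)` raises an error.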

Motivation and context

Some clusters allow users to request any amount of memory, therefore a single default value is not appropriate.

Resolves #84.

How has this been tested?

  • New unit tests pass.
  • Validate Anvil
  • Validate Great Lakes
  • Validate Delta
  • Checked the HTML doc formatting.

Checklist:

  • I have reviewed the Contributor Guidelines.
  • I agree with the terms of the Row Contributor Agreement.
  • My name is on the list of contributors (doc/src/contributors.md) in the pull request source branch.
  • I have added a change log entry to doc/src/release-notes.md.

To prepare to allow actions to request memory:

Store the memory request as an integer number of megabytes. The
scheduler can then validate the action's request compared to the
cluster's request (if any) and warn the user appropriately.
Pass through the memory request when there is no request defined in the
partition.

When there is a request in the partition, warn the user if they request
less and error when they request more.
Anvil rewrites user requests to claim more CPUs when the memory goes
above the allowed value per core. Previously, we avoided setting this
because the number changes. Now, with the ability to set
`memory_per_cpu_mb`, we need to set the memory level to prevent users
from accidentally submitting invalid jobs.
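To illustrate why an unchecked per-CPU memory request matters on a cluster like Anvil, here is a rough sketch of the arithmetic. The exact rewrite rule the cluster applies is not given in this PR; the ceiling-division rule below and the function name `cpus_charged` are assumptions, and the 1970 MB limit is taken from the discussion further down.

```python
ANVIL_MB_PER_CORE = 1970  # per-core limit mentioned later in this thread


def cpus_charged(cpus: int, memory_per_cpu_mb: int,
                 limit_mb: int = ANVIL_MB_PER_CORE) -> int:
    """Estimate the CPUs a job would claim after the cluster's rewrite.

    Assumed rule: the job is given enough CPUs that the total memory
    divided by the CPU count does not exceed the per-core limit.
    """
    total_mb = cpus * memory_per_cpu_mb
    # Integer ceiling division without floating point.
    needed = -(-total_mb // limit_mb)
    return max(cpus, needed)


# 4 CPUs at 4000 MB each is 16000 MB in total, which needs 9 cores at 1970 MB.
print(cpus_charged(4, 4000))  # 9
# Staying at the limit leaves the CPU count unchanged.
print(cpus_charged(4, 1970))  # 4
```

Under this assumed rule, a small over-request silently more than doubles the CPU count, which is the kind of accidental submission the fixed memory level is meant to prevent.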
@joaander
Member Author

@bcrawford39GT, let me know if this will meet your needs.

This previously worked, but no longer seems to.
@bcrawford39GT

bcrawford39GT commented Apr 16, 2025

@joaander Hey!

Thanks for adding this! This looks OK to me in general. Thinking about other users too, the following may make it easier for them:

  • Maybe change MB to GB and/or just allow a string input so they can select either

Thoughts?

@joaander
Member Author

The numerical value is needed to ensure that the user does not request more resources than are available on clusters that do charge extra for memory. GB is not possible because I need MB resolution on some clusters (for example, 1970M on Anvil). Writing a validator and unit conversion code that can process arbitrary strings is not a valuable use of developer time.

If you want to pass a string through to SLURM, then you can use submit_options.<cluster_name>.custom. That option was available before this PR.

@bcrawford39GT

@joaander OK. That makes sense. I think this looks OK then.

@joaander
Member Author

Thanks for your review. I'll release a new version soon.

@joaander joaander merged commit a73693d into trunk Apr 16, 2025
17 checks passed
@joaander joaander deleted the action-memory branch April 16, 2025 15:01
Development

Successfully merging this pull request may close these issues.

Add 'memory_per_cpu' and 'memory_per_gpu' variables/options to the workflow.toml file