A job manager for Slurm, written in Go.
Requirements:
- Go >= 1.18
- GNU Make
- Slurm commands on the host that will run Slurmer and be connected to the cluster
First, you have to compile the code to get both slurmer and executor executables. A Makefile is available to do so:
makeThere is few variables that you can override to specify custom options:
| Variable name | Description | Default |
|---|---|---|
| MIN_UID, MAX_UID | the minimum and maximum uids that the executor can spoof | 1000, 65565 |
| SLURMER_UID | The user that is authorized to use the executor | 1000 |
| PREFIX | The path prefix to install slurmer to | /usr/local |
Don't forget to make clean between variable changes to make sure that your changes have taken effect.
make MIN_UID=1100 MAX_UID=1200 SLURMER_UID=1099Then you can install the executables to the destination directory (need root permissions):
make PREFIX=/opt/slurmer installNow Slurmer should be installed on your system. Next we'll see how to use it.
For this section, let's consider that slurmer is installed to prefix /usr/local/. Let's say that we have the configuration in /usr/local/etc/slurmer/config.yml (see sample configuration) and the Slurmer files have to be in /var/lib/slurmer. Make sure that only Slurmer account can edit config.yml.
First, create a directory /var/lib/slurmer and set the owner to the Slurmer user (we assume that slurmer is the Slurmer user):
mkdir /var/lib/slurmer
chown -R slurmer /var/lib/slurmer/ /usr/local/etc/slurmer/
# make sure that only slurmer can access and modify these files
chmod 0700 /var/lib/slurmer/ /usr/local/etc/slurmer/
chmod 0600 /usr/local/etc/slurmer/config.ymlNext, set the value working_dir in config.yml to /var/lib/slurmer.
Now you can run Slurmer with the following line to consider the given configuration (run with the Slurmer user):
slurmer -c /usr/local/etc/slurmer/config.ymlA sample config file is also available in configs/config-example.yml.
# Useful when 'connector' is set to 'slurmrest'; not implemented yet
slurmrest:
url: "http+unix:///default?socket=/tmp/slurmrestd.sock"
slurmer:
ip: 127.0.0.1
port: 8080
connector: slurmcli
working_dir: /var/lib/slurmer
templates_dir: /etc/slurmer/templates
executor_path: /usr/bin/executor
applications:
- name: app1
token: averycomplicatedchallenge
uuid: a9a5fc66-9bed-4a13-874a-d8d7d1756224 # a random app uuid
logs:
# available: text, json
format: text
stdout: false
output: /var/logs/slurmer.log
# available, from most to less verbose:
# trace, debug, info, warning, error, fatal, panic
level: debug
# To enable OIDC
oidc:
enabled: false
issuer: ""
audience: ""Each request must be authenticated with an application token and eventually a user for jobs related resources.
| Header | Description |
|---|---|
| X-Auth-Token | Authenticates an application |
| User | The user that owns a job |
| Verb | Route | Description |
|---|---|---|
| GET | /apps | list all applications, need admin access token |
| POST | /apps | add a new application, need admin access token |
| GET | /apps/{appId} | show {appId} application, need admin access token |
| GET | /apps/{appId}/jobs | list all jobs of the app {appId} |
| POST | /apps/{appId}/jobs | add a new job to app {appId} |
| GET | /apps/{appId}/jobs/{jobId} | show job {jobId} details |
| DELETE | /apps/{appId}/jobs/{jobId} | delete a registered job |
| GET | /apps/{appId}/jobs/{jobId}/batch | read jobs batch file |
| GET | /apps/{appId}/jobs/{jobId}/out | read jobs standard output |
| GET | /apps/{appId}/jobs/{jobId}/files | download the job files as zip |
| POST | /apps/{appId}/jobs/{jobId}/files | upload a zip to job directory |
| PUT | /apps/{appId}/jobs/{jobId}/status | start or stop a job |
| PUT | /apps/{appId}/jobs/{jobId}/prune | clear all job data except batch file |
| PUT | /apps/{appId}/jobs/{jobId}/start | start a job |
| PUT | /apps/{appId}/jobs/{jobId}/stop | stop a job |