
Clustermq multiprocess for main jobs and ssh for worker jobs #198

@mattwarkentin


Hi Will,

I haven't thought this through well enough to really assess its feasibility, but I wanted to scribble my thoughts down and get your input. Not sure if we need to loop @mschubert in or not.

As you know, for one of my current projects I am using the "mostly local, sometimes remote" approach - my project lives on my local machine, but some computationally intensive tasks are selectively sent to the HPC via SSH thanks to clustermq. This works great.
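For context, the "sometimes remote" part of that setup boils down to pointing clustermq at the cluster over SSH. A minimal sketch of the relevant options (the hostname and log path are placeholders, not from this project):

```r
# In .Rprofile or at the top of the pipeline script.
# "user@hpc-login" is a hypothetical login node.
options(
  clustermq.scheduler = "ssh",
  clustermq.ssh.host  = "user@hpc-login",
  clustermq.ssh.log   = "~/cmq_ssh.log"  # optional: capture remote worker output for debugging
)
```

With this in place, clustermq tunnels jobs through the SSH connection to the scheduler configured on the remote side.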

However, when using options(clustermq.scheduler = "ssh"), there are only two options: run a job locally and sequentially in the "main" process, or send it to the HPC via SSH. The majority of my targets run in the main R process and are forced to run sequentially, all for the ability to send a few select jobs to the HPC.

So, long story short: I am wondering if it would somehow be possible to use "multiprocess" for targets with deployment = "main" and "ssh" for targets with deployment = "worker". I know this is a convoluted use case, but I am actually constrained to this workflow for this particular project, and I was wondering whether something like that could possibly work.
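To make the split concrete, here is a sketch of how the two kinds of targets look in a `_targets.R` file today, assuming the {targets} API (the target names and functions are illustrative, not from my actual project):

```r
library(targets)

list(
  # Trivial/local target: runs in the main process — currently forced
  # to run sequentially alongside every other deployment = "main" target.
  tar_target(local_prep, preprocess(raw_data), deployment = "main"),

  # Heavy target: dispatched to the HPC over SSH via clustermq.
  tar_target(big_model, fit_model(local_prep), deployment = "worker")
)
```

The ask is essentially that the deployment = "main" targets above could use a local multiprocess backend instead of running one at a time.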

Reasons why I don't just run everything via ssh:

  1. Some of the tasks are trivial and quick, and the overhead of sending them to the HPC over sockets is unnecessary

  2. Some of the targets rely on the local NFS for access to files which cannot be moved to the cluster or cloud

Reasons why I don't just run everything locally:

  1. Memory constraints on my local machine

  2. The computationally intensive tasks are fewer in number, but they sometimes take days to run
