drmaa errors- resubmit/retry #116

@cchng

Hi ruffus team,

I'm using the drmaa wrapper to submit and run jobs on an SGE cluster. I'm running into communication exceptions that I've been working to resolve (related issue: aws/aws-parallelcluster#1592). Has the ruffus team encountered this error? If not, is there a resubmit/retry feature that is ready to use? Although it isn't explicitly documented, it looks like the run_job function takes a resubmit parameter.

[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] File "/shared/amgenesis/helpers.py", line 126, in run
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] cmdline.run (options, logger=logger_proxy, multithread = options.jobs, exceptions_terminate_immediately = True)
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] File "/home/ec2-user/anaconda3/lib/python3.7/site-packages/ruffus/cmdline.py", line 834, in run
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] **appropriate_options)
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] File "/home/ec2-user/anaconda3/lib/python3.7/site-packages/ruffus/task.py", line 5424, in pipeline_run
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] raise job_errors
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] ruffus.ruffus_exceptions.RethrownJobError:
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] Original exception:
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] Exception #1
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] 'drmaa.errors.DrmCommunicationException(code 2: failed receiving gdi request response for mid=65535 (can't send response for this message id - protocol error).)' raised in ...
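
To make the request concrete, here is a rough sketch of the kind of retry behaviour I have in mind, written as a generic wrapper rather than anything built into ruffus. The helper name `retry_call` and its parameters (`attempts`, `delay`, `backoff`, `sleep`) are made up for illustration; they are not part of ruffus or drmaa, and this does not rely on the undocumented resubmit parameter mentioned above.

```python
import time


def retry_call(func, retry_on, attempts=3, delay=1.0, backoff=2.0, sleep=time.sleep):
    """Call func() and retry when it raises an exception listed in retry_on.

    Hypothetical helper for illustration only; not part of ruffus or drmaa.
    retry_on is an exception class or a tuple of classes, e.g. the
    DrmCommunicationException shown in the traceback.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    wait = delay
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error unchanged
            sleep(wait)      # pause before the next try
            wait *= backoff  # wait longer after each failure
```

In my pipeline this would presumably wrap the submission call, something like `retry_call(lambda: run_job(...), drmaa.errors.DrmCommunicationException)`. One caveat I'm unsure about: if the communication error is raised after the scheduler has already accepted the job, a blind retry could submit the same job twice, so this may only be safe for idempotent jobs.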
