[Process] add a process manager allowing to run commands in parallel (queueing up if needed) #8454

Closed
gggeek opened this issue Jul 9, 2013 · 24 comments

@gggeek

gggeek commented Jul 9, 2013

A common pattern when writing batch scripts that import a huge amount of existing data into an app is to write the batch script so that it can execute its work in parallel. Running the script with e.g. 8 instances in parallel allows the import task to finish in a fraction of the time.
Depending on the platform in use, the developer might use forking, threading, pcntl and a myriad of other techniques to achieve parallelism. There's no rocket science in that code, but it is still quite a chore and bug-prone.

I suggest that the Sf Process component gets extended with a process manager class, which can be used to execute multiple processes in parallel.

Sample code - not based on Sf currently - is available at https://gist.github.com/gggeek/5956177 for discussion
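
To make the request concrete, here is a minimal sketch of the idea. It is written in Python purely as a language-neutral illustration; it is not taken from the gist above and does not reflect any Symfony API. The names `ProcessManager`, `max_parallel`, `poll_interval` and `run_all` are hypothetical. It starts at most `max_parallel` commands at once and keeps the rest queued until a slot frees up.

```python
import subprocess
import sys
import time
from collections import deque


class ProcessManager:
    """Hypothetical sketch: run commands in parallel, queueing the overflow."""

    def __init__(self, max_parallel=8, poll_interval=0.05):
        if max_parallel < 1:
            raise ValueError("max_parallel must be at least 1")
        self.max_parallel = max_parallel
        self.poll_interval = poll_interval

    def run_all(self, commands):
        """Run every command; return exit codes in the order given."""
        commands = list(commands)
        queue = deque(enumerate(commands))   # commands waiting for a slot
        running = {}                         # index -> subprocess.Popen
        exit_codes = [None] * len(commands)

        while queue or running:
            # Start queued commands while there are free slots.
            while queue and len(running) < self.max_parallel:
                index, command = queue.popleft()
                running[index] = subprocess.Popen(command)

            # Collect the processes that have finished since the last pass.
            for index, process in list(running.items()):
                code = process.poll()
                if code is not None:
                    exit_codes[index] = code
                    del running[index]

            if running:
                time.sleep(self.poll_interval)

        return exit_codes


if __name__ == "__main__":
    # Eight short jobs, at most four running at any time.
    jobs = [[sys.executable, "-c", "import time; time.sleep(0.2)"] for _ in range(8)]
    print(ProcessManager(max_parallel=4).run_all(jobs))
```

Polling with a short sleep keeps the sketch portable; a real implementation would probably also want timeouts and output handling.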

@gggeek
Author

gggeek commented Jul 9, 2013

Note (more details on the rationale): to achieve parallelism, some developers rely on a master script that uses curl to execute worker scripts via HTTP calls to localhost. This is imho bad practice, as it forces the php.ini used for web purposes to have exceedingly high memory/timeout limits. It also keeps web worker processes occupied for a very long time.

@garak
Contributor

garak commented Oct 5, 2014

👍

2 similar comments
@keradus
Member

keradus commented Jan 6, 2015

👍

@halk

halk commented Oct 16, 2015

👍

@gggeek
Author

gggeek commented Oct 16, 2015

Well, time to get back from the grave I guess, 3 plusses mean that I have to turn this into a full-fledged Pull Request...

@javiereguiluz
Member

@gggeek maybe you should wait a bit more. Votes from the community are important, but the decisive votes are only the ones from the Symfony Core. So far none of them have voted on this.

@gggeek
Author

gggeek commented Oct 16, 2015

@javiereguiluz ok then

@xabbuh
Member

xabbuh commented Oct 20, 2015

I think that's a good idea. Having such a manager would allow you to get rid of a lot of boilerplate code you would otherwise have to write in your own application.

@mvrhov

mvrhov commented Oct 22, 2015

You mean this: https://github.com/kriswallsmith/spork ?

@stof
Member

stof commented Oct 22, 2015

@mvrhov this class is not about managing Process instances

@aitboudad
Contributor

maybe this https://github.com/liuggio/fastest/tree/master/src/Process will help you?

@gggeek
Author

gggeek commented Oct 22, 2015

About spork: it seems that it can indeed be used to achieve the same goal, or almost.
Not documented clearly in its readme, but in this post: https://blog.vandenbrand.org/2013/03/17/speed-up-your-data-migration-with-spork/

@fesor

fesor commented Oct 22, 2015

spork uses the fork model (which means that only PHP processes can be parallelized), while symfony/process can achieve pretty much the same goals with just proc_open. Also, spork can be used only in Unix-based environments.

I am currently working on a PoC implementation of a process manager for symfony/process with some IPC support via pipes (to add concurrency to Behat).

@keradus
Member

keradus commented Oct 22, 2015

It would be great to have multiple strategies to achieve the goal, like using pthread, fork, pipe, msg, whatever is available in the environment. And if there aren't any, fall back to running things in sequence, using the same interface.
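
A rough sketch of what "same interface, different strategies" could look like, in Python for illustration only and unrelated to any Symfony API. `SequentialRunner`, `ThreadedRunner` and `make_runner` are hypothetical names, and the way availability is decided here (a plain `max_parallel` number) is an assumption; a real implementation would probe the environment.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def _exit_code(command):
    # Run one command to completion and return its exit code.
    return subprocess.run(command).returncode


class SequentialRunner:
    """Fallback strategy: run the commands one after another."""

    def run_all(self, commands):
        return [_exit_code(command) for command in commands]


class ThreadedRunner:
    """Parallel strategy: each worker thread waits on one child process."""

    def __init__(self, max_parallel=4):
        self.max_parallel = max_parallel

    def run_all(self, commands):
        with ThreadPoolExecutor(max_workers=self.max_parallel) as pool:
            # pool.map returns results in the order of the input commands.
            return list(pool.map(_exit_code, commands))


def make_runner(max_parallel):
    """Pick a strategy; fall back to sequential when parallelism is off."""
    if max_parallel and max_parallel > 1:
        return ThreadedRunner(max_parallel)
    return SequentialRunner()


if __name__ == "__main__":
    jobs = [[sys.executable, "-c", "pass"] for _ in range(4)]
    print(make_runner(2).run_all(jobs))
```

Callers only see `run_all`, so swapping the strategy does not change the calling code.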

@fesor

fesor commented Oct 22, 2015

@keradus the thing is that pthreads, fork and proc_open serve different goals. For example, pthreads + coroutines could be used to achieve micro-threads functionality, fork can be used to speed up worker startup, and proc_open for anything else.

Currently there is no way to use symfony/process with long-lived processes. See #14482 for details. After a process is started, the only way to pass messages to it is via selectable streams (php://memory isn't selectable, so passing messages to processes causes additional I/O overhead, since these messages have to go through the file system).
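
To illustrate the general idea of passing messages to a long-lived child over pipes, here is a minimal sketch in Python. It says nothing about how symfony/process behaves; `start_worker`, `send` and `stop_worker` are hypothetical helpers, and the worker is a toy that uppercases each line it receives.

```python
import subprocess
import sys

# Toy long-lived worker: reads one message per line, answers one line each.
WORKER_SOURCE = """
import sys
for line in sys.stdin:
    print(line.strip().upper(), flush=True)
"""


def start_worker():
    """Start the worker with pipes attached to its stdin and stdout."""
    return subprocess.Popen(
        [sys.executable, "-u", "-c", WORKER_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        bufsize=1,  # line-buffered on the parent side
    )


def send(worker, message):
    """Write one message to the worker and read back its one-line reply."""
    worker.stdin.write(message + "\n")
    worker.stdin.flush()
    return worker.stdout.readline().rstrip("\n")


def stop_worker(worker):
    """Close stdin so the worker sees EOF, then wait for it to exit."""
    worker.stdin.close()
    code = worker.wait()
    worker.stdout.close()
    return code


if __name__ == "__main__":
    worker = start_worker()
    print(send(worker, "first message"))
    print(send(worker, "second message"))
    stop_worker(worker)
```

The same child answers several messages without being restarted, which is the property a long-lived process needs.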

I think that what should be done is a simple process manager that allows starting multiple processes via proc_open. This would cover most use cases and doesn't require any additional extensions (it is also the only cross-platform way to do this).

For example, I have this use case:
I work with a web service that generates images via wkhtmltoimage. For one request it generates 10 images. Right now this operation takes ~5 seconds. I could start several instances of wkhtmltoimage in parallel to speed things up. This is possible only via proc_open. You can't do this with just exec, fork or pthreads... well, you can do it via pthreads, but this won't be efficient.

@keradus
Member

keradus commented Oct 22, 2015

It is tricky, but a workers+jobs strategy may be done with threads and processes too.
But I agree, there are big differences, so creating a layer to make them work in a similar way is tricky.

@fesor

fesor commented Oct 22, 2015

creating a layer to make them work in a similar way is tricky

I think that this doesn't bring much benefit. proc_open covers most use cases, and it is available out of the box on both Unix and Windows. But a process manager (or process pool) should definitely be implemented.

@nicolas-grekas
Member

See #18513, finally

nicolas-grekas added a commit that referenced this issue Apr 13, 2016
…SKIP_OUT/ERR modes (nicolas-grekas)

This PR was merged into the 3.1-dev branch.

Discussion
----------

[Process] Turn getIterator() args to flags & add ITER_SKIP_OUT/ERR modes

| Q             | A
| ------------- | ---
| Branch?       | master
| Bug fix?      | no
| New feature?  | yes
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | #8454
| License       | MIT
| Doc PR        | -

Targeted at 3.1

Commits
-------

428f12e [Process] Turn getIterator() args to flags & add ITER_SKIP_OUT/ERR modes

@martinsik

Since this was scheduled for 3.1, which is out already, and PR #18776 mentions it, does it mean this issue is resolved?

@gggeek
Author

gggeek commented Oct 22, 2016

@martinsik tbh I am not sure. I looked at #18513, and all I can see is changes in the way process IO is handled, which allow setting up chained processes.
That might be beneficial to the development of a Process Manager, but it does not seem to include one.

Otoh I have found this extremely simple Process Manager that seems to fit the bill for my original usecase: https://github.com/jagandecapri/symfony-parallel-process

@Simperfit
Contributor

@gggeek Do you want to provide a PR for this ?

@nicolas-grekas
Member

To anyone wondering about this request, don't miss the comments in #23596

@carsonbot

Thank you for this suggestion.
There has not been a lot of activity here for a while. Would you still like to see this feature?

@chalasr
Member

chalasr commented Dec 20, 2020

I'm going to close this issue as it has had no activity for 3 years now.
Feel free to reopen or submit a PR if you can.
Thanks

@chalasr chalasr closed this as completed Dec 20, 2020