Beta: Execute ghe-restore tasks in parallel #601
LGTM
ghe-ssh "$GHE_HOSTNAME" -- /bin/bash 1>&3
done

ghe-ssh "$GHE_HOSTNAME" -- "sudo sh -c 'rm $GHE_REMOTE_DATA_USER_DIR/elasticsearch-restore/*.gz'" 1>&3
When ghe-restore-es-audit-log and ghe-restore-es-hookshot run at the same time during a parallel restore, there is a race condition: the ghe-ssh "$GHE_HOSTNAME" -- "sudo sh -c 'rm $GHE_REMOTE_DATA_USER_DIR/elasticsearch-restore/*.gz'" 1>&3 call in one script deletes the *.gz files that the other script has put down. I moved these files to separate directories so this can't happen. For the single-VM or configured edge case, my fix is to copy all files to the elasticsearch-restore folder so elasticsearch-post-start will process them accordingly.
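To illustrate the separate-directory idea, here is a minimal local sketch, not the actual change in this PR. It assumes a temporary directory in place of $GHE_REMOTE_DATA_USER_DIR, a hypothetical `restore_es_indices` function standing in for the two restore scripts, hypothetical `elasticsearch-restore-<name>` directory names, and plain bash background jobs instead of moreutils parallel. The copy into the shared elasticsearch-restore folder for the single-VM case is not modeled.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical local stand-in for $GHE_REMOTE_DATA_USER_DIR on the appliance.
data_dir="$(mktemp -d)"

# Each task stages, counts, and removes files only inside its own directory,
# so its cleanup glob can never match files staged by the other task.
restore_es_indices() {
  local name="$1"
  local stage="$data_dir/elasticsearch-restore-$name"  # hypothetical per-task dir
  local i
  mkdir -p "$stage"

  # Placeholder files standing in for the transferred index archives.
  for i in 1 2 3; do
    printf 'placeholder' > "$stage/$name-$i.gz"
  done

  # Stand-in for the real restore step: record how many archives were visible.
  local files=("$stage"/*.gz)
  echo "${#files[@]}" > "$data_dir/$name.count"

  # Scoped cleanup: the glob is limited to this task's directory.
  rm "$stage"/*.gz
}

# Run both tasks concurrently as background jobs, then wait for each one.
restore_es_indices audit-log &
pid_audit=$!
restore_es_indices hookshot &
pid_hookshot=$!
wait "$pid_audit"
wait "$pid_hookshot"

echo "audit-log: $(cat "$data_dir/audit-log.count") archives"
echo "hookshot: $(cat "$data_dir/hookshot.count") archives"
```

Because each `rm` only globs inside its own staging directory, both tasks see all three of their own archives no matter how the two jobs interleave. With a single shared directory, one task's cleanup could run before the other had processed its files.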
Looks good 👍
reverting changes to redis restart mechanism
Execute long-running restore tasks in parallel by leveraging moreutils parallel.
This resolves https://github.com/github/ghes-infrastructure/issues/407
Most of this is based on the work done in #597.