@rubiojr rubiojr commented Jun 21, 2016

We now run rsync once per available cluster node instead of once for each
Git repository.
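
As a rough illustration of the batching idea (not the actual backup-utils code, which is not shown in this PR description), the sketch below groups repository paths by destination node and builds a single rsync invocation per node using rsync's `--files-from` option. The node names, paths, routing table, and helper names are all hypothetical.

```python
from collections import defaultdict


def group_by_node(routes):
    """Map each node to the sorted list of repository paths it should receive.

    `routes` is an iterable of (repo_path, node) pairs. How a real cluster
    produces these pairs is outside the scope of this sketch.
    """
    per_node = defaultdict(list)
    for repo_path, node in routes:
        per_node[node].append(repo_path)
    return {node: sorted(paths) for node, paths in per_node.items()}


def rsync_command(node, list_file, snapshot_dir, dest_dir):
    """Build (but do not run) one rsync invocation covering every listed path."""
    return [
        "rsync",
        "-a",
        "-r",  # -a does not imply recursion when --files-from is used
        f"--files-from={list_file}",
        f"{snapshot_dir}/",
        f"{node}:{dest_dir}/",
    ]


# Hypothetical routing table: each repository goes to one or more nodes.
routes = [
    ("a/repo1.git", "git-node-1"),
    ("a/repo1.git", "git-node-2"),
    ("b/repo2.git", "git-node-1"),
    ("c/repo3.git", "git-node-2"),
]

plan = group_by_node(routes)
commands = [
    rsync_command(
        node,
        f"/tmp/{node}.list",            # hypothetical file listing this node's paths
        "/data/snapshot/repositories",  # hypothetical snapshot directory
        "/data/repositories",           # hypothetical destination directory
    )
    for node in sorted(plan)
]

for command in commands:
    print(" ".join(command))
```

With four (repository, node) pairs this yields two rsync commands rather than four, which is where the savings would come from when per-transfer overhead dominates.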

Some simple benchmarks restoring a snapshot with 1183 repositories (13 GiB):

* Using backup-utils 2.6.1

```
real    20m34.923s
user    5m1.888s
sys     2m39.983s
```

* Using the new implementation

```
real    9m0.368s
user    2m46.912s
sys     1m18.746s
```

The old implementation restores roughly 1 repository per second, so backup
snapshots with a large number of repositories, restored over a fast network,
will benefit the most from this change.

Here's the time it takes to restore 8K repositories (~800 MiB), all of them
very similar in size (100K, with a single README file added):

* Using backup-utils 2.6.1

```
real    111m45.370s
user    7m54.829s
sys     6m38.247s
```

* Using the new implementation

```
real    6m20.087s
user    0m16.509s
sys     1m42.616s
```
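
For anyone who wants the numbers side by side, here is a small sketch that converts the `real` timings above into repositories per second and a speedup factor. It assumes "8K" means 8000 repositories; the helper names are made up for this example.

```python
import re


def seconds(real):
    """Convert a `time`-style duration such as '20m34.923s' to seconds."""
    minutes, secs = re.fullmatch(r"(\d+)m([\d.]+)s", real).groups()
    return int(minutes) * 60 + float(secs)


def summarize(repos, old_real, new_real):
    """Return restore rates (repos/s) and the wall-clock speedup factor."""
    old_s, new_s = seconds(old_real), seconds(new_real)
    return {
        "old_rate": repos / old_s,
        "new_rate": repos / new_s,
        "speedup": old_s / new_s,
    }


# (repository count, backup-utils 2.6.1 real time, new implementation real time)
runs = {
    "1183 repos (13 GiB)": (1183, "20m34.923s", "9m0.368s"),
    "8000 repos (~800 MiB)": (8000, "111m45.370s", "6m20.087s"),  # 8K taken as 8000
}

results = {label: summarize(*args) for label, args in runs.items()}

for label, r in results.items():
    print(
        f"{label}: {r['old_rate']:.2f} -> {r['new_rate']:.2f} repos/s, "
        f"{r['speedup']:.1f}x faster"
    )
```

This works out to roughly a 2.3x speedup for the 13 GiB snapshot and roughly 17.6x for the many-small-repositories case, with the old code sitting near 1 repo/s in both.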

In clusters with more than 3 Git server nodes, backup-utils 2.6.1 also
restores every repository to all the available Git server nodes. Only three
copies of a Git repository are necessary, so this patch also fixes that,
speeding up restores and reducing disk usage.
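
To put a number on the disk usage point, here is a back-of-the-envelope sketch. The 5-node cluster is a made-up example, the function names are hypothetical, and the arithmetic ignores any compression or filesystem-level differences between nodes.

```python
REQUIRED_COPIES = 3  # copies of each repository that are actually needed


def copies_restored(git_nodes, restore_to_all):
    """Number of copies of each repository written during a restore."""
    if restore_to_all:
        return git_nodes
    return min(git_nodes, REQUIRED_COPIES)


def total_written_gib(snapshot_gib, git_nodes, restore_to_all):
    """Total data written across the cluster for one snapshot."""
    return snapshot_gib * copies_restored(git_nodes, restore_to_all)


# Hypothetical cluster: 5 Git server nodes, restoring the 13 GiB snapshot.
old_total = total_written_gib(13, git_nodes=5, restore_to_all=True)
new_total = total_written_gib(13, git_nodes=5, restore_to_all=False)

print(f"restore to all nodes: {old_total} GiB written")
print(f"restore three copies: {new_total} GiB written")
```

In this made-up case the cluster writes 39 GiB instead of 65 GiB, and the gap grows with every Git server node beyond the third.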

/cc @github/backup-utils
@rubiojr rubiojr merged commit 1deb1da into master Jun 21, 2016
@rubiojr rubiojr deleted the repos-restore-speedup branch June 21, 2016 20:50

rubiojr commented Jun 22, 2016

Forgot to mention that this needs GitHub Enterprise 2.6.4; otherwise it'll automatically use the old (slower) restore code.
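
The fallback can be pictured as a simple version gate. This is only a sketch of the idea, not how backup-utils actually detects the appliance version: the function names and strategy labels are invented here, and it assumes releases later than 2.6.4 also qualify.

```python
MIN_FAST_RESTORE = (2, 6, 4)  # first GitHub Enterprise release with the new path


def parse_version(text):
    """Turn a dotted version such as '2.6.4' into a comparable tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def restore_strategy(appliance_version):
    """Pick the restore code path for the given appliance version."""
    if parse_version(appliance_version) >= MIN_FAST_RESTORE:
        return "per-node"        # new: one rsync per cluster node
    return "per-repository"      # old: one rsync per Git repository


for version in ("2.6.3", "2.6.4", "2.10.0"):
    print(version, "->", restore_strategy(version))
```

Comparing integer tuples rather than strings matters here, since a plain string comparison would sort "2.10.0" before "2.6.4".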

@rubiojr rubiojr mentioned this pull request Jun 22, 2016
rubiojr added a commit that referenced this pull request Jun 22, 2016
A patch release that includes performance improvements for cluster
restores, bug fixes and other improvements:

* git-hooks fixes #231
* Cluster: speedup repositories restore #232
* Cluster: restore Git over SSH keys #230
* Benchmark restores #219
dooleydevin pushed a commit that referenced this pull request Nov 10, 2022