ELF: Switch to parallelSort for RELR relocations. #138370
Created using spr 1.3.6-beta.1
@llvm/pr-subscribers-lld-elf
Author: Peter Collingbourne (pcc)
Changes
For firefox-x64 one of the more time consuming parts
Full diff: https://github.com/llvm/llvm-project/pull/138370.diff
1 Files Affected:
diff --git a/lld/ELF/SyntheticSections.cpp b/lld/ELF/SyntheticSections.cpp
index 2531227cb99b7..eceb297dbfc0d 100644
--- a/lld/ELF/SyntheticSections.cpp
+++ b/lld/ELF/SyntheticSections.cpp
@@ -2111,7 +2111,7 @@ template <class ELFT> bool RelrSection<ELFT>::updateAllocSize(Ctx &ctx) {
   std::unique_ptr<uint64_t[]> offsets(new uint64_t[relocs.size()]);
   for (auto [i, r] : llvm::enumerate(relocs))
     offsets[i] = r.getOffset();
-  llvm::sort(offsets.get(), offsets.get() + relocs.size());
+  llvm::parallelSort(offsets.get(), offsets.get() + relocs.size());

   // For each leading relocation, find following ones that can be folded
   // as a bitmap and fold them.
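The trailing comment in this hunk hints at why the offsets must be sorted at all: RELR packs runs of nearby relocations into bitmap words, which only works when the offsets are visited in ascending order. The following is a minimal standalone sketch of that folding step, assuming a 64-bit target. It is not lld's code: `encodeRelr` is a hypothetical name, and `std::sort` stands in for `llvm::parallelSort` so the example builds without LLVM headers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of lld): encode relocation offsets as RELR
// words for a 64-bit target.
std::vector<uint64_t> encodeRelr(std::vector<uint64_t> offsets) {
  constexpr uint64_t wordSize = 8;             // bytes per relocated word
  constexpr uint64_t nBits = wordSize * 8 - 1; // 63 payload bits per bitmap

  // Stand-in for llvm::parallelSort; the PR swaps the sort at this point.
  std::sort(offsets.begin(), offsets.end());

  std::vector<uint64_t> out;
  for (size_t i = 0, e = offsets.size(); i != e;) {
    // A leading entry is a plain address; its low bit must be clear so the
    // loader can tell it apart from a bitmap entry.
    assert(offsets[i] % 2 == 0 && "RELR address entries must be even");
    out.push_back(offsets[i]);
    uint64_t base = offsets[i] + wordSize;
    ++i;

    // Fold the following offsets into as many bitmap words as possible.
    for (;;) {
      uint64_t bitmap = 0;
      for (; i != e; ++i) {
        uint64_t d = offsets[i] - base; // wraps to a huge value if below base
        if (d >= nBits * wordSize || d % wordSize != 0)
          break;
        bitmap |= uint64_t(1) << (d / wordSize);
      }
      if (bitmap == 0)
        break;
      out.push_back((bitmap << 1) | 1); // low bit 1 marks a bitmap entry
      base += nBits * wordSize;
    }
  }
  return out;
}
```

With this sketch, `encodeRelr({0x2000, 0x1010, 0x1000, 0x1008})` yields three words: the address `0x1000`, the bitmap `0x7` covering `0x1008` and `0x1010`, and the address `0x2000`. The sort dominates the cost for large inputs, since the folding loop is a single linear pass.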
(Still on a trip with limited computer access) We call Using However, updateAllocSize should be fine, as it’s called within finalizeAddressDependentContent without nested parallelism requirements.
For firefox-x64 one of the more time-consuming parts
of finalizeSections() was the call to llvm::sort in
RelrSection::updateAllocSize(). Switching that to use parallelSort
yielded the following improvement on firefox-x64 with ldflags -S on
an Apple M2 Ultra: