From d4f2ad630333cd2510ef7e9d0cbdae6790238330 Mon Sep 17 00:00:00 2001 From: Nils Wireklint Date: Tue, 16 Sep 2025 14:40:47 +0200 Subject: [PATCH 1/6] Doc: revamp the terminology: we create the toprepo --- doc/terminology.md | 183 ++++++++++++++++++++------------------------- 1 file changed, 83 insertions(+), 100 deletions(-) diff --git a/doc/terminology.md b/doc/terminology.md index 6528271..463492b 100644 --- a/doc/terminology.md +++ b/doc/terminology.md @@ -1,7 +1,9 @@ # Terminology overview -This describes the terms involved in using `git-toprepo`, the tool, -to emulate a monorepo for a toprepo and its submodules. +This describes the terms involved in using the `git-toprepo` tool +to create a _toprepo_ for a _superrepo_ and its _submodules_. +this _combines_ the history of all _repositories_ +into one _emulated monorepo_. ## Terms @@ -9,34 +11,32 @@ to emulate a monorepo for a toprepo and its submodules. a _repository_. May be local or on a remote server. **git submodule**: A core `git` concept, -a _submodule_ is a _repository_ with a child-parent relation ship to another. +a _submodule_ is a _repository_ with a child-parent relationship to another. **regular submodule**: A core `git` concept, a regular _submodule_ that is entirely managed through `git-submodule` etc. -**filtered submodule**: A `git-toprepo` concept, -a _submodule_ that has been assimilated into one combined history in the filtered _monorepo_. +**assimilated submodule**: A `git-toprepo` concept, +a _submodule_ that has been assimilated into the _combined_ history in the _toprepo_. **superrepo**: Emergent from core git concepts, the parent _repository_ to a _submodule_. It may be a _submodule_ to another _superrepo_. **git-toprepo**: The tool itself. -`git-toprepo` filters a _toprepo_ +`git-toprepo` combines a _repository_ and some of its _submodules_ -into a _monorepo_ (emulated). -Takes care to push filtered _submodules_ to their remote server. +into a _toprepo_, an _emulated monorepo_. 
+Takes care to push _assimilated submodules_ to their remote server. -**toprepo**: A _repository_ with _submodules_. -This is the main development _repository_ for a developer. -the _toprepo_ is the root level _superrepo_ -in a potential hierarchy of multiple levels of _submodules_. +**rootrepo**: Emergent from core git concepts, +a _repository_ that is not a _submodule_ to another _repository_. +This is the main development _repository_ for a developer, +it often has _submodules_. -It may either be checked out with **regular** `git-submodule init --recursive` -or with `git-toprepo` to create a _monorepo_. -If it is checked out with `git-toprepo` -some _soubmodules_ may not be filtered into the _monorepo_, -then those must be manipulated with `git-submodule` as in the first case. +It may either be checked out with _regular submodules_: +`git-submodule init --recursive` +or as a _toprepo_ with `git-toprepo`. **monorepo**: A _repository_ with all the code, it does not typically have _submodules_. @@ -47,23 +47,20 @@ of first party code. Gives unparalleled reproducibility and understanding of the full product. -Throughout `git-toprepo`'s code and documentation -_monorepo_ is often used to refer to an _emulated monorepo_, for conciseness. - **pure monorepo**: A commonly sought concept, such a _repository_ does not have _submodules_ at all. There is just one _repository_ on the remote `git` server. This realizes the full value of a _monorepo_, but has no clear _access control_. -**emulated monorepo**: A client side construct -that emulates a _monorepo_ for developer +**toprepo**: A client-side construct +that _emulates_ a _monorepo_ for developers but still tracks code as _submodules_ with their own remote git _repositories_. This is created by `git-toprepo`. As a performance optimization a _monorepo_ created by `git-toprepo` -may still have _submodules_ though, -if the user does not want to assimilate all _submodules_. 
+may still have _regular submodules_ though, +if the user does not want to combine all _submodules_. **submodule access control**: One can easily apply access control to individual _submodules_ by restricting access to their git _repositories_. @@ -71,32 +68,29 @@ Such access control is not possible for different directories in a _pure monorep **commit**: A core `git` concept. -**monocommit**: A `git-toprepo` concept, -a commit in the _emulated monorepo_ for the _toprepo_. -May consist of multiple _commits_ in multiple _filtered submodules_. +**topcommit**: A `git-toprepo` concept, +a commit in the _toprepo_. `git-toprepo` shines when a developer wants to make one change across two _submodules_ -and can track that as one _supercommit_ +and can track that as one _topcommit_ -- one _commit_ in the _emulated monorepo_ that consists of one _commit_ in each of the two _submodules_. Those are meant to be merged together -through compatible CI systems that allow _shared gating_ between _repositories_. +through compatible CI systems that allow _shared gating_ between the constituent _repositories_. **shared gating**: A CI system concept. CI systems like `Gerrit` allows an organization to merge code to multiple _repositories_ atomically if all tests passes. -This allows us to emaulate a _monorepo_ and have a shared gate. -`Gerrit` uses [superproject subscription] for this +This allows the _toprepo_ to _emulate_ a _monorepo_ and have a shared gate. +`Gerrit` uses [superproject subscription] for this. [superproject subscription]: https://gerrit-review.googlesource.com/Documentation/user-submodules.html ### Verbs -**filter**: `git-toprepo` filters the history of one _toprepo_ and its _regular submodules_ -into an _emulated monorepo_ with a combined history for all the _toprepo_ itself and its _filtered submodules_. - -**combined**: `git-toprepo` has _combined_ the history into an _emulated monorepo_ with combined history. 
+**combine**: `git-toprepo` combines the history of one _rootrepo_ and (some of) its _submodules_ +into _toprepo_ with a combined history for code in the _rootrepo_ itself and its _assimilated submodules_. -**manage**: `git-toprepo` manages a git _toprepo_ and has _expanded_ the history into an _emulated monorepo_. +**assimilate**: `git-toprepo` has _assimilated_ a _submodule_ into the _combined_ history. ### Technical details @@ -111,113 +105,102 @@ to make it easy to create custom tools for `git`. ## Examples -### Initialization: The toprepo may be a monorepo +### Initialization: Create a toprepo for a rootrepo -The configuration of a _monorepo_ is often managed in the _toprepo_ and is already checked in. +A _rootrepo_ can be initialized to become a _toprepo_ with `git-toprepo`. +The configuration of the _toprepo_ is often managed in the _rootrepo_ and is already checked in. -Short-form initialization of a _monorepo_. +Short-form initialization of a _toprepo_. ``` -$ monorepo $ git toprepo clone ssh://gerrit.example/toprepo.git monorepo -$ cd monorepo -monorepo $ # This is a monorepo. +$ toprepo $ git toprepo clone ssh://gerrit.example/rootrepo.git toprepo +$ cd toprepo +toprepo $ # This is a toprepo. ``` - - - - - - - - - However, the code can also be checked out with regular git _submodules_. ``` -$ git clone ssh://gerrit.example/toprepo.git -$ cd toprepo -toprepo $ git submodule init --recursive -toprepo $ # This is not a monorepo +$ git clone ssh://gerrit.example/rootrepo.git +$ cd rootrepo +rootrepo $ git submodule init --recursive +rootrepo $ # This is not a toprepo. ``` -### Initialization: Some submodules are not filtered in +### Initialization: Some submodules are not assimilated Now imagine that the _toprepo_ has one _submodule_ with a long and weird history, it may be binary data that takes a lot of space and is not relevant to the developer. -Then it is often **not filtered** into the _emulated monorepo_. 
+Then it is often not _assimilated_ into the _toprepo_. -_monorepo_: +_toprepo_: ``` -$ monorepo $ git toprepo clone ssh://gerrit.example/toprepo.git monorepo -$ cd monorepo -monorepo $ # This is a monorepo. -monorepo $ git submodule status +$ toprepo $ git toprepo clone ssh://gerrit.example/rootrepo.git toprepo +$ cd toprepo +toprepo $ # This is a toprepo. +toprepo $ git submodule status -4e04771fcf658500987d0be5a9a63f8e77d5e386 binary_data_module ``` -regular _toprepo_: +regular _rootrepo_: ``` -$ git clone ssh://gerrit.example/toprepo.git -$ cd toprepo -toprepo $ git submodule status +$ git clone ssh://gerrit.example/rootrepo.git +$ cd rootrepo +rootrepo $ git submodule status -4e04771fcf658500987d0be5a9a63f8e77d5e386 binary_data_module -661c1b2d568693e3b6b631ae66f6872b194674f1 source_code_module ``` -### Pushing: git-toprepo pushes filtered submodules to their servers +### Pushing: git-toprepo pushes assimilated submodules to their servers `git-toprepo` shines when a developer wants to make one change across two _submodules_ -in one _supercommit_. +in one _topcommit_. ``` -monorepo $ # modify one/file and two/file -monorepo $ git add one/file two/file; git commit -monorepo $ git-toprepo push HEAD:refs/for/main +toprepo $ # modify one/file and two/file +toprepo $ git add one/file two/file; git commit +toprepo $ git-toprepo push HEAD:refs/for/main ``` -This pushes the two paths inside the _monorepo_ to their constituent +This pushes the two paths inside the _toprepo_ to their constituent _repositories_ on the git server (gerrit.example/one.git and gerrit.example/two.git). 
-The regular workflow with submodules, however, is more involved +The regular workflow with _submodules_, however, is more involved ``` -toprepo $ # modify one/file and two/file -toprepo $ git -C one add file; git commit -toprepo $ git -C two add file; git commit -toprepo $ git -C one push HEAD:refs/for/main -toprepo $ git -C two push HEAD:refs/for/main -# As you use Gerrit's superproject subscription, you would not need a toprepo commit: -# toprepo $ git add one two; git commit -# toprepo $ git push HEAD:refs/for/main +rootrepo $ # modify one/file and two/file +rootrepo $ git -C one add file; git commit +rootrepo $ git -C two add file; git commit +rootrepo $ git -C one push HEAD:refs/for/main +rootrepo $ git -C two push HEAD:refs/for/main +# As you use Gerrit's superproject subscription, you would not need a rootrepo commit: +# rootrepo $ git add one two; git commit +# rootrepo $ git push HEAD:refs/for/main ``` -First the two _submodules_ are handled separately -then the _toprepo_ must also bump its _submodule_ pointers to the new commits within them. - > [!NOTE] -> Though committing inside _regular submodules_ in a _monorepo_ is rare. -> If a _submodule_'s history is not relevant to _filter_ into the combined history +> Though committing inside _regular submodules_ in a _toprepo_ is rare. +> If a _submodule_'s history is not relevant to _combine_ into the _combined_ history > it is unlikely that developers need to modify the code and make changes. ### Rebasing: git-toprepo gives a shared history that is easy to work with -With `git-toprepo`, rebasing _commits_ in any of the _filtered submodules_ +With `git-toprepo`, rebasing _commits_ in any of the _assimilated submodules_ is as easy as working in a single _repository_. 
``` -monorepo $ git-toprepo fetch origin -monorepo $ git rebase -i origin/main +toprepo $ git-toprepo fetch origin +toprepo $ git rebase -i origin/main ``` -However when using _regular submodules_ in an _unmanaged_ _toprepo_ +However when using _regular submodules_ in an _repository_ one needs to automate the workflow within individual _submodules_. ``` -toprepo $ git fetch origin -toprepo $ git rebase -i origin/main -toprepo $ submod_commit_hash=$(git ls-files --stage -- one | cut -d' ' -f2) -toprepo $ git -C one rebase -i "$submod_commit_hash" -toprepo $ submod_commit_hash=$(git ls-files --stage -- two | cut -d' ' -f2) -toprepo $ git -C two rebase -i "$submod_commit_hash" +rootrepo $ git fetch origin +rootrepo $ git rebase -i origin/main +rootrepo $ submod_commit_hash=$(git ls-files --stage -- one | cut -d' ' -f2) +rootrepo $ git -C one rebase -i "$submod_commit_hash" +rootrepo $ submod_commit_hash=$(git ls-files --stage -- two | cut -d' ' -f2) +rootrepo $ git -C two rebase -i "$submod_commit_hash" ``` In the example, two _submodules_ does not look too bad at the face of it, @@ -225,13 +208,13 @@ but note that the rebasing is not synchronized between the _submodules_. Therefore, building and testing the code after resolving a merge conflict, which may have only occurred in one _submodule_, is not trivial. -### Pushing: Push all submodules of an emulated monorepo +### Pushing: Push all submodules of a toprepo -As an _emulated monorepo_ may not have _expanded_ all _submodules_ into the combined history +As a _toprepo_ may not have _combined_ all _submodules_ into the history some _submodules_ are left as _regular submodules_. 
So to always push changes to all _submodules_ the following invocation is needed: ``` -monorepo $ git-toprepo push HEAD:refs/for/main -monorepo $ git submodule for each push HEAD:refs/for/main +toprepo $ git-toprepo push HEAD:refs/for/main +toprepo $ git submodule for each push HEAD:refs/for/main ``` From 71c8b5688293b78a82cb65c7bc2a85c424963788 Mon Sep 17 00:00:00 2001 From: Nils Wireklint Date: Tue, 16 Sep 2025 15:20:50 +0200 Subject: [PATCH 2/6] doc: revamp the terminology: avoid rootrepo --- doc/terminology.md | 44 +++++++++++++++++++++++--------------------- 1 file changed, 23 insertions(+), 21 deletions(-) diff --git a/doc/terminology.md b/doc/terminology.md index 463492b..a6b805d 100644 --- a/doc/terminology.md +++ b/doc/terminology.md @@ -29,15 +29,6 @@ and some of its _submodules_ into a _toprepo_, an _emulated monorepo_. Takes care to push _assimilated submodules_ to their remote server. -**rootrepo**: Emergent from core git concepts, -a _repository_ that is not a _submodule_ to another _repository_. -This is the main development _repository_ for a developer, -it often has _submodules_. - -It may either be checked out with _regular submodules_: -`git-submodule init --recursive` -or as a _toprepo_ with `git-toprepo`. - **monorepo**: A _repository_ with all the code, it does not typically have _submodules_. This makes it easy to make changes across different components @@ -87,8 +78,8 @@ This allows the _toprepo_ to _emulate_ a _monorepo_ and have a shared gate. ### Verbs -**combine**: `git-toprepo` combines the history of one _rootrepo_ and (some of) its _submodules_ -into _toprepo_ with a combined history for code in the _rootrepo_ itself and its _assimilated submodules_. +**combine**: `git-toprepo` combines the history of one _superrepo_ and (some of) its _submodules_ +into _toprepo_ with a combined history for code in the _superrepo_ itself and its _assimilated submodules_. 
**assimilate**: `git-toprepo` has _assimilated_ a _submodule_ into the _combined_ history. @@ -103,26 +94,37 @@ For power users and _repository_ maintainers there are a few overlapping concept `git` runs external subcommands like `git-` as `git ` to make it easy to create custom tools for `git`. +### Technical terms in the code + +**rootrepo**: Emergent from core git concepts, +a _repository_ that is not a _submodule_ to another _repository_. +This is the main development _repository_ for a developer, +it often has _submodules_. + +It may either be checked out with _regular submodules_: +`git-submodule init --recursive` +or as a _toprepo_ with `git-toprepo`. + ## Examples -### Initialization: Create a toprepo for a rootrepo +### Initialization: Create a toprepo for a repository -A _rootrepo_ can be initialized to become a _toprepo_ with `git-toprepo`. -The configuration of the _toprepo_ is often managed in the _rootrepo_ and is already checked in. +A _repository_ can be initialized to become a _toprepo_ with `git-toprepo`. +The configuration of the _toprepo_ is often managed in the _repository_ and is already checked in. Short-form initialization of a _toprepo_. ``` -$ toprepo $ git toprepo clone ssh://gerrit.example/rootrepo.git toprepo +$ toprepo $ git toprepo clone ssh://gerrit.example/substrate.git toprepo $ cd toprepo toprepo $ # This is a toprepo. ``` However, the code can also be checked out with regular git _submodules_. ``` -$ git clone ssh://gerrit.example/rootrepo.git -$ cd rootrepo -rootrepo $ git submodule init --recursive -rootrepo $ # This is not a toprepo. +$ git clone ssh://gerrit.example/substrate.git +$ cd substrate +substrate $ git submodule init --recursive +substrate $ # This is not a toprepo. ``` ### Initialization: Some submodules are not assimilated @@ -133,14 +135,14 @@ Then it is often not _assimilated_ into the _toprepo_. 
_toprepo_: ``` -$ toprepo $ git toprepo clone ssh://gerrit.example/rootrepo.git toprepo +$ toprepo $ git toprepo clone ssh://gerrit.example/substrate.git toprepo $ cd toprepo toprepo $ # This is a toprepo. toprepo $ git submodule status -4e04771fcf658500987d0be5a9a63f8e77d5e386 binary_data_module ``` -regular _rootrepo_: +regular _repository_: ``` $ git clone ssh://gerrit.example/rootrepo.git $ cd rootrepo From b9ad522b77f5cedc789d458087d704d0bfca9600 Mon Sep 17 00:00:00 2001 From: Nils Wireklint Date: Tue, 16 Sep 2025 15:36:41 +0200 Subject: [PATCH 3/6] doc: revamp the terminology: though there is merit to discussing the underlying root repo --- doc/terminology.md | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/doc/terminology.md b/doc/terminology.md index a6b805d..534d2a1 100644 --- a/doc/terminology.md +++ b/doc/terminology.md @@ -105,6 +105,11 @@ It may either be checked out with _regular submodules_: `git-submodule init --recursive` or as a _toprepo_ with `git-toprepo`. +**rootcommit**: Commits in the _rootrepo_'s remote git server +they are part of _topcommit_s. +These are fetched in `git-toprepo fetch` +these are formed when pushing new work with `git-toprepo push`. + ## Examples ### Initialization: Create a toprepo for a repository @@ -220,3 +225,16 @@ So to always push changes to all _submodules_ the following invocation is needed toprepo $ git-toprepo push HEAD:refs/for/main toprepo $ git submodule for each push HEAD:refs/for/main ``` +### Combination algorithm: + +This birefly outlines the _combination_ algorithm that creates the _toprepo_. +To further contextualize the pieces and their relationships. + +#### Fetch a rootrepo commit and create a topcommit + +`git-toprepo fetch` first fetches the _regular_ _commit_ (_rootcommit_) for the _rootrepo_ itself +`git fetch ...`. +Then finds any _submodules_ that are bumped through Gerrit's _superproject subscription_ +and fetches their _regular_ _commits_. 
+All the _regular_ _commits_ in the _rootrepo_ and the _assimilated submodules_ +are _combined_ into one _topcommit_. From 028e3acebea56e91d2ddd6ac550315139439d58c Mon Sep 17 00:00:00 2001 From: Nils Wireklint Date: Thu, 18 Sep 2025 11:51:21 +0200 Subject: [PATCH 4/6] Doc: terminology: revamp to the most precise wording --- doc/terminology.md | 183 +++++++++++++++++++++++++-------------------- 1 file changed, 103 insertions(+), 80 deletions(-) diff --git a/doc/terminology.md b/doc/terminology.md index 534d2a1..81c18c7 100644 --- a/doc/terminology.md +++ b/doc/terminology.md @@ -1,9 +1,8 @@ # Terminology overview This describes the terms involved in using the `git-toprepo` tool -to create a _toprepo_ for a _superrepo_ and its _submodules_. -this _combines_ the history of all _repositories_ -into one _emulated monorepo_. +to create an _emulated monorepo_ for a _toprepo_ and its _submodules_. +this _combines_ the history of all _repositories_. ## Terms @@ -23,9 +22,21 @@ a _submodule_ that has been assimilated into the _combined_ history in the _topr the parent _repository_ to a _submodule_. It may be a _submodule_ to another _superrepo_. +**toprepo**: A regular _repository_ with special configuration and purpose. +It is meant to be used together with `git-toprepo` to _combine_ its _submodules_ +to an _emulated monorepo_. +This is generally configured by the organization +but the user may have her own configuration for personal preferences. +There is generally only one such repo +so it is often described in definite form: "the _toprepo_". + +It can also be checked out with _regular submodules_: +`git-submodule init --recursive` +but it is not the preferred development workflow. + **git-toprepo**: The tool itself. `git-toprepo` combines a _repository_ -and some of its _submodules_ +and (a choice of) its _submodules_ into a _toprepo_, an _emulated monorepo_. Takes care to push _assimilated submodules_ to their remote server. 
@@ -44,14 +55,15 @@ There is just one _repository_ on the remote `git` server. This realizes the full value of a _monorepo_, but has no clear _access control_. -**toprepo**: A client-side construct -that _emulates_ a _monorepo_ for developers -but still tracks code as _submodules_ with their own remote git _repositories_. -This is created by `git-toprepo`. +**emulated monorepo**: A client-side construct +that _emulates_ a _monorepo_ for a _toprepo_. +The developer sees a joint history of all _submodules_ and can create _monocommits_ +that span multiple _submodules_ and push/fetch them with `git-toprepo`. +The tool keeps track of the _assimilated submodules_ with their own remote git _repositories_. As a performance optimization a _monorepo_ created by `git-toprepo` may still have _regular submodules_ though, -if the user does not want to combine all _submodules_. +if the user does not want to _combine_ all _submodules_. **submodule access control**: One can easily apply access control to individual _submodules_ by restricting access to their git _repositories_. @@ -59,11 +71,11 @@ Such access control is not possible for different directories in a _pure monorep **commit**: A core `git` concept. -**topcommit**: A `git-toprepo` concept, -a commit in the _toprepo_. +**monocommit**: A `git-toprepo` concept, +a commit in the _emulated monorepo_. `git-toprepo` shines when a developer wants to make one change across two _submodules_ -and can track that as one _topcommit_ +and can track that as one _monocommit_ -- one _commit_ in the _emulated monorepo_ that consists of one _commit_ in each of the two _submodules_. Those are meant to be merged together through compatible CI systems that allow _shared gating_ between the constituent _repositories_. @@ -71,17 +83,22 @@ through compatible CI systems that allow _shared gating_ between the constituent **shared gating**: A CI system concept. 
CI systems like `Gerrit` allows an organization to merge code to multiple _repositories_ atomically if all tests passes. -This allows the _toprepo_ to _emulate_ a _monorepo_ and have a shared gate. +This allows the shared gating of the constituent _submodules_. +So the merged history is always compatible with an _emulated monorepo_, +there are no race conditions between different _repository_ gates. `Gerrit` uses [superproject subscription] for this. [superproject subscription]: https://gerrit-review.googlesource.com/Documentation/user-submodules.html ### Verbs -**combine**: `git-toprepo` combines the history of one _superrepo_ and (some of) its _submodules_ -into _toprepo_ with a combined history for code in the _superrepo_ itself and its _assimilated submodules_. +**combine**: `git-toprepo` _combines_ the history of one _toprepo_ and (some of) its _submodules_ +into an _emulated monorepo_ with a _combined_ history for code in the _toprepo_ itself and its _assimilated submodules_. -**assimilate**: `git-toprepo` has _assimilated_ a _submodule_ into the _combined_ history. +**assimilate**: `git-toprepo` has _assimilated_ a _submodule_ into the _combined_ _emulated monorepo_ history. + +**expand**: The _toprepo_ has been expanded to an _emulated monorepo_. +This verb is not used often but avoids the mention of _submodules_. ### Technical details @@ -96,62 +113,62 @@ to make it easy to create custom tools for `git`. ### Technical terms in the code -**rootrepo**: Emergent from core git concepts, -a _repository_ that is not a _submodule_ to another _repository_. -This is the main development _repository_ for a developer, -it often has _submodules_. - -It may either be checked out with _regular submodules_: -`git-submodule init --recursive` -or as a _toprepo_ with `git-toprepo`. - -**rootcommit**: Commits in the _rootrepo_'s remote git server -they are part of _topcommit_s. +**topcommit**: Commits in the _toprepo_'s own remote git server. 
These are fetched in `git-toprepo fetch` -these are formed when pushing new work with `git-toprepo push`. +these are also formed when pushing new work with `git-toprepo push` +if changes were made to the underlying _toprepo_, +symmetric with _regular commits_ for the constituent _submodules_ +that are pushed to the _submodules_' remote git servers. + +**monorepo**: In the code we use "_monorepo_" as short-hand notation instead of +"_emulated monorepo_". As the code has no use in a "_pure monorepo_" context. +So the brevity is placed over preciseness of the term within the code. ## Examples -### Initialization: Create a toprepo for a repository +### Initialization: expand the toprepo to an emulated monorepo -A _repository_ can be initialized to become a _toprepo_ with `git-toprepo`. -The configuration of the _toprepo_ is often managed in the _repository_ and is already checked in. +The _toprepo_ can be initialized to an _emulated monorepo_ with `git-toprepo`. +The configuration of the _emulated monorepo_ +is often managed in the _toprepo_ itself and is already checked in. -Short-form initialization of a _toprepo_. +Short-form initialization of the _emulated monorepo_. ``` -$ toprepo $ git toprepo clone ssh://gerrit.example/substrate.git toprepo -$ cd toprepo -toprepo $ # This is a toprepo. +$ git toprepo clone ssh://gerrit.example/toprepo.git emulated-monorepo +$ cd emulated-monorepo +emulated-monorepo $ # This is an emulated monorepo. ``` However, the code can also be checked out with regular git _submodules_. ``` -$ git clone ssh://gerrit.example/substrate.git -$ cd substrate -substrate $ git submodule init --recursive -substrate $ # This is not a toprepo. +$ git clone ssh://gerrit.example/toprepo.git +$ cd toprepo +toprepo $ git submodule init --recursive +toprepo $ # This is not an emulated monorepo. 
``` ### Initialization: Some submodules are not assimilated Now imagine that the _toprepo_ has one _submodule_ with a long and weird history, it may be binary data that takes a lot of space and is not relevant to the developer. -Then it is often not _assimilated_ into the _toprepo_. +Then it is often not _assimilated_ into the _emulated monorepo_. -_toprepo_: +_emulated monorepo_: ``` -$ toprepo $ git toprepo clone ssh://gerrit.example/substrate.git toprepo -$ cd toprepo -toprepo $ # This is a toprepo. -toprepo $ git submodule status +$ git toprepo clone ssh://gerrit.example/toprepo.git emulated-monorepo +$ cd emulated-monorepo +emulated-monorepo $ # This is an emulated monorepo. +emulated-monorepo $ git submodule status -4e04771fcf658500987d0be5a9a63f8e77d5e386 binary_data_module ``` regular _repository_: ``` -$ git clone ssh://gerrit.example/rootrepo.git -$ cd rootrepo -rootrepo $ git submodule status +$ git clone ssh://gerrit.example/toprepo.git +$ cd toprepo +toprepo $ git submodule init --recursive +toprepo $ # This is not an emulated monorepo. +toprepo $ git submodule status -4e04771fcf658500987d0be5a9a63f8e77d5e386 binary_data_module -661c1b2d568693e3b6b631ae66f6872b194674f1 source_code_module ``` @@ -162,30 +179,31 @@ rootrepo $ git submodule status in one _topcommit_. ``` -toprepo $ # modify one/file and two/file -toprepo $ git add one/file two/file; git commit -toprepo $ git-toprepo push HEAD:refs/for/main +emulated-monorepo $ # modify one/file and two/file +emulated-monorepo $ git add one/file two/file; git commit +emulated-monorepo $ git-toprepo push HEAD:refs/for/main ``` -This pushes the two paths inside the _toprepo_ to their constituent +This pushes the two paths inside the _emulated monorepo_ to their constituent _repositories_ on the git server (gerrit.example/one.git and gerrit.example/two.git). 
The regular workflow with _submodules_, however, is more involved ``` -rootrepo $ # modify one/file and two/file -rootrepo $ git -C one add file; git commit -rootrepo $ git -C two add file; git commit -rootrepo $ git -C one push HEAD:refs/for/main -rootrepo $ git -C two push HEAD:refs/for/main -# As you use Gerrit's superproject subscription, you would not need a rootrepo commit: -# rootrepo $ git add one two; git commit -# rootrepo $ git push HEAD:refs/for/main +toprepo $ # modify one/file and two/file +toprepo $ git -C one add file; git -C one commit +toprepo $ git -C two add file; git -C two commit +toprepo $ git -C one push HEAD:refs/for/main +toprepo $ git -C two push HEAD:refs/for/main +# Because you use Gerrit's superproject subscription (otherwise git-toprepo does not work), +# you would not need a toprepo commit: +# toprepo $ git add one two; git commit +# toprepo $ git push HEAD:refs/for/main ``` > [!NOTE] -> Though committing inside _regular submodules_ in a _toprepo_ is rare. -> If a _submodule_'s history is not relevant to _combine_ into the _combined_ history +> Committing inside _regular submodules_ in an _emulated monorepo_ is rare, though. +> If a _submodule_'s history is not relevant enough to _assimilate_ into the _combined_ history, > it is unlikely that developers need to modify the code and make changes. ### Rebasing: git-toprepo gives a shared history that is easy to work with @@ -194,20 +212,20 @@ With `git-toprepo`, rebasing _commits_ in any of the _assimilated submodules_ is as easy as working in a single _repository_. ``` -toprepo $ git-toprepo fetch origin -toprepo $ git rebase -i origin/main +emulated-monorepo $ git-toprepo fetch origin +emulated-monorepo $ git rebase -i origin/main ``` However when using _regular submodules_ in an _repository_ one needs to automate the workflow within individual _submodules_. 
``` -rootrepo $ git fetch origin -rootrepo $ git rebase -i origin/main -rootrepo $ submod_commit_hash=$(git ls-files --stage -- one | cut -d' ' -f2) -rootrepo $ git -C one rebase -i "$submod_commit_hash" -rootrepo $ submod_commit_hash=$(git ls-files --stage -- two | cut -d' ' -f2) -rootrepo $ git -C two rebase -i "$submod_commit_hash" +toprepo $ git fetch origin +toprepo $ git rebase -i origin/main +toprepo $ submod_commit_hash="$(git ls-files --stage -- one | cut -d' ' -f2)" +toprepo $ git -C one rebase -i "$submod_commit_hash" +toprepo $ submod_commit_hash="$(git ls-files --stage -- two | cut -d' ' -f2)" +toprepo $ git -C two rebase -i "$submod_commit_hash" ``` In the example, two _submodules_ does not look too bad at the face of it, @@ -217,24 +235,29 @@ which may have only occurred in one _submodule_, is not trivial. ### Pushing: Push all submodules of a toprepo -As a _toprepo_ may not have _combined_ all _submodules_ into the history +As an _emulated monorepo_ may not have _combined_ all _submodules_ into the history, some _submodules_ are left as _regular submodules_. So to always push changes to all _submodules_ the following invocation is needed: ``` -toprepo $ git-toprepo push HEAD:refs/for/main -toprepo $ git submodule for each push HEAD:refs/for/main +emulated-monorepo $ git-toprepo push HEAD:refs/for/main +emulated-monorepo $ git submodule foreach git push HEAD:refs/for/main ``` + +> [!NOTE] +> Recall that committing inside _regular submodules_ in an _emulated monorepo_ is rare. + ### Combination algorithm: -This birefly outlines the _combination_ algorithm that creates the _toprepo_. -To further contextualize the pieces and their relationships. +This briefly outlines the _combination_ algorithm +that creates the _shared history_ of the _emulated monorepo_ +to further contextualize the pieces and their relationships. 
-#### Fetch a rootrepo commit and create a topcommit +#### Fetch a toprepo commit and create a monocommit -`git-toprepo fetch` first fetches the _regular_ _commit_ (_rootcommit_) for the _rootrepo_ itself +`git-toprepo fetch` first fetches the _regular commit_ (_topcommit_) for the _toprepo_ itself `git fetch ...`. Then finds any _submodules_ that are bumped through Gerrit's _superproject subscription_ -and fetches their _regular_ _commits_. -All the _regular_ _commits_ in the _rootrepo_ and the _assimilated submodules_ -are _combined_ into one _topcommit_. +and fetches their _regular commits_. +All the _regular commits_ in the _toprepo_ and the _assimilated submodules_ +are _combined_ into one _monocommit_. From b8ddadf1c5ec745ea93db2719ba1d229fdc4e126 Mon Sep 17 00:00:00 2001 From: Nils Wireklint Date: Thu, 18 Sep 2025 12:06:24 +0200 Subject: [PATCH 5/6] Doc: terminology: assemble is more precise in some contexts --- doc/terminology.md | 26 +++++++++++++++----------- 1 file changed, 15 insertions(+), 11 deletions(-) diff --git a/doc/terminology.md b/doc/terminology.md index 81c18c7..94d3a64 100644 --- a/doc/terminology.md +++ b/doc/terminology.md @@ -1,7 +1,7 @@ # Terminology overview This describes the terms involved in using the `git-toprepo` tool -to create an _emulated monorepo_ for a _toprepo_ and its _submodules_. +to _assemble_ an _emulated monorepo_ for a _toprepo_ and its _submodules_. this _combines_ the history of all _repositories_. ## Terms @@ -16,28 +16,29 @@ a _submodule_ is a _repository_ with a child-parent relationship to another. a regular _submodule_ that is entirely managed through `git-submodule` etc. **assimilated submodule**: A `git-toprepo` concept, -a _submodule_ that has been assimilated into the _combined_ history in the _toprepo_. +a _submodule_ that has been _assembled_ into the _combined_ history in the _toprepo_. **superrepo**: Emergent from core git concepts, the parent _repository_ to a _submodule_. 
 It may be a _submodule_ to another _superrepo_.
 
 **toprepo**: A regular _repository_ with special configuration and purpose.
-It is meant to be used together with `git-toprepo` to _combine_ its _submodules_
+It is meant to be used together with `git-toprepo` to _assemble_ itself and its _submodules_
 to an _emulated monorepo_.
 This is generally configured by the organization
 but the user may have her own configuration for personal preferences.
-There is generally only one such repo
-so it is often described in definite form: "the _toprepo_".
 
 It can also be checked out with _regular submodules_:
 `git-submodule init --recursive`
 but it is not the preferred development workflow.
 
+There is generally only one such _repository_
+so it is often described in definite form: "the _toprepo_".
+
 **git-toprepo**: The tool itself.
-`git-toprepo` combines a _repository_
+`git-toprepo` _assembles_ a _toprepo_
 and (a choice of) its _submodules_
-into a _toprepo_, an _emulated monorepo_.
+into an _emulated monorepo_.
 Takes care to push _assimilated submodules_ to their remote server.
 
 **monorepo**: A _repository_ with all the code,
@@ -55,7 +56,7 @@ There is just one _repository_ on the remote `git` server.
 This realizes the full value of a _monorepo_,
 but has no clear _access control_.
 
-**emulated monorepo**: A client-side construct
+**emulated monorepo**: A client-side construct, _assembly_,
 that _emulates_ a _monorepo_ for a _toprepo_.
 The developer sees a joint history of all _submodules_ and can create _monocommits_
 that span multiple _submodules_ and push/fetch them with `git-toprepo`.
@@ -76,7 +77,7 @@ a commit in the _emulated monorepo_.
 
 `git-toprepo` shines when a developer wants to make one change across two _submodules_
 and can track that as one _monocommit_
--- one _commit_ in the _emulated monorepo_ that consists of one _commit_ in each of the two _submodules_.
+-- one _commit_ in the _emulated monorepo_ that consists of one _commit_ in each of the two _assimilated submodules_.
 Those are meant to be merged together through compatible CI systems
 that allow _shared gating_ between the constituent _repositories_.
 
@@ -92,6 +93,8 @@ there are no race conditions between different _repository_ gates.
 
 ### Verbs
 
+**assemble**: `git-toprepo` _assembles_ a _toprepo_ and its _submodules_ into an _emulated monorepo_.
+
 **combine**: `git-toprepo` _combines_ the history of one _toprepo_ and (some of) its _submodules_
 into an _emulated monorepo_ with a _combined_ history for code in the _toprepo_ itself and its _assimilated submodules_.
 
@@ -99,6 +102,7 @@ into an _emulated monorepo_ with a _combined_ history for code in the _toprepo_
 
 **expand**: The _toprepo_ has been expanded to an _emulated monorepo_.
 This verb is not used often but avoids the mention of _submodules_.
+
 
 ### Technical details
 
@@ -113,12 +117,12 @@ to make it easy to create custom tools for `git`.
 
 ### Technical terms in the code
 
-**topcommit**: Commits in the _toprepo_'s own remote git server.
+**topcommit**: Commits in the _toprepo_'s own remote git _repository_.
 These are fetched in `git-toprepo fetch`
 these are also formed when pushing new work with `git-toprepo push`
 if changes were made to the underlying _toprepo_,
 symmetric with _regular commits_ for the constituent _submodules_
-that are pushed to the _submodules_' remote git servers.
+that are pushed to the _submodules_' remote git _repository_.
 
 **monorepo**: In the code we use "_monorepo_" as short-hand notation instead of
 "_emulated monorepo_". As the code has no use in a "_pure monorepo_" context.
From 51e1d5c6ad58163999c930bf2e9f1fd98a8a1a78 Mon Sep 17 00:00:00 2001
From: Fredrik Medley
Date: Mon, 22 Sep 2025 23:12:33 +0200
Subject: [PATCH 6/6] frme edits

---
 doc/terminology.md | 233 ++++++++++++++++++++-------------
 1 file changed, 134 insertions(+), 99 deletions(-)

diff --git a/doc/terminology.md b/doc/terminology.md
index 94d3a64..fe2483b 100644
--- a/doc/terminology.md
+++ b/doc/terminology.md
@@ -1,8 +1,8 @@
 # Terminology overview
 
 This describes the terms involved in using the `git-toprepo` tool
-to _assemble_ an _emulated monorepo_ for a _toprepo_ and its _submodules_.
-this _combines_ the history of all _repositories_.
+to _expand_ the _submodules_ of a _toprepo_ into _git-toprepo emulated monorepo_.
+This _combines_ the history of all the _repositories_.
 
 ## Terms
 
@@ -10,25 +10,25 @@ this _combines_ the history of all _repositories_.
 a _repository_. May be local or on a remote server.
 
 **git submodule**: A core `git` concept,
-a _submodule_ is a _repository_ with a child-parent relationship to another.
+a _submodule_ is a _repository_ with a child-parent relationship to another _repository_.
 
 **regular submodule**: A core `git` concept,
 a regular _submodule_ that is entirely managed through `git-submodule` etc.
 
-**assimilated submodule**: A `git-toprepo` concept,
-a _submodule_ that has been _assembled_ into the _combined_ history in the _toprepo_.
+**expanded submodule**: A `git-toprepo` concept,
+a _submodule_ that has been _expanded_ into the _combined_ history in the _toprepo_.
 
 **superrepo**: Emergent from core git concepts,
 the parent _repository_ to a _submodule_.
 It may be a _submodule_ to another _superrepo_.
 
 **toprepo**: A regular _repository_ with special configuration and purpose.
-It is meant to be used together with `git-toprepo` to _assemble_ itself and its _submodules_
-to an _emulated monorepo_.
+It is meant to be used together with `git-toprepo` to _expand_ its _submodules_
+to a _git-toprepo emulated monorepo_.
 This is generally configured by the organization
 but the user may have her own configuration for personal preferences.
 
-It can also be checked out with _regular submodules_:
+A _toprepo_ can also be checked out with _regular submodules_:
 `git-submodule init --recursive`
 but it is not the preferred development workflow.
 
@@ -36,16 +36,16 @@ There is generally only one such _repository_
 so it is often described in definite form: "the _toprepo_".
 
 **git-toprepo**: The tool itself.
-`git-toprepo` _assembles_ a _toprepo_
-and (a choice of) its _submodules_
-into an _emulated monorepo_.
-Takes care to push _assimilated submodules_ to their remote server.
+`git-toprepo` _expands_ (a choice of) _submodules_ of a _toprepo_
+into a _git-toprepo emulated monorepo_,
+the git histories are _combined_.
+It takes care of pushing _expanded submodules_ to their respective remote server.
 
 **monorepo**: A _repository_ with all the code,
 it does not typically have _submodules_.
 This makes it easy to make changes across different components
 with a regular `git` workflow,
-generally without _submodule_ bumps and binary deliveries/integration
+generally without _submodule bumps_ and binary deliveries/integration
 of first party code.
 Gives unparalleled reproducibility and understanding
 of the full product.
@@ -56,117 +56,126 @@ There is just one _repository_ on the remote `git` server.
 This realizes the full value of a _monorepo_,
 but has no clear _access control_.
 
-**emulated monorepo**: A client-side construct, _assembly_,
+**git-toprepo emulated monorepo**: A client-side construct,
 that _emulates_ a _monorepo_ for a _toprepo_.
-The developer sees a joint history of all _submodules_ and can create _monocommits_
+The developer sees a joint history of all _submodules_ and can create _mono commits_
 that span multiple _submodules_ and push/fetch them with `git-toprepo`.
 The tool keeps track of the _assimilated submodules_ with their own remote git _repositories_.
 
-As a performance optimization a _monorepo_ created by `git-toprepo`
+As a performance optimization, an _emulated monorepo_ created by `git-toprepo`
 may still have _regular submodules_ though,
-if the user does not want to _combine_ all _submodules_.
+if the user does not want to _expand_ all _submodules_.
 
 **submodule access control**: One can easily apply access control to individual _submodules_
 by restricting access to their git _repositories_.
-Such access control is not possible for different directories in a _pure monorepo_.
+Such access control is not possible for different directories in a _pure git monorepo_.
 
 **commit**: A core `git` concept.
 
-**monocommit**: A `git-toprepo` concept,
-a commit in the _emulated monorepo_.
+**combined commit**: A `git-toprepo` concept,
+a commit in the _git-toprepo emulated monorepo_.
 
 `git-toprepo` shines when a developer wants to make one change across two _submodules_
-and can track that as one _monocommit_
--- one _commit_ in the _emulated monorepo_ that consists of one _commit_ in each of the two _assimilated submodules_.
+and can track that as one _combined commit_,
+i.e. one _commit_ in the _emulated monorepo_ that consists of one _commit_ in each of the two _assimilated submodules_.
 Those are meant to be merged together through compatible CI systems
 that allow _shared gating_ between the constituent _repositories_.
 
-**shared gating**: A CI system concept.
-CI systems like `Gerrit` allows an organization to merge code to multiple _repositories_
-atomically if all tests passes.
-This allows the shared gating of the constituent _submodules_.
-So the merged history is always compatible with an _emulated monorepo_,
-there are no race conditions between different _repository_ gates.
-`Gerrit` uses [superproject subscription] for this.
+**submodule bump**: A core `git` concept,
+a change in the _super repository_
+of which _commit id_ is wanted for a specific _submodule_ path.
 
+**shared gating**: A CI system concept.
+CI systems like [`Zuul CI`] allows an organization to merge code to multiple _repositories_
+if all tests passes, atomically if the git server supports it.
+By bumping the submodules accordingly,
+e.g. by using [superproject subscription] in `Gerrit`,
+the history of the constituent _repositories_
+can be _recombined_ to the same _combined history_ graph that was pushed.
+
+[`Zuul CI`]: https://zuulci.org/
 [superproject subscription]: https://gerrit-review.googlesource.com/Documentation/user-submodules.html
 
 ### Verbs
 
-**assemble**: `git-toprepo` _assembles_ a _toprepo_ and its _submodules_ into an _emulated monorepo_.
+**combine**: `git-toprepo` _combines_ the history of one _toprepo_ and (a choice of) its _submodules_
+into an _emulated monorepo_ with a _combined_ history for code in the _toprepo_ itself and its _expanded submodules_.
 
-**combine**: `git-toprepo` _combines_ the history of one _toprepo_ and (some of) its _submodules_
-into an _emulated monorepo_ with a _combined_ history for code in the _toprepo_ itself and its _assimilated submodules_.
+**expand**: The content of the _submodules_ is expanded into an _emulated monorepo_.
 
-**assimilate**: `git-toprepo` has _assimilated_ a _submodule_ into the _combined_ _emulated monorepo_ history.
-
-**expand**: The _toprepo_ has been expanded to an _emulated monorepo_.
-This verb is not used often but avoids the mention of _submodules_.
-
+**integrate**: A _submodule_ is integrated into the _git-toprepo emulated monorepo_
+when the history is _combined_ and the content is (optionally) _expanded_.
 
 ### Technical details
 
 For power users and _repository_ maintainers
 there are a few overlapping concepts.
 
-**toprepo**: The `git-config` namespace for select `git-toprepo` settings that are configured through `git`.
+**git-config**: The `toprepo` namespace is used for the `git-toprepo` settings
+that are configured through `git`.
 
-**toprepo**: The `git` subcommand that runs `git-toprepo`.
-`git` runs external subcommands like `git-` as `git `
-to make it easy to create custom tools for `git`.
+**git toprepo**: Git looks for external executables to run subcommands.
+Calling `git toprepo` makes `git` execute `git-toprepo`.
 
 ### Technical terms in the code
 
-**topcommit**: Commits in the _toprepo_'s own remote git _repository_.
-These are fetched in `git-toprepo fetch`
-these are also formed when pushing new work with `git-toprepo push`
-if changes were made to the underlying _toprepo_,
-symmetric with _regular commits_ for the constituent _submodules_
-that are pushed to the _submodules_' remote git _repository_.
+**top commit**: Commits in the _toprepo_,
+the remote _repository_ that has been cloned.
+These are fetched using `git-toprepo fetch` (or `git fetch`) and
+formed when pushing new work with `git-toprepo push`,
+if changes were made to the underlying _toprepo_.
+
+**mono repo**: In the code, "_monorepo_" is used as short-hand notation instead of
+"_git-toprepo emulated monorepo_" or "_combined repo_". As the code has no use in
+a "_pure monorepo_" context, the brevity is placed over preciseness of the
+term within the code.
 
-**monorepo**: In the code we use "_monorepo_" as short-hand notation instead of
-"_emulated monorepo_". As the code has no use in a "_pure monorepo_" context.
-So the brevity is placed over preciseness of the term within the code.
+**mono commit**: In the code, "_mono commit_" is used as short-hand notation
+instead of "_git-toprepo emulated monorepo commit_" or "_combined commit_", for
+symmetry reasons. As the code has no use in a "_pure monorepo_" context, the
+brevity is placed over preciseness of the term within the code.
 
 ## Examples
 
-### Initialization: expand the toprepo to an emulated monorepo
+### Initialization: Expand the toprepo into an emulated monorepo
 
-The _toprepo_ can be initialized to an _emulated monorepo_ with `git-toprepo`.
-The configuration of the _emulated monorepo_
+The _toprepo_ can be initialized to a _git-toprepo emulated monorepo_
+with `git-toprepo`.
+The configuration for `git-toprepo`
 is often managed in the _toprepo_ itself and is already checked in.
 
-Short-form initialization of the _emulated monorepo_.
+Short-form initialization of a _git-toprepo emulated monorepo_.
 ```
 $ git toprepo clone ssh://gerrit.example/toprepo.git emulated-monorepo
 $ cd emulated-monorepo
-emulated-monorepo $ # This is an emulated monorepo.
+emulated-monorepo $ # This is a git-toprepo emulated monorepo.
 ```
 
-However, the code can also be checked out with regular git _submodules_.
+However, the code can also be checked out with regular git _submodules_
+to create the same directory structure.
 ```
 $ git clone ssh://gerrit.example/toprepo.git
 $ cd toprepo
 toprepo $ git submodule init --recursive
-toprepo $ # This is not an emulated monorepo.
+toprepo $ # This is not a git-toprepo emulated monorepo.
 ```
 
-### Initialization: Some submodules are not assimilated
+### Initialization: Some submodules are not expanded
 
-Now imagine that the _toprepo_ has one _submodule_ with a long and weird history,
+Imagine that the _toprepo_ has one _submodule_ with a long and weird history,
 it may be binary data that takes a lot of space and is not relevant to the developer.
-Then it is often not _assimilated_ into the _emulated monorepo_.
+Then it might be preferred to not _expanding_ it into the _combined repo_.
 
-_emulated monorepo_:
+_git-toprepo emulated monorepo_:
 ```
 $ git toprepo clone ssh://gerrit.example/toprepo.git emulated-monorepo
 $ cd emulated-monorepo
 emulated-monorepo $ # This is an emulated monorepo.
-monorepo $ git submodule status
+emulated-monorepo $ git submodule status
 -4e04771fcf658500987d0be5a9a63f8e77d5e386 binary_data_module
 ```
 
-regular _repository_:
+Regular _repository_:
 ```
 $ git clone ssh://gerrit.example/toprepo.git
 $ cd toprepo
@@ -177,42 +186,51 @@ toprepo $ git submodule status
 -661c1b2d568693e3b6b631ae66f6872b194674f1 source_code_module
 ```
 
-### Pushing: git-toprepo pushes assimilated submodules to their servers
+### Pushing: git-toprepo pushes combined repositories to their respective servers
 
 `git-toprepo` shines when a developer wants to make one change across two _submodules_
-in one _topcommit_.
+in one _top commit_.
 
 ```
 emulated-monorepo $ # modify one/file and two/file
-emulated-monorepo $ git add one/file two/file; git commit
-emulated-monorepo $ git-toprepo push HEAD:refs/for/main
+emulated-monorepo $ git add one/file two/file
+emulated-monorepo $ git commit
+emulated-monorepo $ git-toprepo push HEAD:refs/heads/main
 ```
 
 This pushes the two paths inside the _emulated monorepo_ to their constituent
-_repositories_ on the git server (gerrit.example/one.git and gerrit.example/two.git).
+_repositories_ on the git server (`gerrit.example/one.git` and `gerrit.example/two.git`).
 
 The regular workflow with _submodules_, however, is more involved
 
 ```
 toprepo $ # modify one/file and two/file
-toprepo $ git -C one add file; git commit
-toprepo $ git -C two add file; git commit
-toprepo $ git -C one push HEAD:refs/for/main
-toprepo $ git -C two push HEAD:refs/for/main
-# Because you use Gerrit's superproject subscription (otherwise git-toprepo does not work),
-# you would not need a toprepo commit:
-# toprepo $ git add one two; git commit
-# toprepo $ git push HEAD:refs/for/main
+toprepo $ git -C one add file
+toprepo $ git -C one commit
+toprepo $ git -C one push HEAD:refs/heads/main
+toprepo $ git -C two add file
+toprepo $ git -C two commit
+toprepo $ git -C two push HEAD:refs/heads/main
+```
+
+In both cases, the submodule pointers in the branch `main` in the _toprepo_
+need to be updated to point at the latest commits in the submodules.
+This can be done using e.g. Gerrit's superproject subscription or manually.
+
+```
+toprepo $ git add one two
+toprepo $ git commit
+toprepo $ git push HEAD:refs/heads/main
 ```
 
 > [!NOTE]
-> Though committing inside _regular submodules_ in an _emulated monorepo_ is rare.
-> If a _submodule_'s history is not relevant to _assimilate_ into the _combined_ history
+> Though committing inside _regular submodules_ in a _git-toprepo emulated monorepo_ is rare,
+> if a _submodule_'s history is not relevant in the _combined_ history
 > it is unlikely that developers need to modify the code and make changes.
 
-### Rebasing: git-toprepo gives a shared history that is easy to work with
+### Rebasing: git-toprepo gives a combined history that is easy to work with
 
-With `git-toprepo`, rebasing _commits_ in any of the _assimilated submodules_
+With `git-toprepo`, rebasing _commits_ in any of the _expanded submodules_
 is as easy as working in a single _repository_.
 
 ```
@@ -226,10 +244,9 @@ one needs to automate the workflow within individual _submodules_.
 ```
 toprepo $ git fetch origin
 toprepo $ git rebase -i origin/main
-toprepo $ submod_commit_hash="$(git ls-files --stage -- one | cut -d' ' -f2)"
-toprepo $ git -C one rebase -i "$submod_commit_hash"
-toprepo $ submod_commit_hash="$(git ls-files --stage -- two | cut -d' ' -f2)"
-toprepo $ git -C two rebase -i "$submod_commit_hash"
+toprepo $ git submodule foreach 'git -C "$sm_path" rebase -i origin/main'
+toprepo $ # On error, run 'git -C rebase --continue'
+toprepo $ # followed by the same git-submodule-foreach command again.
 ```
 
 In the example, two _submodules_ does not look too bad at the face of it,
@@ -239,29 +256,47 @@ which may have only occurred in one _submodule_, is not trivial.
 
 ### Pushing: Push all submodules of a toprepo
 
-As an _emulated monorepo_:_ may not have _combined_ all _submodules_ into the history
+As a _git-toprepo emulated monorepo_ may not have _combined_ all _submodules_ into the history
 some _submodules_ are left as _regular submodules_.
-So to always push changes to all _submodules_ the following invocation is needed:
+To always push changes to all _submodules_ the following invocation is needed:
 ```
-emulated-monorepo $ git-toprepo push HEAD:refs/for/main
-emulated-monorepo $ git submodule for each push HEAD:refs/for/main
+emulated-monorepo $ git-toprepo push HEAD:refs/heads/main
+emulated-monorepo $ git submodule foreach git push origin HEAD:refs/heads/main
 ```
 
 > [!NOTE]
-> Recall that committing inside _regular submodules_ in an _emulated monorepo_ is rare.
+> Recall that committing inside _regular submodules_ in a _git-toprepo emulated monorepo_ is rare.
 
 
-### Combination algorithm:
+## History combination algorithm
 
-This briefly outlines the _combination_ algorithm
-that creates the _shared history_ of the _emulated monorepo_
+This briefly outlines the algorithm
+that creates the _combined history_ of the _git-toprepo emulated monorepo_,
 to further contextualize the pieces and their relationships.
 
-#### Fetch a toprepo commit and create a monocommit
+### Fetch a toprepo commit and create a mono commit
+
+`git-toprepo fetch` first fetches the _regular commits_ for the _toprepo_ itself
+using (approximately) `git fetch origin +refs/heads/*:refs/namespaces/top/refs/remotes/origin/*`.
+
+The next phase is the load phase where for each submodule:
+
+1. All _top commits_ reachable from `refs/namespaces/top/refs/remotes/*`
+are loaded to look for _submodules_ and what _commit ids_ are referenced.
+1. All _regular commits_ reachable from `refs/namespaces//*` are loaded.
+1. If any of the _commit ids_ requested by the _super repository_ was not found,
+they are fetched using `git fetch +refs/heads/*:refs/namespaces//refs/remotes/origin/*`.
+1. All _regular commits_ reachable from `refs/namespaces//*` are
+checked for inner _submodules_ and what _commit ids_ are referenced.
+1. Step 2 then follows recursively.
+
+When all reachable commits have been loaded, the _submodules_ within the _toprepo_
+are _expanded_ and the history _combined_.
 
-`git-toprepo fetch` first fetches the _regular commit_ (_topcommit_) for the _toprepo_ itself
-`git fetch ...`.
-Then finds any _submodules_ that are bumped through Gerrit's _superproject subscription_
-and fetches their _regular commits_.
-All the _regular commits_ in the _rootrepo_ and the _assimilated submodules_
-are _combined_ into one _monocommit_.
+1. Iterate through all _top commits_ reachable from `refs/namespaces/top/refs/remotes/*`
+and start processing from the initial orphan _commits_.
+1. For each _regular commit_, look for _submodule bumps_ or changes in `.gitmodules`.
+1. _Expand_ each _submodule bump_ by replacing the _submodule_ git-link,
+that points out the _commit id_, with the corresponding tree content.
+1. Transfer of parents of each _submodule commits_ into the _combined commit_,
+by checking which _combined commits_ the parents were _expanded_ in.
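
The load phase in the patch above looks for _submodules_ and the _commit ids_ they reference. As a rough illustration only, and not `git-toprepo`'s actual implementation, the same information can be read from plain `git ls-tree` output, where git records a _submodule_ git-link as a tree entry with mode `160000` and type `commit`. The helper name `list_gitlinks` is hypothetical and introduced here just for the sketch.

```shell
# Hypothetical helper, not part of git-toprepo:
# read `git ls-tree -r <commit>` output on stdin and print
# "<commit id> <path>" for every submodule git-link (mode 160000).
list_gitlinks() {
    awk -F'\t' '{
        # Field 1 is "<mode> <type> <object id>", field 2 is the path.
        split($1, meta, " ")
        if (meta[1] == "160000") print meta[3], $2
    }'
}
```

Inside a checkout of the _toprepo_, `git ls-tree -r HEAD | list_gitlinks` would then print one line per _submodule_ path with the _commit id_ that the _top commit_ points at. Comparing that output for a commit and its parent is one simple way to spot a _submodule bump_.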