Upgrade link-git to use latest version of git-ref #792
`git-ref` was improved such that users are no longer forced to handle packed-refs, thanks to an auto-updated shared cache. It is still possible to access the previous (but now renamed) versions of the methods that take packed-refs as a parameter. Along with direct access to the latest cached packed-refs (lazy-loaded, auto-reloaded on modification), one can get snapshot-like access by reusing the same packed-refs across multiple calls. All this is in an effort to get closer to a unified API that can work similarly in a ref-table implementation. There is also a 'general' store in the works which can be either loose refs or a ref-table, but that one seems only relevant once ref-table is actually implemented. Signed-off-by: Sebastian Thiel <[email protected]>
The version output can carry additional suffixes if git is compiled with some, as here: https://github.com/git/git/blob/seen/help.c#L689:L689 Most notably, the default Apple binary reports something like `git version 2.30.1 (Apple Git-130)`. What stays the same is the `git version <version>` prefix, which the parser now relies upon. Signed-off-by: Sebastian Thiel <[email protected]>
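To illustrate the idea of relying only on the `git version <version>` prefix, here is a minimal sketch. The function name `parse_git_version` and its return type are assumptions made for this example and are not taken from link-git; it simply ignores whatever follows the version token.

```rust
/// Hypothetical helper (not link-git's actual parser): extracts
/// `(major, minor, patch)` from the output of `git --version`.
fn parse_git_version(output: &str) -> Option<(u32, u32, u32)> {
    // Only the `git version <version>` prefix is relied upon.
    let rest = output.trim().strip_prefix("git version ")?;
    // The first whitespace-delimited token is the version; suffixes such as
    // `(Apple Git-130)` are ignored.
    let version = rest.split_whitespace().next()?;
    let mut parts = version.split('.');
    let major: u32 = parts.next()?.parse().ok()?;
    let minor: u32 = parts.next()?.parse().ok()?;
    // Keep only the leading digits of the third component and default to 0
    // if it is missing or not numeric.
    let patch: u32 = parts
        .next()
        .map(|p| p.chars().take_while(|c| c.is_ascii_digit()).collect::<String>())
        .filter(|digits| !digits.is_empty())
        .and_then(|digits| digits.parse().ok())
        .unwrap_or(0);
    Some((major, minor, patch))
}
```

Anything after the third component, such as a platform-specific tail, is dropped on purpose, so the sketch only answers "which release is this".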
|
Maybe you can advise on the build failures on Windows, as these seem unrelated. Also, Linux (which worked previously) now fails like macOS with an error about the replication user missing. Please note that locally I see a couple of tests failing which CI on macOS doesn't reproduce. These failed before my edits as well. I am saying this in case these are legit and can be fixed, but I wouldn't know where to start. Here is the actual error of 2/3 of them: The smoke test also doesn't get data apparently but fails due to missing branches. Maybe it's all related to the |
Signed-off-by: Sebastian Thiel <[email protected]>
|
|
In case it matters, the racy code isn't used at all as the internal cache isn't used in the called methods. These are the originals, but renamed. Unfortunately I don't understand in which way it's racy, so would be glad if you had a hint for me so it can be fixed. From a locking perspective I don't see why it's racy. Thanks a lot. |
|
You need to take a file lock on packed-refs to guarantee atomicity.
|
To my knowledge, new packed-refs files are only ever moved into place (from the .lock file, which is written first, over the original file). Moves are considered atomic, which is why the file is never half-written when reading it. Is this the raciness you refer to? If not, I can imagine that the file could indeed change between the stat() call and the actual read/mmap, but git doesn't seem to care either, because no file lock is acquired for reading (even though they do skip the stat() call if they already hold the lock). What do you think? |
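The write-then-move scheme described above can be sketched with the standard library alone. The names `lock_path_for` and `replace_atomically` are made up for this illustration; this is neither git's nor git-ref's code, and it assumes the lock file and its target live on the same filesystem so that the rename replaces the target in one step.

```rust
use std::ffi::OsString;
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Hypothetical helper: `packed-refs` -> `packed-refs.lock`.
fn lock_path_for(path: &Path) -> PathBuf {
    let mut name: OsString = path.as_os_str().to_owned();
    name.push(".lock");
    PathBuf::from(name)
}

/// Writes `contents` to the lock file, then moves it over `path`.
/// Readers see either the old or the new file, never a partial one.
fn replace_atomically(path: &Path, contents: &[u8]) -> io::Result<()> {
    let lock_path = lock_path_for(path);
    // `create_new` fails if the lock file already exists, so only one
    // writer at a time gets past this point.
    let mut lock_file = OpenOptions::new()
        .write(true)
        .create_new(true)
        .open(&lock_path)?;
    let written = lock_file
        .write_all(contents)
        .and_then(|_| lock_file.sync_all());
    // Close the handle before renaming or cleaning up.
    drop(lock_file);
    if let Err(err) = written {
        // Best-effort cleanup so a failed write does not leave a stale lock.
        let _ = fs::remove_file(&lock_path);
        return Err(err);
    }
    fs::rename(&lock_path, path)
}
```

Note that only writers coordinate through the lock file here; readers open `path` directly without taking any lock, which is the behaviour discussed in this comment.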
|
the file could indeed change between the stat() call and the actual read/mmap
Exactly. We don't want to track the timestamp of the old file, or do we?
Either way, I don't think this is a good patch because we're already tracking
packed-refs on a different layer. So what we get here is a no-op containing a
lot of middle-digit version bumps, for which we would now need to review the
changelogs. If there's no actual reason to upgrade, I'd prefer to keep things
stable for a while.
|
Thanks for the clarification, I will close this PR and try the mailing list - it's my first time 😅. I note that there should be no updates in future unless they add something substantial, like
That's interesting. To my mind, putting down a lock for reading in this situation will reduce concurrency as it blocks (or cancels) writers, while the reader will still have made the correct decision: the file changed since it was last loaded, and even if it changes again after deciding it should be reloaded (or is deleted entirely), the result will be 'more recent' without blocking writers, which seems desirable. This is also why git doesn't put down a lock before reading a loose ref but relies entirely on atomic file moves. But I digress, see you on the ml. |
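The reader side of this argument can be sketched as follows. `Snapshot` and `refresh` are hypothetical names for this example, not git-ref's API. The modification time is recorded before the read, so if the file is replaced in between, the returned data is newer than its recorded timestamp and the next call simply reloads once more.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::SystemTime;

/// Hypothetical in-memory copy of a file plus the mtime seen before reading it.
#[derive(Debug, Clone, PartialEq)]
struct Snapshot {
    modified: SystemTime,
    data: Vec<u8>,
}

/// Returns the current snapshot without taking any file lock.
/// `Ok(None)` means the file does not exist (anymore).
fn refresh(path: &Path, current: Option<Snapshot>) -> io::Result<Option<Snapshot>> {
    let modified = match fs::metadata(path) {
        Ok(meta) => meta.modified()?,
        Err(err) if err.kind() == io::ErrorKind::NotFound => return Ok(None),
        Err(err) => return Err(err),
    };
    // Unchanged mtime: keep using what we already have.
    if let Some(snapshot) = current {
        if snapshot.modified == modified {
            return Ok(Some(snapshot));
        }
    }
    // The file may be replaced or deleted between the stat above and this
    // read; either way the result is at least as recent as the stat.
    match fs::read(path) {
        Ok(data) => Ok(Some(Snapshot { modified, data })),
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(err) => Err(err),
    }
}
```

One caveat of any mtime-based check is timestamp granularity: two writes within the same tick may be indistinguishable, which this sketch does not try to solve.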
|
Say I could be convinced that the timestamp doesn't matter: I notice you're
taking an upgradeable read lock to check the mtime.
The calling thread will be blocked until there are no more writers or other
upgradable reads which hold the lock.
Why is this ok?
|
The way I understand this, an Is this what you are implying, or is there something else I don't see? |
|
I don’t think your understanding is correct. From https://docs.rs/lock_api/0.4.5/lock_api/trait.RawRwLockUpgrade.html:
Also: https://docs.rs/parking_lot/0.11.2/src/parking_lot/raw_rwlock.rs.html#568 So, your code behaves as if every access is going through a mutex. |
I agree 😅. The 'fearless concurrency' part of Rust doesn't necessarily extend into concurrency primitives, which do still require a good amount of testing just to be sure the docs aren't read with Rust-goggles on. With that in mind I pulled out the docs for the methods that are actually called,
…and
…and none talk about deadlocks. Admittedly, I don't claim to understand the depth of the plumbing documentation and code provided and have no doubt that deadlocks may happen, but they are not mentioned anymore in the porcelain API that Putting it to the test, I have adjusted the Takeaways
Actions
Thank you for bearing with me on this one, thanks to you this sync-related performance issue won't get far. |
…cks (#263) Here is an explanation for this: radicle-dev/radicle-link#792 (comment) It turns out the upgradable locks are only great under the heaviest contention, where they can be a little faster.
|
Ya, "concurrency without fear" is a pick two statement when it comes to shared data :) I have punted on this myself, but I think you could do even better: The assumption is that the majority of reads will find So, the way to punt is to leave it to the caller to decide for how long they would be able to tolerate inconsistent reads. That's what
Iow, performing |
It's a lovely abstraction, and it's something people can emulate by obtaining a cached packed-refs buffer (or open a new one) to hand in as parameter to these This is a bit of a new direction for
I agree, and believe that there is no way around ArcSwap/Atomic (or basically the techniques seen in
Yes, agreed, I will put in the additional stat() check, as the current implementation would do extra work which would definitely cost more in highly concurrent scenarios. |
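A minimal sketch of what such a re-check could look like, using a plain `std::sync::RwLock` rather than an upgradable lock. This is an assumption-laden illustration and not the git-ref implementation: `Cache`, `get`, and the `stat`/`load` closures are hypothetical names, with `stat` standing in for the stat() call and `load` for reading packed-refs.

```rust
use std::sync::{Arc, RwLock};
use std::time::SystemTime;

/// Hypothetical shared cache keyed by the modification time of its source.
struct Cache<T> {
    state: RwLock<Option<(SystemTime, Arc<T>)>>,
}

impl<T> Cache<T> {
    fn new() -> Self {
        Cache { state: RwLock::new(None) }
    }

    /// Returns the cached value, reloading it if `stat` reports a new mtime.
    fn get(&self, stat: impl Fn() -> SystemTime, load: impl FnOnce() -> T) -> Arc<T> {
        let mtime = stat();
        // Fast path: a shared read lock, so concurrent readers do not
        // serialize on each other.
        {
            let guard = self.state.read().unwrap();
            if let Some((cached, value)) = &*guard {
                if *cached == mtime {
                    return Arc::clone(value);
                }
            }
        }
        // Slow path: exclusive lock, then stat again. Another thread may have
        // reloaded while we were waiting, in which case no extra work is done.
        let mut guard = self.state.write().unwrap();
        let mtime = stat();
        if let Some((cached, value)) = &*guard {
            if *cached == mtime {
                return Arc::clone(value);
            }
        }
        let value = Arc::new(load());
        *guard = Some((mtime, Arc::clone(&value)));
        value
    }
}
```

Handing out an `Arc` also gives callers the snapshot-like behaviour mentioned earlier: they can keep using the value they obtained for as long as they tolerate it being stale, while later calls pick up newer data.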