-
-
Notifications
You must be signed in to change notification settings - Fork 32.2k
gh-135239: simpler use of mutex in hashlib & co #135267
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that PyMutex_Lock
will release the GIL/detach the thread state anyway. You want _PyMutex_LockFlags(m, _Py_LOCK_DONT_DETACH)
if you're worried about GIL overhead.
…135239 # Conflicts: # Modules/_hashopenssl.c # Modules/blake2module.c
bb237c2
to
db57278
Compare
…135239 # Conflicts: # Modules/md5module.c
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
The benchmarks on an LTO+PGO build (it's actually slower on a DEBUG build but that's because of all the additional assertions) [with OpenSSL 3.5]: ### Large update (chunks size = 2049)
+-----------------------------------+---------+-----------------------+
| Benchmark | ref | new |
+===================================+=========+=======================+
| _hashlib.openssl_sha1.update | 12.4 us | 12.2 us: 1.02x faster |
+-----------------------------------+---------+-----------------------+
| _hashlib.openssl_sha1.digest | 126 ns | 123 ns: 1.02x faster |
+-----------------------------------+---------+-----------------------+
| _hashlib.openssl_sha224.update | 13.7 us | 13.4 us: 1.02x faster |
+-----------------------------------+---------+-----------------------+
| _hashlib.openssl_sha3_224.digest | 293 ns | 284 ns: 1.03x faster |
+-----------------------------------+---------+-----------------------+
| _hashlib.openssl_sha3_384.update | 62.3 us | 59.0 us: 1.06x faster |
+-----------------------------------+---------+-----------------------+
| _hashlib.openssl_sha3_512.update | 88.1 us | 85.0 us: 1.04x faster |
+-----------------------------------+---------+-----------------------+
| _hashlib.openssl_shake_128.update | 38.7 us | 37.1 us: 1.04x faster |
+-----------------------------------+---------+-----------------------+
| _hashlib.openssl_shake_256.update | 46.6 us | 45.8 us: 1.02x faster |
+-----------------------------------+---------+-----------------------+
| _sha1.sha1.update | 61.1 us | 59.8 us: 1.02x faster |
+-----------------------------------+---------+-----------------------+
| _sha2.sha224.update | 69.9 us | 66.3 us: 1.05x faster |
+-----------------------------------+---------+-----------------------+
| _sha2.sha224.digest | 273 ns | 266 ns: 1.03x faster |
+-----------------------------------+---------+-----------------------+
| _sha2.sha256.update | 71.0 us | 65.7 us: 1.08x faster |
+-----------------------------------+---------+-----------------------+
| _sha2.sha256.digest | 277 ns | 266 ns: 1.04x faster |
+-----------------------------------+---------+-----------------------+
| _sha2.sha384.digest | 373 ns | 354 ns: 1.05x faster |
+-----------------------------------+---------+-----------------------+
| _sha3.sha3_224.update | 56.9 us | 53.4 us: 1.06x faster |
+-----------------------------------+---------+-----------------------+
| _sha3.sha3_224.digest | 258 ns | 250 ns: 1.03x faster |
+-----------------------------------+---------+-----------------------+
| _sha3.sha3_256.update | 60.4 us | 56.5 us: 1.07x faster |
+-----------------------------------+---------+-----------------------+
| _sha3.sha3_384.update | 78.5 us | 74.3 us: 1.06x faster |
+-----------------------------------+---------+-----------------------+
| _sha3.sha3_384.digest | 260 ns | 252 ns: 1.03x faster |
+-----------------------------------+---------+-----------------------+
| _sha3.sha3_512.update | 112 us | 107 us: 1.04x faster |
+-----------------------------------+---------+-----------------------+
| _sha3.shake_128.update | 49.8 us | 46.3 us: 1.08x faster |
+-----------------------------------+---------+-----------------------+
| _sha3.shake_256.update | 59.7 us | 56.9 us: 1.05x faster |
+-----------------------------------+---------+-----------------------+
| _blake2.blake2s.update | 37.0 us | 36.2 us: 1.02x faster |
+-----------------------------------+---------+-----------------------+
| Geometric mean | (ref) | 1.02x faster |
+-----------------------------------+---------+-----------------------+ ### Small update (chunks size = 128)
+----------------------------------------+-----------+-----------------------+
| Benchmark | ref-small | new-small |
+========================================+===========+=======================+
| _hashlib.openssl_md5.update | 1.94 us | 1.85 us: 1.05x faster |
+----------------------------------------+-----------+-----------------------+
| _hashlib.openssl_sha1.update | 961 ns | 943 ns: 1.02x faster |
+----------------------------------------+-----------+-----------------------+
| _hashlib.openssl_sha1.digest | 114 ns | 124 ns: 1.09x slower |
+----------------------------------------+-----------+-----------------------+
| _hashlib.openssl_sha224.digest | 113 ns | 118 ns: 1.04x slower |
+----------------------------------------+-----------+-----------------------+
| _hashlib.openssl_sha3_224.digest | 277 ns | 284 ns: 1.03x slower |
+----------------------------------------+-----------+-----------------------+
| _hashlib.openssl_sha3_256.update | 2.96 us | 3.20 us: 1.08x slower |
+----------------------------------------+-----------+-----------------------+
| _hashlib.openssl_sha3_384.update | 4.05 us | 3.86 us: 1.05x faster |
+----------------------------------------+-----------+-----------------------+
| _hashlib.openssl_shake_128.update | 2.56 us | 2.62 us: 1.02x slower |
+----------------------------------------+-----------+-----------------------+
| _hashlib.openssl_shake_256.update | 3.02 us | 3.12 us: 1.03x slower |
+----------------------------------------+-----------+-----------------------+
| _hashlib.openssl_shake_256.digest[512] | 804 ns | 826 ns: 1.03x slower |
+----------------------------------------+-----------+-----------------------+
| _sha2.sha224.update | 4.25 us | 4.34 us: 1.02x slower |
+----------------------------------------+-----------+-----------------------+
| _sha2.sha224.digest | 271 ns | 264 ns: 1.02x faster |
+----------------------------------------+-----------+-----------------------+
| _sha2.sha512.update | 3.00 us | 3.04 us: 1.02x slower |
+----------------------------------------+-----------+-----------------------+
| _sha2.sha512.digest | 355 ns | 372 ns: 1.05x slower |
+----------------------------------------+-----------+-----------------------+
| _sha3.sha3_224.update | 3.92 us | 3.61 us: 1.08x faster |
+----------------------------------------+-----------+-----------------------+
| _sha3.sha3_256.digest | 253 ns | 262 ns: 1.04x slower |
+----------------------------------------+-----------+-----------------------+
| _sha3.sha3_384.digest | 253 ns | 259 ns: 1.03x slower |
+----------------------------------------+-----------+-----------------------+
| _sha3.sha3_512.update | 6.78 us | 7.77 us: 1.15x slower |
+----------------------------------------+-----------+-----------------------+
| _sha3.sha3_512.digest | 251 ns | 256 ns: 1.02x slower |
+----------------------------------------+-----------+-----------------------+
| _sha3.shake_128.update | 3.12 us | 3.22 us: 1.03x slower |
+----------------------------------------+-----------+-----------------------+
| _sha3.shake_256.update | 3.76 us | 4.03 us: 1.07x slower |
+----------------------------------------+-----------+-----------------------+
| _blake2.blake2s.update | 2.45 us | 2.39 us: 1.03x faster |
+----------------------------------------+-----------+-----------------------+
| Geometric mean | (ref) | 1.01x slower |
+----------------------------------------+-----------+-----------------------+ Benchmark scriptimport importlib
import os
import pyperf
def update(hasher, blocks):
for block in blocks:
hasher.update(block)
def bench_update(runner, args, typename, module, function):
namespace = importlib.import_module(module)
hasher = getattr(namespace, function)()
blocks = [os.urandom(args.chunk_size) for _ in range(args.chunk_loop)]
runner.bench_func(f"{typename}.update", update, hasher, blocks)
def bench_digest(runner, args, typename, module, function, *, xof=False):
namespace = importlib.import_module(module)
hasher = getattr(namespace, function)(os.urandom(1024))
if xof:
n = args.xof_length
runner.bench_func(f"{typename}.digest[{n}]", hasher.digest, n)
else:
runner.bench_func(f"{typename}.digest", hasher.digest)
def bench_hash_function(runner, args, module, function, *, xof=False):
typename = f"{module}.{function}"
bench_update(runner, args, typename, module, function)
bench_digest(runner, args, typename, module, function, xof=xof)
def bench_openssl_functions(runner, args, functions, *, xof=False):
for function in functions:
function = "openssl_" + function.removeprefix("openssl_")
bench_hash_function(runner, args, "_hashlib", function, xof=xof)
if __name__ == "__main__":
runner = pyperf.Runner()
runner.argparser.add_argument("--chunk-loop", type=int, default=16)
runner.argparser.add_argument("--chunk-size", type=int, default=2049)
runner.argparser.add_argument("--xof-length", type=int, default=512)
args = runner.parse_args()
bench_openssl_functions(
runner, args, [
"md5",
"sha1",
"sha224", "sha256", "sha384", "sha512",
"sha3_224", "sha3_256", "sha3_384", "sha3_512",
], xof=False,
) # fmt: skip
bench_openssl_functions(runner, args, ["shake_128", "shake_256"], xof=True)
for module, functions, xof in [
("_md5", ["md5"], False),
("_sha1", ["sha1"], False),
("_sha2", ["sha224", "sha256", "sha384", "sha512"], False),
("_sha3", ["sha3_224", "sha3_256", "sha3_384", "sha3_512"], False),
("_sha3", ["shake_128", "shake_256"], True),
("_blake2", ["blake2b", "blake2s"], False),
]: # fmt: skip
for function in functions:
bench_hash_function(runner, args, module, function, xof=xof) |
@ZeroIntensity can you take a look at this PR now? TiA |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The locking looks good to me.
Arf, I should have actually run the build bots first. |
I've taken the liberty of normalizing code style. I'll do the same in other modules, (SHA and BLAKE2). That way, I'll never need to touch cosmetics again in crypto-modules. Well, if it's too much I can drop the last commit.