Thanks to visit codestin.com
Credit goes to github.com

Skip to content

gh-135239: simpler use of mutex in hashlib & co #135267

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 45 commits into from
Jun 22, 2025

Conversation

picnixz
Copy link
Member

@picnixz picnixz commented Jun 8, 2025

I've taken the liberty of normalizing code style. I'll do the same in other modules, (SHA and BLAKE2). That way, I'll never need to touch cosmetics again in crypto-modules. Well, if it's too much I can drop the last commit.

Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that PyMutex_Lock will release the GIL/detach the thread state anyway. You want _PyMutex_LockFlags(m, _Py_LOCK_DONT_DETACH) if you're worried about GIL overhead.

@picnixz picnixz marked this pull request as draft June 8, 2025 19:37
@picnixz picnixz requested a review from ZeroIntensity June 15, 2025 09:40
@picnixz picnixz marked this pull request as ready for review June 15, 2025 11:39
@picnixz picnixz marked this pull request as draft June 16, 2025 10:16
@picnixz picnixz force-pushed the perf/hashlib/mutex-135239 branch from bb237c2 to db57278 Compare June 16, 2025 10:35
@picnixz picnixz marked this pull request as draft June 21, 2025 12:36
@picnixz

This comment was marked as resolved.

@picnixz picnixz marked this pull request as ready for review June 21, 2025 13:46
@picnixz

This comment was marked as resolved.

@picnixz

This comment was marked as resolved.

@bedevere-bot

This comment was marked as resolved.

@picnixz

This comment was marked as resolved.

@bedevere-bot

This comment was marked as resolved.

@picnixz

This comment was marked as resolved.

@picnixz
Copy link
Member Author

picnixz commented Jun 22, 2025

The benchmarks on an LTO+PGO build (it's actually slower on a DEBUG build but that's because of all the additional assertions) [with OpenSSL 3.5]:

### Large update (chunks size = 2049)

+-----------------------------------+---------+-----------------------+
| Benchmark                         | ref     | new                   |
+===================================+=========+=======================+
| _hashlib.openssl_sha1.update      | 12.4 us | 12.2 us: 1.02x faster |
+-----------------------------------+---------+-----------------------+
| _hashlib.openssl_sha1.digest      | 126 ns  | 123 ns: 1.02x faster  |
+-----------------------------------+---------+-----------------------+
| _hashlib.openssl_sha224.update    | 13.7 us | 13.4 us: 1.02x faster |
+-----------------------------------+---------+-----------------------+
| _hashlib.openssl_sha3_224.digest  | 293 ns  | 284 ns: 1.03x faster  |
+-----------------------------------+---------+-----------------------+
| _hashlib.openssl_sha3_384.update  | 62.3 us | 59.0 us: 1.06x faster |
+-----------------------------------+---------+-----------------------+
| _hashlib.openssl_sha3_512.update  | 88.1 us | 85.0 us: 1.04x faster |
+-----------------------------------+---------+-----------------------+
| _hashlib.openssl_shake_128.update | 38.7 us | 37.1 us: 1.04x faster |
+-----------------------------------+---------+-----------------------+
| _hashlib.openssl_shake_256.update | 46.6 us | 45.8 us: 1.02x faster |
+-----------------------------------+---------+-----------------------+
| _sha1.sha1.update                 | 61.1 us | 59.8 us: 1.02x faster |
+-----------------------------------+---------+-----------------------+
| _sha2.sha224.update               | 69.9 us | 66.3 us: 1.05x faster |
+-----------------------------------+---------+-----------------------+
| _sha2.sha224.digest               | 273 ns  | 266 ns: 1.03x faster  |
+-----------------------------------+---------+-----------------------+
| _sha2.sha256.update               | 71.0 us | 65.7 us: 1.08x faster |
+-----------------------------------+---------+-----------------------+
| _sha2.sha256.digest               | 277 ns  | 266 ns: 1.04x faster  |
+-----------------------------------+---------+-----------------------+
| _sha2.sha384.digest               | 373 ns  | 354 ns: 1.05x faster  |
+-----------------------------------+---------+-----------------------+
| _sha3.sha3_224.update             | 56.9 us | 53.4 us: 1.06x faster |
+-----------------------------------+---------+-----------------------+
| _sha3.sha3_224.digest             | 258 ns  | 250 ns: 1.03x faster  |
+-----------------------------------+---------+-----------------------+
| _sha3.sha3_256.update             | 60.4 us | 56.5 us: 1.07x faster |
+-----------------------------------+---------+-----------------------+
| _sha3.sha3_384.update             | 78.5 us | 74.3 us: 1.06x faster |
+-----------------------------------+---------+-----------------------+
| _sha3.sha3_384.digest             | 260 ns  | 252 ns: 1.03x faster  |
+-----------------------------------+---------+-----------------------+
| _sha3.sha3_512.update             | 112 us  | 107 us: 1.04x faster  |
+-----------------------------------+---------+-----------------------+
| _sha3.shake_128.update            | 49.8 us | 46.3 us: 1.08x faster |
+-----------------------------------+---------+-----------------------+
| _sha3.shake_256.update            | 59.7 us | 56.9 us: 1.05x faster |
+-----------------------------------+---------+-----------------------+
| _blake2.blake2s.update            | 37.0 us | 36.2 us: 1.02x faster |
+-----------------------------------+---------+-----------------------+
| Geometric mean                    | (ref)   | 1.02x faster          |
+-----------------------------------+---------+-----------------------+
### Small update (chunks size = 128)

+----------------------------------------+-----------+-----------------------+
| Benchmark                              | ref-small | new-small             |
+========================================+===========+=======================+
| _hashlib.openssl_md5.update            | 1.94 us   | 1.85 us: 1.05x faster |
+----------------------------------------+-----------+-----------------------+
| _hashlib.openssl_sha1.update           | 961 ns    | 943 ns: 1.02x faster  |
+----------------------------------------+-----------+-----------------------+
| _hashlib.openssl_sha1.digest           | 114 ns    | 124 ns: 1.09x slower  |
+----------------------------------------+-----------+-----------------------+
| _hashlib.openssl_sha224.digest         | 113 ns    | 118 ns: 1.04x slower  |
+----------------------------------------+-----------+-----------------------+
| _hashlib.openssl_sha3_224.digest       | 277 ns    | 284 ns: 1.03x slower  |
+----------------------------------------+-----------+-----------------------+
| _hashlib.openssl_sha3_256.update       | 2.96 us   | 3.20 us: 1.08x slower |
+----------------------------------------+-----------+-----------------------+
| _hashlib.openssl_sha3_384.update       | 4.05 us   | 3.86 us: 1.05x faster |
+----------------------------------------+-----------+-----------------------+
| _hashlib.openssl_shake_128.update      | 2.56 us   | 2.62 us: 1.02x slower |
+----------------------------------------+-----------+-----------------------+
| _hashlib.openssl_shake_256.update      | 3.02 us   | 3.12 us: 1.03x slower |
+----------------------------------------+-----------+-----------------------+
| _hashlib.openssl_shake_256.digest[512] | 804 ns    | 826 ns: 1.03x slower  |
+----------------------------------------+-----------+-----------------------+
| _sha2.sha224.update                    | 4.25 us   | 4.34 us: 1.02x slower |
+----------------------------------------+-----------+-----------------------+
| _sha2.sha224.digest                    | 271 ns    | 264 ns: 1.02x faster  |
+----------------------------------------+-----------+-----------------------+
| _sha2.sha512.update                    | 3.00 us   | 3.04 us: 1.02x slower |
+----------------------------------------+-----------+-----------------------+
| _sha2.sha512.digest                    | 355 ns    | 372 ns: 1.05x slower  |
+----------------------------------------+-----------+-----------------------+
| _sha3.sha3_224.update                  | 3.92 us   | 3.61 us: 1.08x faster |
+----------------------------------------+-----------+-----------------------+
| _sha3.sha3_256.digest                  | 253 ns    | 262 ns: 1.04x slower  |
+----------------------------------------+-----------+-----------------------+
| _sha3.sha3_384.digest                  | 253 ns    | 259 ns: 1.03x slower  |
+----------------------------------------+-----------+-----------------------+
| _sha3.sha3_512.update                  | 6.78 us   | 7.77 us: 1.15x slower |
+----------------------------------------+-----------+-----------------------+
| _sha3.sha3_512.digest                  | 251 ns    | 256 ns: 1.02x slower  |
+----------------------------------------+-----------+-----------------------+
| _sha3.shake_128.update                 | 3.12 us   | 3.22 us: 1.03x slower |
+----------------------------------------+-----------+-----------------------+
| _sha3.shake_256.update                 | 3.76 us   | 4.03 us: 1.07x slower |
+----------------------------------------+-----------+-----------------------+
| _blake2.blake2s.update                 | 2.45 us   | 2.39 us: 1.03x faster |
+----------------------------------------+-----------+-----------------------+
| Geometric mean                         | (ref)     | 1.01x slower          |
+----------------------------------------+-----------+-----------------------+
Benchmark script
import importlib
import os

import pyperf

def update(hasher, blocks):
    for block in blocks:
        hasher.update(block)

def bench_update(runner, args, typename, module, function):
    namespace = importlib.import_module(module)
    hasher = getattr(namespace, function)()
    blocks = [os.urandom(args.chunk_size) for _ in range(args.chunk_loop)]
    runner.bench_func(f"{typename}.update", update, hasher, blocks)

def bench_digest(runner, args, typename, module, function, *, xof=False):
    namespace = importlib.import_module(module)
    hasher = getattr(namespace, function)(os.urandom(1024))
    if xof:
        n = args.xof_length
        runner.bench_func(f"{typename}.digest[{n}]", hasher.digest, n)
    else:
        runner.bench_func(f"{typename}.digest", hasher.digest)

def bench_hash_function(runner, args, module, function, *, xof=False):
    typename = f"{module}.{function}"
    bench_update(runner, args, typename, module, function)
    bench_digest(runner, args, typename, module, function, xof=xof)

def bench_openssl_functions(runner, args, functions, *, xof=False):
    for function in functions:
        function = "openssl_" + function.removeprefix("openssl_")
        bench_hash_function(runner, args, "_hashlib", function, xof=xof)

if __name__ == "__main__":
    runner = pyperf.Runner()
    runner.argparser.add_argument("--chunk-loop", type=int, default=16)
    runner.argparser.add_argument("--chunk-size", type=int, default=2049)
    runner.argparser.add_argument("--xof-length", type=int, default=512)
    args = runner.parse_args()

    bench_openssl_functions(
        runner, args, [
            "md5",
            "sha1",
            "sha224", "sha256", "sha384", "sha512",
            "sha3_224", "sha3_256", "sha3_384", "sha3_512",
        ], xof=False,
    )  # fmt: skip
    bench_openssl_functions(runner, args, ["shake_128", "shake_256"], xof=True)

    for module, functions, xof in [
        ("_md5", ["md5"], False),
        ("_sha1", ["sha1"], False),
        ("_sha2", ["sha224", "sha256", "sha384", "sha512"], False),
        ("_sha3", ["sha3_224", "sha3_256", "sha3_384", "sha3_512"], False),
        ("_sha3", ["shake_128", "shake_256"], True),
        ("_blake2", ["blake2b", "blake2s"], False),
    ]:  # fmt: skip
        for function in functions:
            bench_hash_function(runner, args, module, function, xof=xof)

@picnixz
Copy link
Member Author

picnixz commented Jun 22, 2025

@ZeroIntensity can you take a look at this PR now? TiA

Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The locking looks good to me.

@picnixz picnixz merged commit e7295a8 into python:main Jun 22, 2025
43 checks passed
@picnixz picnixz deleted the perf/hashlib/mutex-135239 branch June 22, 2025 15:00
@picnixz
Copy link
Member Author

picnixz commented Jun 22, 2025

Arf, I should have actually run the build bots first.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
skip news type-refactor Code refactoring (with no changes in behavior)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants