esarp · 2025-12-09T18:42:06Z

Implements bytes.startswith in mypyc. It could potentially be made more efficient without relying on memcmp, but I'm not sure.

Tested with the following benchmark code, which shows a ~6.3x performance improvement compared to standard Python:

```
import time

def bench(prefix: bytes, a: list[bytes], n: int) -> int:
    i = 0
    for x in range(n):
        for b in a:
            if b.startswith(prefix):
                i += 1
    return i


a = [b"foo", b"barasdfsf", b"foobar", b"ab", b"asrtert", b"sertyeryt"]
n = 5 * 1000 * 1000
prefix = b"foo"

bench(prefix, a, n)

t0 = time.time()
bench(prefix, a, n)
td = time.time() - t0
print(f"{td}s")
```

Output:

```
$ python /tmp/bench.py
1.0015509128570557s
$ python -c 'import bench'
0.154998779296875s
```

github-actions · 2025-12-09T19:07:36Z

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

BobTheBuidler · 2025-12-10T04:13:23Z

+        const char *self_buf = PyBytes_AS_STRING(self);
+        const char *subobj_buf = PyBytes_AS_STRING(subobj);
+
+        if (subobj_len == 0) {


Maybe this `if` check should go above the two PyBytes_AS_STRING lines? We can return early without those calls if the check is true.

Good call, updated. I also split the checks around each PyBytes_GET_SIZE call a bit further to optimize for the empty-argument case. It probably won't save a ton, but I don't think it makes the code unreadable.
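The check ordering discussed above can be modeled in pure Python. This is only an illustrative sketch, not the C code from the PR: the function name `startswith_model` is hypothetical, and the slice comparison stands in for the memcmp call mentioned in the description.

```python
def startswith_model(self_bytes: bytes, prefix: bytes) -> bool:
    """Hypothetical pure-Python model of the fast-path check ordering."""
    prefix_len = len(prefix)
    # An empty prefix always matches, so return before looking at
    # the receiver's size or at either buffer.
    if prefix_len == 0:
        return True
    self_len = len(self_bytes)
    # A prefix longer than the receiver can never match.
    if prefix_len > self_len:
        return False
    # Stand-in for memcmp(self_buf, prefix_buf, prefix_len) == 0.
    return self_bytes[:prefix_len] == prefix
```

The model agrees with the built-in `bytes.startswith` for plain `bytes` arguments, including the empty cases; it does not model tuple arguments or the optional start and end positions.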

JukkaL · 2025-12-10T16:56:36Z

+    # Test empty cases
+    assert test.startswith(b'')
+    assert b''.startswith(b'')
+    assert not b''.startswith(test)


Test with bytearray as 1) the receiver object and 2) the argument. This way we will also test the slow path.

Added a few checks to cover those as well
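A minimal sketch of what such bytearray checks might look like follows. It is not the test code added in the PR: the function name `check_startswith_with_bytearray` and the value of `test` are assumptions for illustration.

```python
def check_startswith_with_bytearray() -> None:
    # Hypothetical sample value; the real test data is not shown in this thread.
    test = b"some string"

    # 1) bytearray as the receiver object
    assert bytearray(test).startswith(b"some")
    assert not bytearray(test).startswith(b"string")

    # 2) bytearray as the argument
    assert test.startswith(bytearray(b"some"))
    assert not test.startswith(bytearray(b"string"))

    # Empty cases with bytearray on either side
    assert test.startswith(bytearray())
    assert bytearray().startswith(b"")
    assert not bytearray().startswith(test)


check_startswith_with_bytearray()
```

Both directions are valid in standard Python, since `startswith` on `bytes` and `bytearray` accepts any bytes-like object as the prefix.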

Rounding out #20387 and implementing `bytes.endswith`. Simple benchmark shows a ~6.4x improvement.

Tested with the following benchmark code:

```
import time

def bench(suffix: bytes, a: list[bytes], n: int) -> int:
    i = 0
    for x in range(n):
        for b in a:
            if b.endswith(suffix):
                i += 1
    return i


a = [b"foo", b"barasdfsf", b"foobar", b"ab", b"asrtert", b"sertyeryt"]
n = 5 * 1000 * 1000
suffix = b"foo"

bench(suffix, a, n)

t0 = time.time()
bench(suffix, a, n)
td = time.time() - t0
print(f"{td}s")
```

Output:

```
$ python bench.py
0.9002199172973633s
$ python -c "import bench"
0.13828086853027344s
```

esarp force-pushed the mypycBytesStartswith branch from 72f9ae5 to 2064d9e on December 9, 2025, 18:45

Implement bytes.startswith in mypyc

1185d2f

esarp force-pushed the mypycBytesStartswith branch from e8c65fe to 1185d2f on December 9, 2025, 18:49

[pre-commit.ci] auto fixes from pre-commit.com hooks

5dbd6a9

for more information, see https://pre-commit.ci

BobTheBuidler reviewed Dec 10, 2025

JukkaL reviewed Dec 10, 2025

Optimize checks; add bytearray tests

69fbc68

JukkaL approved these changes Dec 11, 2025

JukkaL merged commit 1cea058 into python:master Dec 11, 2025
16 checks passed

RiyaPrabhakar-git added this to GC-Content-Calculator Dec 12, 2025

github-project-automation Bot moved this to Done in GC-Content-Calculator Dec 12, 2025

esarp mentioned this pull request Dec 19, 2025

[mypyc] Implement bytes.endswith #20447

Merged

Implement bytes.startswith in mypyc #20387

JukkaL merged 3 commits into python:master from esarp:mypycBytesStartswith

4 participants