BUG: Avoid heap buffer overflow for stringdtype searchsorted #31535
Conversation
Is the test known to fail before this PR?
Yeah, I confirmed manually that it triggers an ASan heap overflow report.
MaanasArora left a comment
Thanks, looks good to me! I think this is at the right level for a hack (just at the edge of npy_binsearch). And thanks for the ping, definitely willing to do the larger refactor for 2.6!
I wonder if the test can be made repetitive enough to fail more reliably? Not sure if that would be too slow.
On an ASan build, the test I added fails on the first run, with no iteration needed, unless the other C changes are applied, so I'm not worried. I do want to add more searchsorted tests, but that can come later.
Yeah sounds good, so at least CI is good, and a bit niche anyway!
Thanks Nathan.
PR summary
Fixes #31533.
Currently the low-level searchsorted implementation is wired up to assume that the needle and haystack arrays share a descriptor. This PR breaks that assumption and adds a hacky special case in the searchsorted implementation.
I'm deferring a nicer, less hacky fix for later in the 2.6 development cycle (ping @MaanasArora if you're interested in taking that on).
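For readers unfamiliar with why a shared-descriptor assumption can matter, here is a toy pure-Python model. It is not NumPy's code and not a description of the actual bug. `ToyDescr`, `make_array`, and `searchsorted_left` are hypothetical names invented for this sketch, and the idea that each array's descriptor owns that array's string storage is a simplifying assumption. The point is only that a binary search which decodes needles through the haystack's descriptor can read storage that does not belong to the needle.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDescr:
    """Hypothetical stand-in for a descriptor that owns string storage."""
    arena: list = field(default_factory=list)

    def store(self, s):
        # Keep the string in this descriptor's arena and return a handle.
        self.arena.append(s)
        return len(self.arena) - 1

    def load(self, handle):
        # Raises IndexError if the handle points past this arena.
        return self.arena[handle]


def make_array(strings):
    """Build a toy array: a fresh descriptor plus a list of handles into it."""
    descr = ToyDescr()
    return descr, [descr.store(s) for s in strings]


def searchsorted_left(hay_descr, hay, needle_descr, needles):
    """Leftmost insertion points, decoding each side with its own descriptor."""
    out = []
    for handle in needles:
        key = needle_descr.load(handle)
        lo, hi = 0, len(hay)
        while lo < hi:
            mid = (lo + hi) // 2
            if hay_descr.load(hay[mid]) < key:
                lo = mid + 1
            else:
                hi = mid
        out.append(lo)
    return out


hay_descr, hay = make_array(["apple", "banana", "cherry"])
needle_descr, needles = make_array(["a", "banana", "blueberry", "zebra"])

# Each side is decoded through its own descriptor.
print(searchsorted_left(hay_descr, hay, needle_descr, needles))  # [0, 1, 2, 3]

# Decoding the needles through the haystack's descriptor reads past its arena.
try:
    searchsorted_left(hay_descr, hay, hay_descr, needles)
except IndexError as exc:
    print("out-of-bounds read in the toy model:", exc)
```

In this toy the bad read surfaces as an `IndexError`. In C there is no such check, which is why an ASan build is useful for catching this class of problem.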
AI Disclosure
I used Claude to help analyze the bug.