fix trim() handling of TARG (and minor refactor) #22788
I'm wondering if the likes of this could happen:
I think you're right, though I didn't come up with a test case off hand. And of course sv_setpvn() would fix it.
Hmm, I couldn't come up with a test case either. Quickly looking at Perl_sv_setsv_flags, maybe it will always either swipe the buffer or copy it, never COW? Even if so, may be worth guarding against that logic ever changing.
I don't believe there's any reason for |
builtin.c
Outdated
dest = TARG;
SV_CHECK_THINKFIRST(TARG);
SvUPGRADE(TARG, SVt_PV);
SvGROW(TARG, len + 1);
Is grow required here? Fairly sure that trim can't make an SV bigger, only smaller, so surely it must already have enough storage for the new size.
TARG and source are not the same SV.
If this were a pp func and implemented TARGLEX, then we'd retain the branch I removed and do some special handling for that.
But it's XS, so TARG shouldn't be source (though source might be the TARG from another entersub calling trim, or from the same entersub in recursion; in either case it's not the TARG here).
I tried adding an assert(0) to the original "same SV" handling code and it never triggered.
Oh right, of course, this isn't an in-place trim, it's a substr copy.
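To make the point of this thread concrete, here is a minimal sketch in plain C of a trim that copies into a separate destination. It is a toy model, not the code in builtin.c: `toy_str`, `toy_grow`, `toy_isspace` and `toy_trim` are hypothetical names, and the whitespace set is a simplification. It only illustrates why a destination that is distinct from the source may still need growing, even though trimming never makes the string longer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a string SV: an owned buffer plus length/capacity. */
typedef struct {
    char  *buf;  /* NUL-terminated storage, or NULL if never allocated */
    size_t cur;  /* bytes in use (the role SvCUR plays) */
    size_t len;  /* bytes allocated (the role SvLEN plays) */
} toy_str;

/* Ensure at least `need` bytes of storage, in the spirit of SvGROW. */
static void toy_grow(toy_str *s, size_t need) {
    if (s->len < need) {
        char *p = realloc(s->buf, need);
        assert(p != NULL);
        s->buf = p;
        s->len = need;
    }
}

/* Simplified ASCII whitespace test (an assumption, not perl's definition). */
static int toy_isspace(unsigned char c) {
    return c == ' ' || c == '\t' || c == '\n' || c == '\r' || c == '\f' || c == '\v';
}

/* Copy the trimmed part of src into dest; src itself is never modified. */
static void toy_trim(toy_str *dest, const char *src, size_t srclen) {
    const char *start = src;
    const char *end = src + srclen;
    size_t len;

    while (start < end && toy_isspace((unsigned char)*start)) start++;
    while (end > start && toy_isspace((unsigned char)end[-1])) end--;
    len = (size_t)(end - start);

    /* dest is a different buffer, so its capacity says nothing about src's. */
    toy_grow(dest, len + 1);
    memcpy(dest->buf, start, len);
    dest->buf[len] = '\0';
    dest->cur = len;
}
```

Calling `toy_trim` on a freshly zeroed `toy_str` allocates on first use; calling it again with a shorter result reuses the existing storage.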
builtin.c
Outdated
SvPVX(dest)[len] = '\0';
SvPOK_on(dest);
SvCUR_set(dest, len);
Copy(start, SvPVX(TARG), len, U8);
Is start always still valid after the THINKFIRST? Is it possible that could copy out a CoW string to a new buffer?
Oh but then I suppose start will still point at the original untouched buffer so it's probably all fine.
Oh but then I suppose start will still point at the original untouched buffer so it's probably all fine.
That's what I expect.
About 4539fb8
Not saying the fix in blead/this ticket is wrong, but there might be a better way to do it, or in general for XS "do it".
Note the old code removed in this commit used the VERY underused (core/cpan) SvOOK flag optimization. The new code does not have the sv_chop() optimization to prevent double buffering and maybe a trip through re/malloc().
The 5.25 newish sv_set_undef() to get a no-leak SvOK_off() then sv_chop(), or SvCUR() lowering, which is changing 1 or 2 integers only, vs Move()/memcpy(), is the better implementation for white space trimming a string. Perl has the API to do it efficiently (string manipulation); SvOOK needs to be used more often by the community.
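For readers unfamiliar with the offset trick being described, here is a rough sketch of the idea in plain C. It is a toy model under stated assumptions: `toy_ook`, `toy_pv`, `toy_chop` and `toy_trim_inplace` are hypothetical names and do not reflect how perl stores the offset internally. It only shows that an in-place trim can be done by adjusting an offset and a length, with no bytes moved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical view of a string that may skip a prefix of its own buffer. */
typedef struct {
    char  *base;    /* start of the allocation */
    size_t offset;  /* bytes skipped at the front (the idea behind OOK) */
    size_t cur;     /* bytes in use, counted from base + offset */
} toy_ook;

/* Current start of the visible string. */
static char *toy_pv(toy_ook *s) { return s->base + s->offset; }

/* Drop everything before `ptr`, in the spirit of sv_chop(): no bytes move. */
static void toy_chop(toy_ook *s, char *ptr) {
    char *pv = toy_pv(s);
    size_t delta;
    assert(ptr >= pv && ptr <= pv + s->cur);
    delta = (size_t)(ptr - pv);
    s->offset += delta;
    s->cur -= delta;
}

/* Trim in place: advance the start, lower the length, re-terminate. */
static void toy_trim_inplace(toy_ook *s) {
    char *pv = toy_pv(s);
    char *start = pv;
    char *end = pv + s->cur;

    while (start < end && (*start == ' ' || *start == '\t' || *start == '\n')) start++;
    while (end > start && (end[-1] == ' ' || end[-1] == '\t' || end[-1] == '\n')) end--;

    toy_chop(s, start);               /* changes two integers only */
    s->cur = (size_t)(end - start);   /* lower the length */
    toy_pv(s)[s->cur] = '\0';
}
```

This only helps when the input and the output are the same string, which is the condition the discussion turns on.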
Note the old code removed in this commit used the VERY underused (core/cpan) SvOOK flag optimization. The new code does not have sv_chop() optimization to prevent double buffering and maybe a trip through re/malloc().
The sv_chop() branch was removed because the input and output SVs are never the same SV.
The 5.25 newish sv_set_undef() ...
I have a clang-tidy check for that (though I believe loadable clang-tidy checks don't work on Windows).
Can you discreetly slip that magic potion into EU::ParseXS when nobody is looking??!?! 🥺
I expect most of the API use comes from the typemap.
It can be run on the generated XS code, in pretty much the same way you use it on the perl sources, though it may point you at APIs that aren't covered by ppport.h and so won't work with older perls.
c928ce3 to a801bbc
I expect to squash this before merging.
LGTM.
Fixes a whole lot of issues, and simplifies much of the logic.
Refactor a lot of custom "set SV to a string" code away to sv_setpvn(). This:
- fixed the original problem reported for Perl#22784, where TARG wasn't being reset properly and contained a cached numeric version of the result from the previous call.
- removed some never-executed code, since builtin::trim is only XS and is not an OP with the TARGLEX optimization.
- fixes a possible problem if the result of the first call to trim() is COWed.
This does slightly change the taint behaviour: rather than making TARG tainted iff source is tainted, it changes to the behaviour of the rest of perl, making TARG tainted if any tainted input is seen in the current expression.
See the PR Perl#22788 for some discussion on how we got here.
Fixes Perl#22784
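The original problem described in the commit message above (a reused TARG keeping a cached numeric value from the previous call) can be illustrated with a small toy model in plain C. Everything here is hypothetical: `toy_sv`, the `TOY_POK`/`TOY_IOK` flags and the setter names are invented for illustration and are not perl's real structures or API. The sketch assumes the bug comes down to turning the "string valid" flag on without invalidating a previously cached integer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical "string valid" / "integer valid" flags. */
enum { TOY_POK = 1, TOY_IOK = 2 };

typedef struct {
    char     pv[32];  /* string slot */
    long     iv;      /* cached integer slot */
    unsigned flags;
} toy_sv;

/* Numeric read: trust the cache if flagged valid, else parse and cache. */
static long toy_num(toy_sv *sv) {
    if (!(sv->flags & TOY_IOK)) {
        sv->iv = strtol(sv->pv, NULL, 10);
        sv->flags |= TOY_IOK;
    }
    return sv->iv;
}

/* Shared helper: write the bytes and terminate. */
static void toy_store(toy_sv *sv, const char *s, size_t len) {
    assert(len < sizeof sv->pv);
    memcpy(sv->pv, s, len);
    sv->pv[len] = '\0';
}

/* Buggy: only turns the string flag on, so a stale integer cache stays valid. */
static void toy_set_buggy(toy_sv *sv, const char *s, size_t len) {
    toy_store(sv, s, len);
    sv->flags |= TOY_POK;
}

/* Fixed: the new string becomes the only valid value. */
static void toy_set_fixed(toy_sv *sv, const char *s, size_t len) {
    toy_store(sv, s, len);
    sv->flags = TOY_POK;
}
```

In this model, reusing one `toy_sv` for "10", reading it as a number, and then storing "20" with `toy_set_buggy` still yields 10 on the next numeric read; `toy_set_fixed` yields 20.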
a801bbc to 35e27db
Fix mishandling of re-use of TARG in trim.
Fixes #22784
Note that this code could simply have done sv_setpvn(), but trim() goes its own way in handling taint, inconsistent with the rest of perl, as implemented by sv_setpvn(), so we see this bug.
I did consider just replacing this code with sv_setpvn(), but I don't know if the difference from normal perl taint usage was intentional. I didn't see any mention of it in #19433, the PR that added trim().
perldelta: