ext/pcre - enable /n modifier #7583
Could you please also add a test?
Yes, I've added it.
I'm in favor of this, but want to point out that one can already set the respective internal option.
Co-authored-by: Christoph M. Becker <[email protected]>
string(3) "abc"
["test"]=>
string(1) "b"
[1]=>
How is producing a numbered capture "passing" when that's what the n
modifier is supposed to suppress? From the second call, I'd expect this output:
array(2) {
[0]=>
string(3) "abc"
["test"]=>
string(1) "b"
}
Hi Roy, this is the same behavior as in Perl.
$ echo "abc" | perl -ne 'my @a = $_ =~ /.(?P<a>.)/n;print $1;'
b
Fair enough, but that's a pity since Perl provides a separate hash for named groups, but PHP gets the numbered and named groups lumped in the same array.
> Fair enough, but that's a pity since Perl provides a separate hash for named groups, but PHP gets the numbered and named groups lumped in the same array.
Not a problem there. PCRE itself behaves this way; PHP is not creating this numbered entry. See PCRE2's documentation:
PCRE2_NO_AUTO_CAPTURE
If this option is set, it disables the use of numbered capturing paren-
theses in the pattern. Any opening parenthesis that is not followed by
? behaves as if it were followed by ?: but named parentheses can still
be used for capturing (and they acquire numbers in the usual way). This
is the same as Perl's /n option. Note that, when this option is set,
references to capture groups (backreferences or recursion/subroutine
calls) may only refer to named groups, though the reference can be by
name or by number.
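To make the quoted paragraph concrete, here is a minimal sketch of how the captures come out with and without the modifier. It assumes a PHP build that includes the `/n` modifier proposed in this PR; the pattern and subject are illustrative only and are not taken from the PR's test.

```php
<?php
// Assumption: this PHP build supports the /n modifier added by this PR.
// On builds without it, the second call fails with an "unknown modifier" warning.
$subject = 'abc';

// Without /n: the plain group and the named group both capture.
preg_match('/(.)(?P<test>.)/', $subject, $plain);

// With /n: the plain group behaves like (?:.), so the named group
// is the first capturing group and still acquires the number 1.
preg_match('/(.)(?P<test>.)/n', $subject, $noAuto);

var_dump($plain);   // keys: 0, 1, "test", 2
var_dump($noAuto);  // keys: 0, "test", 1
```

The numbered entry `[1]` in the second result is the number that PCRE2 assigns to the named group "in the usual way", which is why it still shows up in the match array.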
Yes, I understood that PCRE returns matches with both numbered and named references, but surely it has always been PHP's choice to put both of those reference sets into the one array and not provide a way to conditionally omit/separate the numbered set, has it not? This is unlike Perl which puts the named set into its own hash:
echo abc | perl -ne '/(.)(?<x>.)(.)/n; $s = keys %+; print "$s\n"'
The PCRE docs seem pretty clear-cut on what the intention for this flag is ("and they acquire numbers in the usual way" and "though the reference can be by name or by number"). I don't see a reason to deviate from that.
Maybe there is room here for another modifier or option to only return named captures, but this modifier clearly isn't it.
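Until such an option exists, the same effect can be approximated in userland by dropping the integer keys from the match array. The sketch below is only an illustration: `named_captures()` is a hypothetical helper and not part of ext/pcre, and the pattern is made up for the example.

```php
<?php
// Hypothetical helper (not an ext/pcre function): keep only the
// string-keyed entries, i.e. the named captures.
function named_captures(array $matches): array
{
    return array_filter($matches, 'is_string', ARRAY_FILTER_USE_KEY);
}

// Illustrative pattern; it does not rely on the /n modifier.
preg_match('/.(?P<test>.)./', 'abc', $matches);

var_dump($matches);                  // keys: 0, "test", 1
var_dump(named_captures($matches));  // keys: "test"
```

This drops the full match at index 0 as well, which is roughly what Perl's `%+` hash holds; callers that need the full match have to read it from the original array.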
Looks good from my side at least.
Implement bug report #81439 (feature request)