[HttpFoundation] add HeaderUtils::parseQuery(): it does the same as parse_str() but preserves dots in variable names #37272
Oh nice! We should add a test with spaces in the parameter key.
return [
    ['a=b&c=d'],
    ['a.b=c'],
    ['a+b=c'],
@drupol this is the test case with a space in the name
I'm glad I inspired you for this feature, I couldn't agree more with it.
It will help us at the European Commission because we are using API Platform and CAS authentication.
Both rely on query parameters, and these cannot be changed during the request. This is the reason why a middleware, symfony/psr-http-message-bridge, was used, so that we could override it with loophp/unaltered-psr-http-message-bridge-bundle.
Once this feature is merged, I guess it will help a bunch of people and remove some hacks here and there.
However, I really hope this behavior will be updated in PHP 8. I couldn't believe it worked like this when I discovered the issue. It's crazy!
I also tested the PR here: https://3v4l.org/VRCoh
The example compares the behavior of \parse_str() and the custom HeaderUtils::queryParseString() you're introducing.
I did this example because I wanted to check why you were testing the a\0b binary string. I guess this is unlikely to happen in the real world, right?
I'm also wondering whether it wouldn't be easier to convert every parameter key with \bin2hex(), let \parse_str() do its job, and then convert back. It might also be faster than going through all those if conditions. WDYT?
Also, do you think such a feature will be backported to Symfony < 5.2?
Yes, that's a very unlikely edge case of parse_str() that I duplicated for parity.
I'm not sure what you mean, sorry. About perf, those ifs are going to be very fast, much faster than using a regexp. But please prove me wrong if you want to give your idea a try!
I don't think it should, that's a new feature. One should also note that I didn't change the createFromGlobals method. This means that dots will still be replaced by default. This is important to preserve BC. But at least people who care now have a helper to opt in to the fixed parser.
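For context, here is a minimal sketch of the mangling this PR works around. It only uses the native parse_str(), so it shows the default behavior that stays in place for BC, not the new helper.

```php
<?php

// parse_str() rewrites top-level names: dots and spaces become underscores.
parse_str('a.b=c', $dotted);     // ['a_b' => 'c']
parse_str('a+b=c', $spaced);     // "+" decodes to a space first: ['a_b' => 'c']

// Inside brackets the name is used as an array key, and the dot survives.
parse_str('a[b.c]=d', $nested);  // ['a' => ['b.c' => 'd']]

var_dump($dotted, $spaced, $nested);
```

Note that the first two query strings end up with the same key, so the original names cannot be recovered afterwards.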
Would you be interested in making this function also work with a query string like this one, https://3v4l.org/86Tu5, and return an array containing both values? This is a valid URL, after all: http://localhost/myendpoint?test=what&test=what2 Sorry if it already does that and I missed it. This is also related to my comment here
@gmponos I've been wondering about this also. This would be too big an API change when dealing with query/cookie/etc bags. But I hope we'll find a way to get there eventually, yes (getting rid of
The Go API handles these cases nicely: https://golang.org/pkg/net/url/#URL.Query
["test" => ["what", "what2"]]
["foo" => ["bar"]]
Could this set of features maybe be included in the proposed URI component (#36999)?
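A minimal sketch of the Go-style behavior described above, where every key maps to a list of values and brackets get no special treatment. The function name parseQueryMultiValue() is hypothetical and not part of HttpFoundation; it only illustrates the idea under that assumption.

```php
<?php

/**
 * Hypothetical helper (not a Symfony API): every key maps to a list of
 * values, in order of appearance, and names are kept exactly as sent.
 */
function parseQueryMultiValue(string $query, string $separator = '&'): array
{
    $result = [];

    foreach (explode($separator, $query) as $pair) {
        if ('' === $pair) {
            continue; // skip empty segments such as "a=b&&c=d"
        }

        // A pair without "=" is treated as a key with an empty value.
        [$key, $value] = array_pad(explode('=', $pair, 2), 2, '');

        // Decode exactly once, then append to the list for this key.
        $result[urldecode($key)][] = urldecode($value);
    }

    return $result;
}

var_dump(parseQueryMultiValue('test=what&test=what2'));
var_dump(parseQueryMultiValue('foo=bar'));
```

With this shape, a single value and a repeated value have the same type, which is what makes the Go API convenient, but it is also why it would be a large change for the existing parameter bags.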
ae5ab16
to
46cdb5d
Compare
I'm about to write a regex that does it all, and I will probably submit my snippet tonight. It's Father's Day today, so I might be away from my computer...
46cdb5d
to
8c45e5f
Compare
HeaderUtils::parseQueryString(): it does the same as parse_str() but preserves dots in variable names
HeaderUtils::parseQuery(): it does the same as parse_str() but preserves dots in variable names
8c45e5f
to
adcdf40
Compare
I'm not sure how this should play with the |
adcdf40
to
915a6c4
Compare
The |
For sure - also no RFC should cover what happens server-side. But we cannot migrate away from it without a serious FC/BC plan.
The PHP mode could be the default (so it's 100% BC). The "standard-compliant" mode is useful only for a small subset of use cases after all (but for instance both Mercure and Vulcain, as well as many other I-Ds and RFCs, use repeated query parameters without the
Would this make sense to you? Then it would be up to users to use this function when they create the request object:
--- a/src/Symfony/Component/HttpFoundation/HeaderUtils.php
+++ b/src/Symfony/Component/HttpFoundation/HeaderUtils.php
@@ -196,7 +196,7 @@ class HeaderUtils
     /**
      * Like parse_str(), but preserves dots in variable names.
      */
-    public static function parseQuery(string $query, string $separator = '&'): array
+    public static function parseQuery(string $query, bool $phpMode = true, string $separator = '&'): array
     {
         $q = [];
@@ -217,6 +217,12 @@ class HeaderUtils
                 $k = substr($k, 0, $i);
             }
+            if (!$phpMode) {
+                $q[$k][] = urldecode(substr($v, 1));
+
+                continue;
+            }
+
             $k = ltrim($k, ' ');
             if (false === $i = strpos($k, '[')) {
@@ -226,6 +232,10 @@ class HeaderUtils
             }
         }
+        if (!$phpMode) {
+            return $q;
+        }
+
         parse_str(implode('&', $q), $q);
         $query = [];
LGTM
Here I am, I just made a small example with a Basically, the regex will convert the relevant part of the keys to hexadecimal, then let parse_str() do its job, then convert back. There is only one test which produces ( |
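The hexadecimal round trip described here can be sketched roughly as follows. This is not the snippet from the 3v4l links and not the code that was merged; the name parseQueryViaHex() is hypothetical, it uses plain string functions instead of a regex, and it does not try to match every parse_str() edge case (leading spaces, null bytes, unmatched brackets).

```php
<?php

/**
 * Hypothetical helper: hex-encode the top-level part of each key so that
 * parse_str() cannot mangle it, then decode the keys of the result.
 */
function parseQueryViaHex(string $query): array
{
    $pairs = [];

    foreach (explode('&', $query) as $pair) {
        [$rawKey, $rawValue] = array_pad(explode('=', $pair, 2), 2, '');
        $key = urldecode($rawKey);

        // Only the part before a well-formed "[...]" is a top-level name.
        $i = strpos($key, '[');
        if (false === $i || false === strpos($key, ']', $i)) {
            $i = strlen($key);
        }
        $name = substr($key, 0, $i);
        if ('' === $name) {
            continue; // parse_str() ignores pairs without a name as well
        }

        // The "x" prefix keeps all-digit hex strings from becoming integer keys.
        // The bracket part is re-encoded because parse_str() decodes it again;
        // the raw value is passed through so that it is decoded only once.
        $pairs[] = 'x'.bin2hex($name).urlencode(substr($key, $i)).'='.$rawValue;
    }

    parse_str(implode('&', $pairs), $parsed);

    $result = [];
    foreach ($parsed as $hexKey => $value) {
        $result[hex2bin(substr($hexKey, 1))] = $value;
    }

    return $result;
}

var_dump(parseQueryViaHex('a.b=c&a+b=d&e[f.g]=h&e[]=i'));
```

Passing the raw value through untouched matters: decoding it here and again inside parse_str() would decode it twice.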
915a6c4
to
36dd4aa
Compare
Ok I fixed it... https://3v4l.org/Pam6Z But ok if it's slower, you decide :-)
This line will lead to double decoding of the value (once here + twice done by
Well done, I updated it: https://3v4l.org/p7uPM
👍
This change heads in a really good direction. Thank you. I have some random comments and questions.
/**
 * Like parse_str(), but preserves dots in variable names.
 */
public static function parseQuery(string $query, bool $ignoreBrackets = false, string $separator = '&'): array
Shouldn't we just use the native parse_str() if $ignoreBrackets is set to false? And so rename $ignoreBrackets to something else, such as `$phpCompat = true` or something like that?
Many PHP libraries rely on the fact that dots are replaced by underscores (CGI-like), and we may introduce issues by changing this. I would prefer to have a pure PHP mode (with all the oddities, including the dot replacement, etc.) and a "strict" mode doing nothing more than the URL class in JS, for instance. It feels wrong to me to have some intermediary modes such as "replace the dots but ignore brackets".
I'm sorry, I'm missing your logic. To me, it feels wrong to parse dots (replacing them with _) but ignore brackets.
The purpose of the method is to give you access to the original names and have a toggle to accept multiple values with or without using [].
Libs that expect keys mangled by PHP should be given arrays mangled by PHP, i.e. you wouldn't use this function with them.
36dd4aa
to
9234fae
Compare
I noticed that Firefox and Chrome behave differently when it comes to parsing the string. Try this JavaScript code:
const queryString = new URLSearchParams('?a%00b=c');
for (const [key, value] of queryString) {
    console.log('key =', key, 'value = ', value);
}
Shouldn't we mimic that behavior here? We could also check the WHATWG doc about this, and see whether the web platform test suite covers those cases.
@drupol there is no use case for the null char in URLs. I'm not even sure it's legal from an RFC pov. php-src uses C-strings internally in |
9234fae
to
4440d02
Compare
Now rebased on top of #37271, ready.
9f6ac40
to
0c00d95
Compare
… `parse_str()` but preserves dots in variable names
0c00d95
to
dd81e32
Compare
@simonberger I missed your comment:
fixed, trimming now happens unconditionally.
true, but nobody is using spaces in var names so I skipped advertising this.
naming... :)
Thank you @nicolas-grekas.
Inspired by symfony/psr-http-message-bridge#80
/cc @drupol
Related to #9009, #29664, #26220 but also api-platform/core#509 and https://www.drupal.org/project/drupal/issues/2984272
/cc @dunglas @alexpott