[Form] Do not trim unassigned unicode characters #43031
Thank you for the test case. Yes, I think we should fix that.
PCRE is compiled with Unicode tables, and older versions ship outdated tables. I don't know what we can do about this, nor how we can work around it in a reasonable way...
As far as I understand the problem, the trim is meant to remove invisible characters, and the old PCRE version probably also removes every range that was still unassigned in its Unicode tables at the time.
@nicolas-grekas I re-requested your approval because I removed the Surrogate and Private Use character categories as well. I think those should not be relevant, and if they are, trimming them would more likely be the wrong move.
Thank you @simonberger.
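For readers following the thread, the narrowing described in this comment can be sketched roughly as below. This is only an illustration under assumptions: the function name trimUnicode is hypothetical, and the exact character class (separators plus the Cc and Cf subcategories) is inferred from the comment, not copied from the merged change.

```php
<?php

// Hypothetical helper for illustration; not the actual StringUtil::trim source.
function trimUnicode(string $string): string
{
    // \pZ: separators, \p{Cc}: control characters, \p{Cf}: format characters.
    // Cn (unassigned), Co (private use) and Cs (surrogate) are deliberately left out,
    // so the result no longer depends on which code points the PCRE tables know.
    $pattern = '/^[\pZ\p{Cc}\p{Cf}]+|[\pZ\p{Cc}\p{Cf}]+$/u';

    $result = @preg_replace($pattern, '', $string);

    // preg_replace() returns null on failure (for example on invalid UTF-8),
    // so fall back to the byte-based trim() in that case.
    return null !== $result ? $result : trim($string);
}

// No-break space, space, zero width space and newline are all trimmed.
var_dump(trimUnicode("\u{00A0} foo\u{200B}\n") === 'foo');

// A private-use character is kept.
var_dump(trimUnicode("\u{E000}foo") === "\u{E000}foo");
```

Listing the subcategories explicitly instead of using the whole \pC class is what keeps unassigned code points out of the match.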
I ran into a problem with StringUtil::trim in PHP <7.3 and have only added a test (for now).
I probably should have created an issue instead, but I thought it would be better to look at the result of a test and make sure it actually is a problem.
It is caused by \pC in the preg_replace call. I am no expert on UTF-8 whitespace and cannot suggest a good bugfix. Are you interested in working around the buggy behavior of the older PHP versions? Otherwise I'll close this.
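To see why \pC is sensitive to the PCRE version, it can help to check which "Other" subcategory a character falls into on a given installation. The following is a minimal sketch; the helper name otherSubcategories is hypothetical, and the sample code point U+1F97A is only an assumed example of a character added in a fairly recent Unicode version.

```php
<?php

// Hypothetical helper: lists which subcategories of \pC match a single character
// according to the Unicode tables of the PCRE library PHP was built against.
function otherSubcategories(string $char): array
{
    $matches = [];
    foreach (['Cc', 'Cf', 'Cn', 'Co', 'Cs'] as $category) {
        if (1 === preg_match('/^\p{'.$category.'}\z/u', $char)) {
            $matches[] = $category;
        }
    }

    return $matches;
}

print_r(otherSubcategories("\t"));        // a control character
print_r(otherSubcategories("\u{E000}"));  // a private-use character

// The result for this one depends on the PCRE Unicode tables: an older library
// may report Cn (unassigned), a newer one reports no "Other" category at all.
print_r(otherSubcategories("\u{1F97A}"));
```

If the last call reports Cn, a pattern that trims \pC would strip that character from the start or end of a string, which matches the behavior described above.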