[Form] Do not trim unassigned unicode characters #43031

Merged
merged 1 commit into from
Sep 17, 2021


Contributor

@simonberger simonberger commented Sep 14, 2021

| Q             | A
| ------------- | ---
| Branch?       | 4.4
| Bug fix?      | yes
| New feature?  | no
| Deprecations? | no
| License       | MIT

I ran into a problem with StringUtil::trim on PHP <7.3 and have added only a test for now.
I probably should have opened an issue instead, but I thought it better to look at the result of a test first and confirm that it actually is a problem.

It is caused by \pC in the preg_replace call.
I am no expert on UTF-8 whitespace and cannot suggest a good fix. Are you interested in working around the buggy behavior of older PHP versions? Otherwise I'll close this.
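
To make the report above concrete, here is a minimal sketch of a `\pC`-based trim. The function name `trimWithAllOther` and the sample emoji are assumptions for illustration; this is not the Symfony source. `\pC` covers every "Other" general category, including unassigned code points (Cn), so a PCRE build whose Unicode tables predate a given emoji would treat that emoji as trimmable.

```php
<?php

// Hypothetical stand-in for the \pC-based trimming described in this PR.
function trimWithAllOther(string $string): string
{
    // \pZ = separators, \pC = all "Other" categories (Cc, Cf, Cn, Co, Cs).
    $result = preg_replace('/^[\pZ\pC]+|[\pZ\pC]+$/u', '', $string);

    // preg_replace() returns null on error (e.g. invalid UTF-8); fall back to trim().
    return $result ?? trim($string);
}

var_dump(trimWithAllOther(' Foo! '));

// Whether the emoji survives depends on the Unicode tables of the linked PCRE:
// if the code point is still "unassigned" there, it is removed.
var_dump("\u{1F92E}" === trimWithAllOther("\u{1F92E}"));
```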

@carsonbot carsonbot added this to the 4.4 milestone Sep 14, 2021
@derrabus derrabus added the Form label Sep 14, 2021
@carsonbot carsonbot changed the title Add StringUtil::trim test case for an emoji [Form] Add StringUtil::trim test case for an emoji Sep 14, 2021
@simonberger simonberger changed the title [Form] Add StringUtil::trim test case for an emoji [Form] Add failing StringUtil::trim test case trimming an emoji Sep 14, 2021
@derrabus derrabus added the Help wanted Issues and PRs which are looking for volunteers to complete them. label Sep 14, 2021
@derrabus
Member

Thank you for the test case. Yes, I think we should fix that.

@nicolas-grekas
Member

PCRE is compiled with Unicode tables, and older versions ship outdated tables. I don't know what we can do about this, nor how we could work around it in a reasonable way...
Any idea?
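
One way to check the point about outdated tables on a given installation is to ask PCRE directly whether it considers a code point unassigned. The sketch below is a diagnostic only; the helper name `isUnassignedForPcre` and the sample emoji are assumptions, not something taken from this PR.

```php
<?php

// Hypothetical diagnostic helper; not part of Symfony.
function isUnassignedForPcre(string $char): bool
{
    // \p{Cn} matches code points that the linked PCRE's Unicode tables list as unassigned.
    return 1 === preg_match('/^\p{Cn}$/u', $char);
}

// PCRE_VERSION is a built-in PHP constant describing the PCRE library in use.
echo 'PCRE version: ', PCRE_VERSION, PHP_EOL;

// The result for a recent emoji depends on the age of PCRE's Unicode tables.
var_dump(isUnassignedForPcre("\u{1F92E}"));

// A long-assigned letter is never reported as unassigned.
var_dump(isUnassignedForPcre('A'));
```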

@simonberger
Contributor Author

As far as I understand it, the problem is about removing invisible characters, and the old PCRE version probably removes every range it did not know about at that time.
Could we use ranges of truly invisible characters instead of \pC? Or do this only for PHP <7.3, if \pC is safe on the other versions?

@simonberger simonberger changed the title [Form] Add failing StringUtil::trim test case trimming an emoji [Form] Do not trim unassigned unicode characters Sep 15, 2021
@simonberger
Contributor Author

@nicolas-grekas I re-requested your approval because I also removed the Surrogate and Private Use character categories. I think those should not be relevant, and if they are, trimming them would more likely be the wrong move.
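
The thread does not quote the final pattern, so the following is only a sketch of what the comment above implies. The Unicode "Other" group consists of Control (Cc), Format (Cf), Unassigned (Cn), Private Use (Co) and Surrogate (Cs); dropping Cn, Co and Cs leaves Cc and Cf. The function name `trimInvisible` is hypothetical.

```php
<?php

// Hypothetical sketch: trim separators plus only Control and Format characters.
function trimInvisible(string $string): string
{
    $class = '[\pZ\p{Cc}\p{Cf}]';
    $result = preg_replace('/^'.$class.'+|'.$class.'+$/u', '', $string);

    // preg_replace() returns null on error (e.g. invalid UTF-8); fall back to trim().
    return $result ?? trim($string);
}

var_dump(trimInvisible(' Foo! '));

// U+200B (zero width space) and U+FEFF (byte order mark) are Format characters.
var_dump(trimInvisible("\u{200B}Foo\u{FEFF}"));

// An emoji is kept whether PCRE knows it or still treats it as unassigned.
var_dump("\u{1F92E}" === trimInvisible("\u{1F92E}"));
```

Because the character class no longer contains Cn, the outcome for an emoji does not depend on how recent the PCRE Unicode tables are.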

@nicolas-grekas
Member

Thank you @simonberger.

@nicolas-grekas nicolas-grekas merged commit 6fdf0c9 into symfony:4.4 Sep 17, 2021
@fabpot fabpot mentioned this pull request Sep 28, 2021
@fabpot fabpot mentioned this pull request Sep 28, 2021
Labels
Bug Form Help wanted Issues and PRs which are looking for volunteers to complete them. Status: Needs Review