-
-
Notifications
You must be signed in to change notification settings - Fork 9.6k
[Validator] Add the Charset
constraint
#53154
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Validator] Add the Charset
constraint
#53154
Conversation
2562ec4
to
5aed25d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I second this, exactly the day our production crashed on some file upload ^^
(excel file wrongly saved with bad encoding)
5aed25d
to
3665f61
Compare
Encoding
constraintCharacterEncoding
constraint
Just bikeshadding about the name: what about Charset? |
3665f61
to
0466045
Compare
I like Charset very much! I find it clearer. I updated the PR. I kept the |
0466045
to
7762092
Compare
CharacterEncoding
constraintCharset
constraint
My thoughts: On all my apps, I'd want to reject any non-UTF-8 content. It might already be the case when the DB refuses to persist invalid UTF-8. So this contraints looks both needed and not needed since I think we might want to add this check earlier when processing the input. @alexandre-daubois can you share your use case more precisely? When can one submit non-UTF-8 in your case? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we receive some file contents in our DTOs
hum, actually you already answered that sorry :)
here are some minor comments, LGTM
src/Symfony/Component/Validator/Constraints/CharsetValidator.php
Outdated
Show resolved
Hide resolved
src/Symfony/Component/Validator/Tests/Constraints/CharacterEncodingValidatorTest.php
Outdated
Show resolved
Hide resolved
7762092
to
5220858
Compare
Thank you for the review, comments are addressed 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe could you make the new classes final
?
5220858
to
e084246
Compare
|
Thank you @alexandre-daubois. |
There is a small problem in the implementation, i poke you there but if you prefer i can create a full issue As you use the "valid" encodings in your call to mb_detect_encodings, if the value is not encoded in one of those values, the function will always return false. Meaning you'll never be able to display anything in the {{detected}} value of the message.
With '😊' and ['ASCII'], the generated message will be
Oh and just another (minor) thing.. the mb_string extension seems to not be required by Validator, but the mb_string polyfill does only check ASCII, UTF-8 and latin-ish encodings. Maybe something to add in doc, or to check at runtime ? |
Hi! Thank you for letting me know Simon, I'll provide a fix for this 🙂 |
…dator` (alexandre-daubois) This PR was merged into the 7.1 branch. Discussion ---------- [Validator] Fix charset encoding detection in `CharsetValidator` | Q | A | ------------- | --- | Branch? | 7.1 | Bug fix? | yes | New feature? | no | Deprecations? | no | Issues | Fix #53154 (comment) | License | MIT After `@smnandre` suggestion, I updated the constraint to display a better message on the detected encoding. Indeed, if we fail to check the encoding is in the provided ones, then we fall back on any encoding detected by mbstring. Commits ------- 6393e29 [Validator] Fix charset encoding detection in `CharsetValidator`
Why the constraint is named It even uses the |
Can you please open a new issue to discuss this before the 7.2 release? Thanks |
@OskarStark This feature has been shipped with 7.1 already. |
Indeed, time flies 😄 |
Our use case: we receive some file contents in our DTOs that we only want to process if their encoding matches UTF-8 and reject the whole thing at validation otherwise.