refactor: in text signatures, don't normalize single \r characters
#650
8414c56 to 499ac48
src/packet/literal_data.rs
Outdated
// So we skip all strings that don't fit this goal.
//
// Test strings for this case need to contain a LF.
// And they must not normalize to legal CR+LF (like CR+CR+LF) does.
I don't understand this comment. CR+CR+LF might be the most contentious part of normalization, as it distinguishes the implementers who treat \r?\n as the line ending from those who treat \r*\n as the line ending.
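To make the two readings concrete, here is a minimal sketch of both normalizers. The function names are hypothetical, and neither function is claimed to match rPGP, GnuPG, or any other implementation; they only illustrate how `\r?\n` and `\r*\n` diverge on CR+CR+LF.

```rust
/// Reading A: a line ending is `\r?\n`, so at most one CR belongs to it.
/// Hypothetical helper, for illustration only.
fn normalize_optional_cr(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\r' && chars.peek() == Some(&'\n') {
            // CR directly followed by LF: already a CR+LF pair.
            chars.next();
            out.push_str("\r\n");
        } else if c == '\n' {
            // Bare LF becomes CR+LF.
            out.push_str("\r\n");
        } else {
            // Everything else, including a lone CR, is kept as-is.
            out.push(c);
        }
    }
    out
}

/// Reading B: a line ending is `\r*\n`, so every CR directly in front
/// of the LF is absorbed. Hypothetical helper, for illustration only.
fn normalize_any_cr(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut pending_cr = 0usize;
    for c in input.chars() {
        match c {
            '\r' => pending_cr += 1,
            '\n' => {
                // The whole run of CRs collapses into one CR+LF.
                pending_cr = 0;
                out.push_str("\r\n");
            }
            other => {
                // The CRs were not followed by LF: emit them unchanged.
                for _ in 0..pending_cr {
                    out.push('\r');
                }
                pending_cr = 0;
                out.push(other);
            }
        }
    }
    for _ in 0..pending_cr {
        out.push('\r');
    }
    out
}
```

Under reading A, `"a\r\r\nb"` stays `"a\r\r\nb"` (a lone CR followed by a CR+LF pair); under reading B it becomes `"a\r\nb"`. Both agree on a bare `"\n"`.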
This test is a "fuzzing" style test. It attempts to produce random messages that are guaranteed to have illegal line endings (and then verifies that CrLfCheckReader does indeed throw an error while consuming that message).
It does that by first filtering random strings as candidates, and then normalizing CR+LF to just LF for all candidates that we expect will indeed have illegal line endings after this step (this "counter-"normalization happens in the next statement, which reads let norm = normalize_lines(&string, LineBreak::Lf);)
The check you're looking at is throwing away random strings that are not going to trigger an error, because they either:
- contain no LF at all (those can't possibly be illegal in terms of their line ending normalization)
- contain CR+CR+LF, which would get normalized to just CR+LF (which is legal, so it's not an input that we want to test in this particular test function - we want to randomly construct illegal inputs)
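The filtering and "counter-normalization" steps described above can be sketched roughly as follows. This is not the test code from the PR: `illegal_candidate` and `has_bare_lf` are hypothetical names, and `str::replace` stands in for the `normalize_lines(&string, LineBreak::Lf)` call.

```rust
/// Stand-in for the "counter-normalization": every CR+LF becomes a bare LF.
fn denormalize(s: &str) -> String {
    s.replace("\r\n", "\n")
}

/// Returns a string that is expected to have illegal line endings,
/// or `None` if the random input is not a useful candidate.
/// Hypothetical helper, for illustration only.
fn illegal_candidate(s: &str) -> Option<String> {
    // Without any LF there is nothing that could be an illegal line ending.
    if !s.contains('\n') {
        return None;
    }
    // CR+CR+LF would turn into CR+LF after the replacement below,
    // which is a legal line ending, so this input is skipped.
    if s.contains("\r\r\n") {
        return None;
    }
    Some(denormalize(s))
}

/// True if the string contains an LF that is not directly preceded by a CR.
fn has_bare_lf(s: &str) -> bool {
    let bytes = s.as_bytes();
    bytes
        .iter()
        .enumerate()
        .any(|(i, &b)| b == b'\n' && (i == 0 || bytes[i - 1] != b'\r'))
}
```

With these two filters in place, every `Some(..)` result contains at least one bare LF, which is the property the error-case test relies on.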
And just as a tiny bit more context (even though it's entirely separate from what you asked for):
This test wraps the (pseudo-random) test messages in a ChaosReader before processing them with the CrLfCheckReader (which is the actual subject of this test). ChaosReader is a Reader that returns the message chopped up into (pseudo-)random slices: each read call returns a (pseudo-)random number of bytes, between one byte and the full message length.
This exercises handling of "unfortunate" read lengths (such as right in the middle of a CR+LF pair) in the reader that gets tested (again, here: CrLfCheckReader).
By turning up the number of iterations of the test, this test method has proven pretty thorough in finding missed edge cases in code like this.
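For readers who haven't seen this pattern, a reader of that kind could look roughly like the sketch below. This is not rPGP's ChaosReader: `ChunkyReader` is a hypothetical name, and the xorshift generator is only there to keep the example free of dependencies.

```rust
use std::io::{self, Read};

/// Hands out the wrapped bytes in pseudo-random chunk sizes.
/// Hypothetical stand-in for illustration, not rPGP's ChaosReader.
struct ChunkyReader<'a> {
    data: &'a [u8],
    pos: usize,
    state: u64,
}

impl<'a> ChunkyReader<'a> {
    fn new(data: &'a [u8], seed: u64) -> Self {
        // xorshift must not start from zero, so clamp the seed.
        Self { data, pos: 0, state: seed.max(1) }
    }

    /// Small xorshift64 step; good enough for picking chunk sizes.
    fn next_u64(&mut self) -> u64 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.state = x;
        x
    }
}

impl Read for ChunkyReader<'_> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let remaining = self.data.len() - self.pos;
        let max = remaining.min(buf.len());
        if max == 0 {
            // End of data (or an empty caller buffer).
            return Ok(0);
        }
        // Return between 1 and `max` bytes, so a CR+LF pair can end up
        // split across two read calls.
        let n = 1 + (self.next_u64() as usize) % max;
        buf[..n].copy_from_slice(&self.data[self.pos..self.pos + n]);
        self.pos += n;
        Ok(n)
    }
}
```

Reading such a wrapper to the end yields the original bytes unchanged; only the chunk boundaries vary with the seed, which is what puts stress on the reader under test.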
i guess the point i was trying to make is that while GnuPG might transform \r\r\n to a "legal line ending", it's not at all clear that this is actually what "normalizing" means in the context of an OpenPGP text message. So i think the comment could be improved, unless you're trying to make some sort of assertion that OpenPGP line ending normalization actually is \r*\n → \r\n
I really do appreciate the ChaosReader fuzzing approach for this normalization silliness, i've seen more than i would like of that sort of breakage, so i'm glad you're checking for it.
To clarify the test some more: The transformation that is used as part of preparing error-cases in this test is actually a "denormalization", which just replaces every occurrence of \r\n with \n.
This particular transformation is (I think) not used in any production code path within rPGP. It's just used in this test to produce error test cases. Those cases are then thrown at the CrLfCheckReader, to make sure it rejects all manner of strings with illegal line-endings.
The comment you're looking at is effectively saying: We're trying to produce strings that will have unaccompanied \n characters (that do not form proper \r\n pairs). I'll clarify the comment to say that.
Independently, I'm starting to see your point that different software might have more varied ideas of what it means to "normalize" line endings than I imagined. That is a very disturbing point.
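As a rough illustration of what rejecting an unaccompanied \n involves when the input arrives in arbitrary chunks, here is a minimal sketch. `BareLfChecker` is a hypothetical name and not the actual CrLfCheckReader; it assumes that only a bare LF counts as illegal and that a lone CR is accepted.

```rust
/// Tracks whether the previous byte was a CR, across chunk boundaries.
/// Hypothetical sketch, not rPGP's CrLfCheckReader.
#[derive(Default)]
struct BareLfChecker {
    last_was_cr: bool,
}

impl BareLfChecker {
    /// Feeds the next chunk. Returns `Err(offset)` with the offset (within
    /// this chunk) of the first LF that is not directly preceded by a CR.
    fn feed(&mut self, chunk: &[u8]) -> Result<(), usize> {
        for (i, &b) in chunk.iter().enumerate() {
            if b == b'\n' && !self.last_was_cr {
                return Err(i);
            }
            // Remember the CR so that a pair split across two chunks
            // is still recognized as CR+LF.
            self.last_was_cr = b == b'\r';
        }
        Ok(())
    }
}
```

Feeding `b"abc\r"` and then `b"\ndef"` is accepted, because the CR at the end of the first chunk is remembered; feeding `b"x\ny"` fails at offset 1.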
I've pushed a more verbose comment. I am still a bit unsure if I'm missing some point, but I am relatively confident that the test does what it should.
Fixes #649