Selector: Decode invalid escape code points to U+FFFD #5845
The workflow failures are unrelated. Chrome has been updated in CI and the failures we saw in beta related to |
gibson042 left a comment
Thanks! This should be good to go after a few small tweaks.
var codePoint = parseInt( escape.slice( 1 ), 16 );

if ( nonHex ) {

	// Strip the backslash prefix from a non-hex escape sequence
	return nonHex;
}

// Per the CSS spec, a NULL, surrogate, or out-of-range code point is
// replaced with the REPLACEMENT CHARACTER (U+FFFD).
if ( codePoint === 0 || codePoint > 0x10FFFF ||
	( codePoint >= 0xD800 && codePoint <= 0xDFFF ) ) {
	return "\uFFFD";
}
gzippability improvements:
- var codePoint = parseInt( escape.slice( 1 ), 16 );
- if ( nonHex ) {
- 	// Strip the backslash prefix from a non-hex escape sequence
- 	return nonHex;
- }
- // Per the CSS spec, a NULL, surrogate, or out-of-range code point is
- // replaced with the REPLACEMENT CHARACTER (U+FFFD).
- if ( codePoint === 0 || codePoint > 0x10FFFF ||
- 	( codePoint >= 0xD800 && codePoint <= 0xDFFF ) ) {
- 	return "\uFFFD";
- }
+ var codePoint = "0x" + escape.slice( 1 ) - 0;
+ if ( nonHex ) {
+ 	// Strip the backslash prefix from a non-hex escape sequence
+ 	return nonHex;
+ }
+ // Per the CSS spec, a NULL, surrogate, or out-of-range code point is
+ // replaced with the REPLACEMENT CHARACTER (U+FFFD).
+ // https://www.w3.org/TR/css-syntax-3/#consume-escaped-code-point
+ if ( !codePoint || codePoint > 0x10FFFF ||
+ 	( codePoint >= 0xD800 && codePoint < 0xE000 ) ) {
+ 	return "\uFFFD";
+ }
applied. went with the "0x" + escape.slice( 1 ) - 0 form, !codePoint, and < 0xE000, and added the spec link.
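For readers unfamiliar with the coercion trick, here is a minimal sketch of the check discussed above. The helper name `mapsToReplacement` is hypothetical and exists only for illustration; the expression inside it is the one from the suggestion. Concatenation runs first, then `- 0` coerces the resulting string to a number, which for plain hex digits should agree with `parseInt( digits, 16 )`. Note that `!codePoint` is also true for `NaN`, which is presumably harmless in the real function because the `nonHex` branch returns earlier.

```javascript
// Hypothetical helper (not part of the PR): reports whether the hex digits
// of an escape would be replaced with U+FFFD under the suggested check.
function mapsToReplacement( hexDigits ) {

	// "0x" + "D7FF" is the string "0xD7FF"; subtracting 0 coerces it to 55295.
	var codePoint = "0x" + hexDigits - 0;

	// !codePoint covers NULL (0); the rest covers out-of-range and surrogates.
	return !codePoint || codePoint > 0x10FFFF ||
		( codePoint >= 0xD800 && codePoint < 0xE000 );
}

// The coercion agrees with parseInt for plain hex digits.
var sameAsParseInt = ( "0x" + "D7FF" - 0 ) === parseInt( "D7FF", 16 );
```

Trailing whitespace in the matched escape is tolerated as well, since string-to-number coercion trims it.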
return codePoint > 0xFFFF ?
	String.fromCharCode(
		( codePoint - 0x10000 ) >> 10 | 0xD800,
		( codePoint - 0x10000 ) & 0x3FF | 0xDC00
	) :
	String.fromCharCode( codePoint );
gzippability improvements:
- return codePoint > 0xFFFF ?
- 	String.fromCharCode(
- 		( codePoint - 0x10000 ) >> 10 | 0xD800,
- 		( codePoint - 0x10000 ) & 0x3FF | 0xDC00
- 	) :
- 	String.fromCharCode( codePoint );
+ return codePoint < 0x10000 ?
+ 	String.fromCharCode( codePoint ) :
+ 	String.fromCharCode(
+ 		( codePoint - 0x10000 ) >> 10 | 0xD800,
+ 		( codePoint - 0x10000 ) & 0x3FF | 0xDC00
+ 	);
done, BMP branch first now.
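As an illustration of the surrogate-pair arithmetic in that return statement, the sketch below wraps it in a standalone function. The name `encodeUtf16` is hypothetical, and the sketch assumes the code point has already passed the validity check. The high surrogate carries the top 10 bits of `codePoint - 0x10000` and the low surrogate the bottom 10 bits.

```javascript
// Hypothetical wrapper around the return expression from the review thread.
// Assumes codePoint is a valid, non-surrogate value no greater than 0x10FFFF.
function encodeUtf16( codePoint ) {
	return codePoint < 0x10000 ?

		// BMP code point: a single UTF-16 code unit
		String.fromCharCode( codePoint ) :

		// Supplemental plane: high surrogate, then low surrogate
		String.fromCharCode(
			( codePoint - 0x10000 ) >> 10 | 0xD800,
			( codePoint - 0x10000 ) & 0x3FF | 0xDC00
		);
}

// U+10FFFF is offset 0xFFFFF, so both halves are 0x3FF: "\uDBFF\uDFFF".
var maxCodePoint = encodeUtf16( 0x10FFFF );
```

In environments that provide `String.fromCodePoint`, the result should match it for valid input, which makes for a convenient cross-check.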
	"Long numeric escape (non-BMP)" );
} );

QUnit.test( "attributes - invalid escaped code points", function( assert ) {
Let's also include a test case with complete and off-by-one coverage, e.g. that [data-attr='\0 \1 \D7FF \D800 \DFFF \E000 \10FFFF \110000'] matches an element with attribute value "\uFFFD\u0001\uD7FF\uFFFD\uFFFD\uE000\uDBFF\uDFFF\uFFFD".
added it as a fifth assertion: seeded an element with value "\uFFFD\u0001\uD7FF\uFFFD\uFFFD\uE000\uDBFF\uDFFF\uFFFD" and matched it with the full \0 \1 \D7FF \D800 \DFFF \E000 \10FFFF \110000 list, so both sides of each boundary are covered.
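To see why the boundary list yields that attribute value, here is a self-contained sketch. It is an independent stand-in, not jQuery's implementation: `unescapeCss` and its regular expression are assumptions for illustration, and it uses `String.fromCodePoint` rather than the manual surrogate arithmetic. The key detail is that the single whitespace character after each hex escape is consumed as part of the escape, so the decoded pieces end up adjacent.

```javascript
// Hypothetical decoder used only to check the expected value by hand.
function unescapeCss( value ) {

	// A hex escape is 1-6 hex digits plus one optional whitespace character;
	// any other escaped character stands for itself.
	return value.replace(
		/\\([0-9a-fA-F]{1,6})[ \t\r\n\f]?|\\([^\r\n\f])/g,
		function( match, hex, nonHex ) {
			if ( nonHex ) {
				return nonHex;
			}
			var codePoint = parseInt( hex, 16 );
			var invalid = codePoint === 0 || codePoint > 0x10FFFF ||
				( codePoint >= 0xD800 && codePoint <= 0xDFFF );
			return invalid ? "\uFFFD" : String.fromCodePoint( codePoint );
		}
	);
}

// The selector value from the review comment, written as a JS string literal.
var input = "\\0 \\1 \\D7FF \\D800 \\DFFF \\E000 \\10FFFF \\110000";
var expected = "\uFFFD\u0001\uD7FF\uFFFD\uFFFD\uE000\uDBFF\uDFFF\uFFFD";
var matches = unescapeCss( input ) === expected;
```

Each invalid value (\0, \D800, \DFFF, \110000) sits next to a valid neighbor (\1, \D7FF, \E000, \10FFFF), which is what gives the assertion its off-by-one coverage.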
52d7b8a to 694ba83
unescapeSelector incorrectly decodes CSS hex escapes that the spec maps to U+FFFD. Return U+FFFD for null, surrogate, and out-of-range escapes. Reachable via the seeded .filter() matcher; test added alongside the others.