[libc++][test] Don't pass ill-formed UTF-8 to MAKE_STRING_VIEW #136403
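@llvm/pr-subscribers-libcxx
Author: S. B. Tam (cpplearner)
Changes
The tests `escaped_output.unicode.pass.cpp` and `fill.unicode.pass.cpp` use `SV` (which expands to `MAKE_STRING_VIEW`) to create a string view of `CharT`. `MAKE_STRING_VIEW` internally creates a u8 string literal, which is potentially non-portable when there's a numeric escape sequence (see CWG 1656). Latest MSVC preview (v17.14.0-pre.3.0) produces warning C5321 for this.
These tests don't actually need to produce a u8 string literal. (In fact, the affected lines are exercised only if `CharT` is `char`.) It seems possible to simply avoid `SV` in these places.
Full diff: https://github.com/llvm/llvm-project/pull/136403.diff
2 Files Affected: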
diff --git a/libcxx/test/std/utilities/format/format.functions/escaped_output.unicode.pass.cpp b/libcxx/test/std/utilities/format/format.functions/escaped_output.unicode.pass.cpp
index c4adf601c40af..eb27c70954664 100644
--- a/libcxx/test/std/utilities/format/format.functions/escaped_output.unicode.pass.cpp
+++ b/libcxx/test/std/utilities/format/format.functions/escaped_output.unicode.pass.cpp
@@ -337,7 +337,7 @@ void test_string() {
// Ill-formed
if constexpr (sizeof(CharT) == 1)
- test_format(SV(R"("\x{80}")"), SV("{:?}"), SV("\x80"));
+ test_format(SV(R"("\x{80}")"), SV("{:?}"), "\x80");
// *** P2713R1 examples ***
test_format(SV(R"(["\u{301}"])"), SV("[{:?}]"), SV("\u0301"));
diff --git a/libcxx/test/std/utilities/format/format.functions/fill.unicode.pass.cpp b/libcxx/test/std/utilities/format/format.functions/fill.unicode.pass.cpp
index cd555e1ab9ce8..76f756ae91483 100644
--- a/libcxx/test/std/utilities/format/format.functions/fill.unicode.pass.cpp
+++ b/libcxx/test/std/utilities/format/format.functions/fill.unicode.pass.cpp
@@ -75,30 +75,40 @@ void test() {
// Invalid Unicode Scalar Values
if constexpr (std::same_as<CharT, char>) {
- check_exception("The format specifier contains malformed Unicode characters", SV("{:\xed\xa0\x80^}"), 42); // U+D800
- check_exception("The format specifier contains malformed Unicode characters", SV("{:\xed\xa0\xbf^}"), 42); // U+DBFF
- check_exception("The format specifier contains malformed Unicode characters", SV("{:\xed\xbf\x80^}"), 42); // U+DC00
- check_exception("The format specifier contains malformed Unicode characters", SV("{:\xed\xbf\xbf^}"), 42); // U+DFFF
+ check_exception("The format specifier contains malformed Unicode characters",
+ std::string_view{"{:\xed\xa0\x80^}"},
+ 42); // U+D800
+ check_exception("The format specifier contains malformed Unicode characters",
+ std::string_view{"{:\xed\xa0\xbf^}"},
+ 42); // U+DBFF
+ check_exception("The format specifier contains malformed Unicode characters",
+ std::string_view{"{:\xed\xbf\x80^}"},
+ 42); // U+DC00
+ check_exception("The format specifier contains malformed Unicode characters",
+ std::string_view{"{:\xed\xbf\xbf^}"},
+ 42); // U+DFFF
- check_exception(
- "The format specifier contains malformed Unicode characters", SV("{:\xf4\x90\x80\x80^}"), 42); // U+110000
- check_exception(
- "The format specifier contains malformed Unicode characters", SV("{:\xf4\x90\xbf\xbf^}"), 42); // U+11FFFF
+ check_exception("The format specifier contains malformed Unicode characters",
+ std::string_view{"{:\xf4\x90\x80\x80^}"},
+ 42); // U+110000
+ check_exception("The format specifier contains malformed Unicode characters",
+ std::string_view{"{:\xf4\x90\xbf\xbf^}"},
+ 42); // U+11FFFF
check_exception("The format specifier contains malformed Unicode characters",
- SV("{:\x80^}"),
+ std::string_view{"{:\x80^}"},
42); // Trailing code unit with no leading one.
check_exception("The format specifier contains malformed Unicode characters",
- SV("{:\xc0^}"),
+ std::string_view{"{:\xc0^}"},
42); // Missing trailing code unit.
check_exception("The format specifier contains malformed Unicode characters",
- SV("{:\xe0\x80^}"),
+ std::string_view{"{:\xe0\x80^}"},
42); // Missing trailing code unit.
check_exception("The format specifier contains malformed Unicode characters",
- SV("{:\xf0\x80^}"),
+ std::string_view{"{:\xf0\x80^}"},
42); // Missing two trailing code units.
check_exception("The format specifier contains malformed Unicode characters",
- SV("{:\xf0\x80\x80^}"),
+ std::string_view{"{:\xf0\x80\x80^}"},
42); // Missing trailing code unit.
#ifndef TEST_HAS_NO_WIDE_CHARACTERS
LGTM!
Although the PR description looks a bit outdated. CWG1656 has been resolved by P2029R4, and it is now specified that each of `\x80`..`\xff` in a u8 string literal produces exactly one `char8_t` array element. Before P2029R4 such use seemed to be ill-formed, but old versions of compilers used to silently accept it with different meanings.
This looks sensible, but I'd really like to have @mordante's input on this before merging.
ping @mordante
The tests `escaped_output.unicode.pass.cpp` and `fill.unicode.pass.cpp` use `SV` (which expands to `MAKE_STRING_VIEW`) to create a string view of `CharT`. `MAKE_STRING_VIEW` internally creates a u8 string literal, which is potentially non-portable when there's a numeric escape sequence (see CWG 1656). Latest MSVC preview (v17.14.0-pre.3.0) produces warning C5321 for this.

These tests don't actually need to produce a u8 string literal. (In fact, the affected lines are exercised only if `CharT` is `char`.) It seems possible to simply avoid `SV` in these places.
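The pattern can be sketched as follows, assuming a test templated on `CharT`. The function `ill_formed_first_byte` is a hypothetical stand-in, not the actual test code: inside a branch that is only taken for `char`, a plain narrow literal wrapped in `std::string_view` carries the ill-formed byte without going through a u8 literal.

```cpp
#include <cassert>
#include <string_view>
#include <type_traits>

// Hypothetical stand-in for a templated test body. Returns the first byte of
// the ill-formed input when CharT is char, and -1 for other character types.
template <class CharT>
constexpr int ill_formed_first_byte() {
  if constexpr (std::is_same_v<CharT, char>) {
    // Plain narrow literal: no u8 prefix, so no dependence on how the
    // compiler treats numeric escapes in u8 string literals.
    constexpr std::string_view input{"\x80"};
    static_assert(input.size() == 1, "exactly one byte");
    return static_cast<unsigned char>(input[0]);
  } else {
    return -1; // the ill-formed UTF-8 case is not exercised for wide CharT
  }
}

static_assert(ill_formed_first_byte<char>() == 0x80, "byte is preserved");
static_assert(ill_formed_first_byte<wchar_t>() == -1, "branch is skipped");
```

Because the branch is discarded for other character types, the narrow literal never needs to be converted to `CharT`.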