Percent-decoding does not accept non-ASCII octets #118

@mtrenkmann

When building a network::uri object using network::uri_builder, query parameters that contain non-ASCII multibyte characters (e.g. UTF-8 encoded text) are percent-encoded as expected. For example, http://example.com/?q=법정동 becomes http://example.com/?q=%EB%B2%95%EC%A0%95%EB%8F%99.

However, in the other direction, applying network::uri::decode to the encoded query parameter throws a percent_decoding_error exception. I think this behavior is incorrect: according to RFC 3986 section 2.5, percent-encoding and decoding operate at the octet level and should otherwise be agnostic about character encodings.

Suggested fix in network/uri/detail/decode.hpp:

-  if (h0 >= '8') {
-    // unable to decode characters outside the ASCII character set.
-    throw percent_decoding_error(uri_error::conversion_failed);
-  }
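
For context, the removed check rejects any triplet whose first hex digit is '8' or higher, i.e. any octet of 0x80 and above, which covers every byte of a multibyte UTF-8 sequence. The sketch below illustrates what octet-level decoding looks like without that restriction. It is a standalone illustration, not the code in network/uri/detail/decode.hpp: the names hex_value and percent_decode are hypothetical, and it reports errors with std::invalid_argument rather than the library's percent_decoding_error.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of the library): returns the value of one
// hex digit, or -1 if the character is not a hex digit.
inline int hex_value(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

// Hypothetical standalone decoder: turns every "%XX" triplet into the octet
// 0xXX and copies all other characters unchanged. Octets >= 0x80 are
// accepted, so the result is just a byte sequence in whatever character
// encoding the producer used.
inline std::string percent_decode(const std::string& in) {
  std::string out;
  out.reserve(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    if (in[i] != '%') {
      out.push_back(in[i]);
      continue;
    }
    // A '%' must be followed by exactly two hex digits.
    if (in.size() - i < 3) {
      throw std::invalid_argument("truncated percent-encoded triplet");
    }
    const int hi = hex_value(in[i + 1]);
    const int lo = hex_value(in[i + 2]);
    if (hi < 0 || lo < 0) {
      throw std::invalid_argument("non-hex digit in percent-encoded triplet");
    }
    out.push_back(static_cast<char>(static_cast<unsigned char>((hi << 4) | lo)));
    i += 2;  // skip the two hex digits that were just consumed
  }
  return out;
}
```

With this approach, percent_decode("%EB%B2%95%EC%A0%95%EB%8F%99") yields the nine octets EB B2 95 EC A0 95 EB 8F 99, which a UTF-8 aware consumer reads as 법정동. Interpreting the bytes is left to the caller, which matches the octet-level view described above.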

Unit tests for reproduction:

  • Percent-encoding a UTF-8 query parameter works
  • Percent-decoding a UTF-8 query parameter does not work
TEST(UriBuilderTest, PercentEncodingAcceptsNonAsciiOctets) {
  const std::string decoded = u8"법정동";
  const std::string encoded = "%EB%B2%95%EC%A0%95%EB%8F%99";

  network::uri_builder ub(network::uri("http://example.com"));
  ASSERT_NO_THROW(ub.append_query_key_value_pair("q", decoded));

  const network::uri uri = ub.uri();
  ASSERT_EQ(network::string_view(encoded), uri.query_begin()->second);
}

TEST(UriDecodeTest, PercentDecodingAcceptsNonAsciiOctets) {
  const std::string decoded = u8"법정동";
  const std::string encoded = "%EB%B2%95%EC%A0%95%EB%8F%99";

  std::string output;
  ASSERT_NO_THROW(network::uri::decode(encoded.begin(), encoded.end(),
                                       std::back_inserter(output)));
  ASSERT_EQ(decoded, output);
}

Output:

[==========] Running 2 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 1 test from UriBuilderTest
[ RUN      ] UriBuilderTest.PercentEncodingAcceptsNonAsciiOctets
[       OK ] UriBuilderTest.PercentEncodingAcceptsNonAsciiOctets (0 ms)
[----------] 1 test from UriBuilderTest (1 ms total)

[----------] 1 test from UriDecodeTest
[ RUN      ] UriDecodeTest.PercentDecodingAcceptsNonAsciiOctets
src/uri_test.cc:53: Failure
Expected: network::uri::decode(encoded.begin(), encoded.end(), std::back_inserter(output)) doesn't throw an exception.
  Actual: it throws.
[  FAILED  ] UriDecodeTest.PercentDecodingAcceptsNonAsciiOctets (0 ms)
[----------] 1 test from UriDecodeTest (0 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 2 test cases ran. (1 ms total)
[  PASSED  ] 1 test.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] UriDecodeTest.PercentDecodingAcceptsNonAsciiOctets

 1 FAILED TEST
