test: add exhaustive uri format cases across drafts #895

Open
AcEKaycgR wants to merge 1 commit into json-schema-org:main from AcEKaycgR:exhaustive-uri-suite

Conversation

@AcEKaycgR
Contributor

This PR expands the uri format test suite across all drafts. The expansion provides complete coverage of the RFC 3986 ABNF and prose constraints. Every test was traced directly against RFC 3986, not derived from implementation output.

Technical Coverage Details

The new cases cover every structural dimension of the grammar:

  • Scheme: Valid charset, first-character rule, upper and mixed case valid (ALPHA covers %x41-5A / %x61-7A), rejection of invalid characters (~, =, /, space), %-encoding forbidden in scheme, empty-before-colon, and missing colon.
  • Percent-encoding: Valid uppercase/lowercase/mixed HEXDIG triplets, double-encoding, coverage in all six URI components, rejection of non-HEXDIG, lone %, incomplete triplets, and %-encoding forbidden in port.
  • Userinfo: Sub-delimiters and unreserved characters, colon-delimited forms, empty userinfo, strict first-@ split (guards against last-@ scanning bugs), slash and ? boundary tests.
  • Host: reg-name fallback for malformed IPv4 per §3.2.2; floor and ceiling values for all five dec-octet alternatives (0–9, 10–99, 100–199, 200–249, 250–255); full/compressed/loopback/embedded-IPv4 IPv6 forms; trailing :: form; IPvFuture with uppercase V; rejection of two ::, >4 hex digits in h16, wrong group counts, malformed IPvFuture, and non-ASCII in host.
  • Port: *DIGIT - empty port, leading zeros, no upper bound, rejection of +, -, space, ., alpha characters, pct-encoded digits, and Unicode decimal digits.
  • Path: All five path forms; full pchar charset; rejection of gen-delims and forbidden characters ({, }, ^, `, |, \, ", <, >), control characters, and non-ASCII; dot and double-dot segments.
  • Query / Fragment: pchar + / + ?; at-sign valid in both; rejection of [, ], {, ^, space, and second # in fragment; pct-encoding valid/invalid; empty and absent forms.
  • URI vs URI-reference: Protocol-relative (//) and path-only relative-refs are correctly rejected; format: "uri" requires a URI with a scheme, not a relative reference.
  • Whitespace / composites: Empty string, single space/tab/newline, leading/trailing whitespace, embedded control characters, trailing garbage, and a 2084-character URI (RFC 3986 has no length limit).
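
To make a few of the productions above concrete, here is a minimal sketch of four of them as Python regular expressions. It is not the code used to build the test vectors; the names `SCHEME`, `DEC_OCTET`, `PORT`, `PCT_ENCODED`, and `matches` are illustrative assumptions. Note the explicit `[0-9]` classes: in Python, `\d` also matches Unicode decimal digits, which is exactly the kind of gap the port tests are meant to catch.

```python
import re

# RFC 3986 section 3.1: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")

# RFC 3986 section 3.2.2: the five dec-octet alternatives
# (0-9, 10-99, 100-199, 200-249, 250-255); leading zeros do not match.
DEC_OCTET = re.compile(r"25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9]")

# RFC 3986 section 3.2.3: port = *DIGIT (ASCII digits only, may be empty,
# no upper bound, leading zeros allowed)
PORT = re.compile(r"[0-9]*")

# RFC 3986 section 2.1: pct-encoded = "%" HEXDIG HEXDIG (either case)
PCT_ENCODED = re.compile(r"%[0-9A-Fa-f]{2}")


def matches(pattern: re.Pattern, text: str) -> bool:
    """Return True only if the whole string matches the production."""
    # fullmatch avoids the trailing-newline leniency of "$".
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    for label, pattern, sample in [
        ("scheme", SCHEME, "HTTP"),
        ("scheme", SCHEME, "1http"),
        ("dec-octet", DEC_OCTET, "256"),
        ("port", PORT, ""),
        ("port", PORT, "\u0663"),  # ARABIC-INDIC DIGIT THREE
        ("pct-encoded", PCT_ENCODED, "%G0"),
    ]:
        print(label, repr(sample), matches(pattern, sample))
```

These patterns check single components in isolation; a full validator would still need to split the string into components first (for example, splitting userinfo at the first @).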

Standards & Traceability

Following the style of ipv4.json, non-obvious tests carry a comment field mentioning the relevant RFC 3986 section and ABNF production.
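
For readers unfamiliar with the suite's layout, the sketch below builds one test group with `comment` fields and prints it as JSON. The two cases and their wording are hypothetical illustrations, not copied from this PR; only the general shape (`description`, `schema`, `tests`, and per-test `description` / `comment` / `data` / `valid`) follows the suite's conventions.

```python
import json

# Hypothetical excerpt in the test suite's group/test shape.
group = {
    "description": "validation of URIs (illustrative excerpt)",
    "schema": {
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "format": "uri",
    },
    "tests": [
        {
            "description": "out-of-range IPv4 octet falls back to reg-name",
            "comment": "RFC 3986 section 3.2.2: host = IP-literal / IPv4address / reg-name",
            "data": "http://256.0.0.1/",
            "valid": True,
        },
        {
            "description": "percent-encoding is not allowed in scheme",
            "comment": 'RFC 3986 section 3.1: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )',
            "data": "ht%74p://example.com",
            "valid": False,
        },
    ],
}

# Each suite file is a JSON array of such groups.
print(json.dumps([group], indent=4))
```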

Triangulation note: Spot-checked against ajv-formats and python-jsonschema-format. All mismatches reflect implementation gaps against strict RFC 3986 ABNF, not errors in the test vectors.
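
The kind of spot-check described above can be sketched as a small harness that runs vectors through a checker and reports disagreements. This is an assumption about the general approach, not the script actually used: `VECTORS`, `lenient_checker`, and `find_mismatches` are hypothetical names, and `lenient_checker` is a deliberately naive stand-in rather than ajv-formats or any real library.

```python
from typing import Callable

# Hypothetical vectors in the suite's test shape; not copied from the PR.
VECTORS = [
    {"description": "scheme may be upper case",
     "data": "HTTP://example.com", "valid": True},
    {"description": "scheme must start with ALPHA",
     "data": "1http://example.com", "valid": False},
    {"description": "relative reference is not a URI",
     "data": "//example.com/path", "valid": False},
]


def lenient_checker(value: str) -> bool:
    """Stand-in for a permissive implementation: accepts anything with a colon."""
    return ":" in value


def find_mismatches(vectors: list, checker: Callable[[str], bool]) -> list:
    """Return descriptions of vectors where the checker disagrees with 'valid'."""
    return [v["description"] for v in vectors if checker(v["data"]) != v["valid"]]


print(find_mismatches(VECTORS, lenient_checker))
# prints ['scheme must start with ALPHA']
```

Each reported mismatch still has to be traced back to the RFC by hand to decide whether the vector or the implementation is at fault.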

Feedback appreciated @jviotti @jdesrosiers @karenetheridge

@AcEKaycgR AcEKaycgR requested a review from a team as a code owner April 19, 2026 09:18
@AcEKaycgR AcEKaycgR force-pushed the exhaustive-uri-suite branch from f4660d1 to 9407fb8 Compare April 19, 2026 09:26
Member

@jviotti jviotti left a comment

I spent some time today on this. Lots of tests, but as far as I could tell, all of them are RFC-compliant. I couldn't catch anything non-compliant, though I would love reviews from other TSC members knowledgeable in URIs, like @jdesrosiers (as you also have your own URI implementation).

Given the low coverage we have on these things, I think this is a great addition, and we can always continue incrementally polishing it with more edge cases.

@AcEKaycgR
Contributor Author

AcEKaycgR commented Apr 20, 2026

Thanks @jviotti! I will keep looking for more edge cases and open PRs as things surface. I am also starting to work through the other formats in parallel.
