Check URI Scheme validity according to RFC3986#8090

Open
GwynethLlewelyn wants to merge 3 commits into gogs:main from GwynethLlewelyn:add-xmpp

@GwynethLlewelyn

Issue description

XMPP URI Schemes were defined and standardised by RFC5122, duly registered with IANA since 2008 (with permanent status), and fully conforming with the generic URI Scheme as defined by RFC3986, Section 3.1.

However, the current code base only checks for a subset of possible valid URIs using the regular expression:

^[a-z][\w-]+://|^mailto:

This ad hoc implementation unfortunately does not conform with the URI Schemes specification.

Most notably, it excludes all URIs without an explicit authority element (i.e., the part that follows //). But such URIs are perfectly valid; in fact, the regular expression above already needs a special-case alternative for mailto:, which by convention skips the authority element, as do many other schemes, including XMPP.

Furthermore, because it uses \w, the regex accepts uppercase letters after the first character (URI Schemes are not case sensitive, although all RFCs strongly recommend that canonicalised URIs should be lowercase only) and also accepts the underscore character _, which the RFCs specifying URI Schemes do not allow. As a result, (imaginary) schemes such as fake_scheme://this.blows.up.everything are accepted by the regular expression in the code, although they are invalid according to the RFCs; on the other hand, perfectly valid URIs such as tel:+1-555-1234 or xmpp:[email protected] are rejected (both schemes, incidentally, have long been registered with IANA).
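To make the behaviour described above concrete, here is a minimal sketch that runs the current expression against a few sample links. The helper name hasKnownScheme and the example.com addresses are placeholders chosen for illustration only; they are not taken from the code base.

```go
package main

import (
	"fmt"
	"regexp"
)

// currentPattern is the expression quoted above from the existing code.
var currentPattern = regexp.MustCompile(`^[a-z][\w-]+://|^mailto:`)

// hasKnownScheme is a hypothetical helper name used only in this sketch.
func hasKnownScheme(link string) bool {
	return currentPattern.MatchString(link)
}

func main() {
	samples := []string{
		"https://example.com",                    // authority form
		"mailto:user@example.com",                // covered by the special case
		"fake_scheme://this.blows.up.everything", // underscore is not a valid scheme character
		"tel:+1-555-1234",                        // valid scheme, no authority element
		"xmpp:user@example.com",                  // valid scheme, placeholder address
	}
	for _, s := range samples {
		fmt.Printf("%-42s accepted=%v\n", s, hasKnownScheme(s))
	}
}
```

With this input, the first three links are accepted (including the invalid fake_scheme one), while the tel: and xmpp: links are rejected.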

As such, this simple PR proposes to correct the regular expression so that it conforms to the actual rules specified by the RFCs, accepting all IANA-registered schemes, while at the same time guaranteeing that invalid schemes are properly rejected.

This PR closes #6732.

(Special thanks to @Neustradamus for first reporting this issue exactly four years ago and having patiently waited for a solution since then!)

Checklist

  • I agree to follow the Code of Conduct by submitting this pull request.
  • I have read and acknowledge the Contributing guide.
  • I have added test cases to cover the new code or have provided the test plan. (if applicable)
  • I have added an entry to CHANGELOG. (if applicable)

Proposed solution and test plan

The ABNF specifying all valid URI Schemes is:

scheme        = alpha *( alpha | digit | "+" | "-" | "." )

References:

This corresponds to the regular expression [1]:

^[a-z][a-z\-\+.]+:(?:\/\/)? 

which was tested on Regex101: https://regex101.com/r/HEpzIM/1
and the Go Playground: https://go.dev/play/p/vtYEugsNAfo

No further changes beyond changing the existing regular expression in markdown.go are necessary to fully validate URI Schemes according to RFC3986.
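As a rough sanity check, the sketch below runs the proposed expression against a handful of positive and negative samples. For comparison it also includes abnfPattern, which is an assumption of this sketch and not part of the PR: a literal, lowercase-only reading of the ABNF that also admits digits and zero or more trailing characters. The sample addresses are placeholders.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// proposedPattern is the expression quoted in this description.
	proposedPattern = regexp.MustCompile(`^[a-z][a-z\-\+.]+:(?:\/\/)?`)

	// abnfPattern is an assumption for comparison only: the ABNF read
	// literally, restricted to lowercase, with digits allowed.
	abnfPattern = regexp.MustCompile(`^[a-z][a-z0-9+.-]*:`)
)

func main() {
	samples := []string{
		"https://example.com",
		"mailto:user@example.com",
		"tel:+1-555-1234",
		"xmpp:user@example.com",
		"h323:user@example.com",                  // scheme name contains digits
		"fake_scheme://this.blows.up.everything", // underscore is invalid
		"not a url",
	}
	for _, s := range samples {
		fmt.Printf("%-42s proposed=%-5v abnf=%v\n",
			s, proposedPattern.MatchString(s), abnfPattern.MatchString(s))
	}
}
```

Both patterns accept the https, mailto, tel and xmpp samples and reject fake_scheme and the plain-text string. They differ on h323: the expression as quoted has no digit class, so a scheme name containing digits is matched only by the comparison pattern. Whether digits should be added to the character class is left open here.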

Footnotes

  1. Note that, although the specifications allow a mix of lowercase and uppercase US-ASCII letters, they also strongly recommend that URIs be canonicalised in lowercase, which is assumed here to be the case.

@unknwon
Member

unknwon commented Jan 17, 2026

Hey @GwynethLlewelyn, thanks for the PR!

Could you:

  1. Add test cases to sanity-check that the regex actually matches all of them, plus some counter-cases to make sure it doesn't over-match obviously non-URL strings.
  2. I think this PR is worth a CHANGELOG entry.

Add Nova and IDEA editor configurations to .gitignore
Member

@unknwon unknwon left a comment


See above comment.



Development

Successfully merging this pull request may close these issues.

Add XMPP URI support (RFC5122)

2 participants