Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Conversation

@emazack
Copy link

@emazack emazack commented Oct 16, 2025

Closes #12008

Summary

This PR fixes a bug in the canonical audit where some syntactically invalid URLs (e.g., those with a space in the host like https:// example.com) were incorrectly being classified as relative URLs.

The previous validation logic was insufficient because it didn't correctly handle cases where the browser's link parser would return a non-null, but still malformed, href string.

This change replaces the old logic with a more robust, two-step validation process inside a nested try...catch block.

  1. It first checks if the hrefRaw is syntactically valid at all by attempting to construct a URL with a base. If this fails, the URL is correctly marked as invalid.
  2. If the first check passes, it then attempts to construct a URL without a base. If this second check fails, the URL is correctly marked as relative.
  3. If both checks pass, the URL is correctly identified as a valid, absolute URL.

To support this fix, the existing unit tests have been improved:

  • The test for invalid URLs, which was passing for the wrong reasons, has been updated to correctly expose the bug.
  • A new test case has been added to explicitly verify the href: null scenario, improving the overall robustness of the test suite.

All related unit tests now pass with these changes.

@emazack emazack requested a review from a team as a code owner October 16, 2025 15:30
@emazack emazack requested review from paulirish and removed request for a team October 16, 2025 15:30
@google-cla
Copy link

google-cla bot commented Oct 16, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@emazack emazack changed the title fix(seo): improve validation for invalid canonical URLs core(seo): improve validation for invalid canonical URLs Oct 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Canonical URL audit mistakes invalid URLs for relative ones

2 participants