Broken Links
What are Broken Links?
To start with, a link is an HTML element that lets users navigate from one web page to another by clicking on it.
It is the means of moving between different web pages on the internet.
A broken link, also often called a dead link, is one that does not work, i.e. it does not lead to the web page it is
meant to. This usually occurs because the website or the particular web page is down or does not exist. When someone
clicks on a broken link, an error message is displayed.
Broken links may also be caused by a server error, which prevents the corresponding page from being displayed.
A request to a valid URL returns a 2xx HTTP status code. Broken links, which are essentially failed HTTP
requests, return 4xx or 5xx status codes.
A 4xx status code indicates a client-side error, while a 5xx status code indicates a server-side error.
HTTP Status Code                Definition
400 (Bad Request)               Server is unable to process the request because the URL is incorrect
400 (Bad Request – Bad Host)    Server is unable to process the request because the host name is invalid
400 (Bad Request – Bad URL)     Server cannot process the request because the URL is malformed,
                                e.g. missing characters such as brackets or slashes
400 (Bad Request – Empty)       Response returned by the server is empty, with no content
                                and no response code
400 (Bad Request – Timeout)     HTTP request has timed out
400 (Bad Request – Reset)       Server is unable to process the request because it is busy
                                processing other requests or has been misconfigured by
                                the site owner
404 (Page Not Found)            Page is not available on the server
403 (Forbidden)                 Server understood the request but refuses to authorize it
410 (Gone)                      Page is gone. This code is more permanent than 404
408 (Request Time Out)          Server has timed out waiting for the request
503 (Service Unavailable)       Server is temporarily overloaded and cannot process the
                                request
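The status classes described above can be expressed as a small helper. The following is a minimal Python sketch, not a definitive implementation: the function names classify_status, is_broken, and describe are hypothetical, and treating a missing response (None) as broken is an assumption made here to cover cases such as timeouts, resets, and invalid host names. Redirects (3xx) are assumed not to be broken.

```python
from http import HTTPStatus
from typing import Optional


def classify_status(code: Optional[int]) -> str:
    """Map an HTTP status code to a coarse class."""
    if code is None:
        return "no response"      # assumption: timeout, reset, bad host, etc.
    if 200 <= code < 300:
        return "valid"
    if 300 <= code < 400:
        return "redirect"
    if 400 <= code < 500:
        return "client error"
    if 500 <= code < 600:
        return "server error"
    return "unknown"


def is_broken(code: Optional[int]) -> bool:
    """A link counts as broken on a 4xx/5xx code or when no response arrived."""
    return code is None or code >= 400


def describe(code: int) -> str:
    """Return the code with its standard reason phrase, if Python knows one."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "Unknown"
    return f"{code} {phrase}"
```

For example, classify_status(404) yields "client error" and classify_status(503) yields "server error", matching the 4xx and 5xx split described earlier.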
How to identify broken links in Selenium WebDriver
Checking for broken links with Selenium is straightforward. On a web page, hyperlinks are implemented using the
HTML anchor (<a>) tag. All the script needs to do is locate every anchor tag on the page, get the
corresponding URLs, and request each one to check whether it is broken.
Use the following steps to identify broken links in Selenium:
1. Collect all the links present on a web page based on the <a> tag
2. Send an HTTP request for each link
3. Verify the HTTP response code
4. Determine whether the link is valid or broken based on the HTTP response code
5. Repeat steps 2 to 4 for every link collected in step 1
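The steps above can be sketched in Python roughly as follows. This is an illustration under stated assumptions rather than a reference implementation: the function names http_status and find_broken_links are hypothetical, the driver argument is assumed to be a Selenium WebDriver that has already loaded the page, the HTTP requests are sent with the standard library's urllib rather than through Selenium, and only http and https links are checked. The locator string "tag name" is the value of Selenium's By.TAG_NAME constant.

```python
import http.client
import urllib.error
import urllib.request
from typing import Callable, Dict, Optional


def http_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Send a HEAD request; return the status code, or None if no response arrived."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code   # urllib raises 4xx/5xx responses as HTTPError
    except (OSError, ValueError, http.client.HTTPException):
        return None       # bad host, timeout, reset, malformed URL, ...


def find_broken_links(
    driver,
    get_status: Callable[[str], Optional[int]] = http_status,
) -> Dict[str, Optional[int]]:
    """Return {url: status} for every broken link on the driver's current page."""
    # Step 1: collect every <a> element and read its href.
    anchors = driver.find_elements("tag name", "a")
    urls = {anchor.get_attribute("href") for anchor in anchors}

    broken: Dict[str, Optional[int]] = {}
    # Step 5: repeat steps 2 to 4 for each distinct http(s) URL.
    for url in sorted(u for u in urls if u and u.startswith(("http://", "https://"))):
        code = get_status(url)             # steps 2 and 3: request and read the code
        if code is None or code >= 400:    # step 4: 4xx/5xx or no response is broken
            broken[url] = code
    return broken
```

With a real browser, this might be used by creating a driver, calling driver.get(...) on the page under test, and then passing the driver to find_broken_links. Some servers reject HEAD requests, so a fallback to GET may be needed in practice.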