Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Automated link-checker for system testing [closed]

I often have to work with fragile legacy websites that break in unexpected ways when logic or configuration are updated.

I don't have the time or knowledge of the system needed to create a Selenium script. Besides, I don't want to check a specific use case - I want to verify every link and page on the site.

I would like to create an automated system test that will spider through a site and check for broken links and crashes. Ideally, there would be a tool that I could use to achieve this. It should have as many as possible of the following features, in descending order of priority:

  • Triggered via script
  • Does not require human interaction
  • Follows all links including anchor tags and links to CSS and js files
  • Produces a log of all found 404s, 500s etc.
  • Can be deployed locally to check sites on intranets
  • Supports cookie/form-based authentication
  • Free/Open source

There are many partial solutions out there, like FitNesse, Firefox's LinkChecker and the W3C link checker, but none of them do everything I need.

I would like to use this test with projects using a range of technologies and platforms, so the more portable the solution the better.

I realise this is no substitute for proper system testing, but it would be very useful if I had a convenient and automatable way of verifying that no part of the site was obviously broken.

like image 285
ctford Avatar asked Oct 20 '09 18:10

ctford


People also ask

How do I check for dead links?

First, log into your Google Analytics account and click on the Behavior tab. Then select “Site Content” and then “All Pages.” Make sure to set the evaluation period for the amount of time you want to look at. If you check for broken links monthly, set the period for the month since your last check.

What is link validation in testing?

Link validation pings the destination of a URL and tests for errors. This helps avoid broken and invalid links in your published document, and is especially useful for bloggers.


1 Answers

We use and really like Linkchecker:

http://wummel.github.io/linkchecker/

It's open-source, Python, command-line, internally deployable, and outputs to a variety of formats. The developer has been very helpful when we've contacted him with issues.

We have a Ruby script that queries our database of internal websites, kicks off LinkChecker with appropriate parameters for each site, and parses the XML that LinkChecker gives us to create a custom error report for each site in our CMS.

like image 120
Sean McMains Avatar answered Nov 18 '22 16:11

Sean McMains