I need to write a tool that will report broken URL's in C#. The URL should only reports broken if the user see's a 404 Error in the browser. I believe there might be tricks to handle web servers that do URL re-writing. Here's what I have. As you can see only some URL validate incorrectly.
string url = "";
// TEST CASES
//url = "http://newsroom.lds.org/ldsnewsroom/eng/news-releases-stories/local-churches-teach-how-to-plan-for-disasters"; //Prints "BROKEN", although this is getting re-written to good url below.
//url = "http://beta-newsroom.lds.org/article/local-churches-teach-how-to-plan-for-disasters"; // Prints "GOOD"
//url = "http://"; //Prints "BROKEN"
//url = "google.com"; //Prints "BROKEN" althought this should be good.
//url = "www.google.com"; //Prints "BROKEN" althought this should be good.
//url = "http://www.google.com"; //Prints "GOOD"
try
{
if (url != "")
{
WebRequest Irequest = WebRequest.Create(url);
WebResponse Iresponse = Irequest.GetResponse();
if (Iresponse != null)
{
_txbl.Text = "GOOD";
}
}
}
catch (Exception ex)
{
_txbl.Text = "BROKEN";
}
You can use the URLConstructor to check if a string is a valid URL. URLConstructor ( new URL(url) ) returns a newly created URL object defined by the URL parameters. A JavaScript TypeError exception is thrown if the given URL is not valid.
A typical URL could have the form http://www.example.com/index.html , which indicates a protocol ( http ), a hostname ( www.example.com ), and a file name ( index. html ).
Other easy way is use Node. JS DNS module. The DNS module provides a way of performing name resolutions, and with it you can verify if the url is valid or not.
Use URLUtil to validate the URL as below. It will return True if URL is valid and false if URL is invalid.
For one, Irequest
and Iresponse
shouldn't be named like that. They should just be webRequest
and webResponse
, or even just request
and response
. The capital "I" prefix is generally only used for interface naming, not for instance variables.
To do your URL validity checking, use UriBuilder
to get a Uri
. Then you should use HttpWebRequest
and HttpWebResponse
so that you can check the strongly typed status code response. Finally, you should be a bit more informative about what was broken.
Here's links to some of the additional .NET stuff I introduced:
Sample:
try
{
if (!string.IsNullOrEmpty(url))
{
UriBuilder uriBuilder = new UriBuilder(url);
HttpWebRequest request = HttpWebRequest.Create(uriBuilder.Uri);
HttpWebResponse response = request.GetResponse();
if (response.StatusCode == HttpStatusCode.NotFound)
{
_txbl.Text = "Broken - 404 Not Found";
}
if (response.StatusCode == HttpStatusCode.OK)
{
_txbl.Text = "URL appears to be good.";
}
else //There are a lot of other status codes you could check for...
{
_txbl.Text = string.Format("URL might be ok. Status: {0}.",
response.StatusCode.ToString());
}
}
}
catch (Exception ex)
{
_txbl.Text = string.Format("Broken- Other error: {0}", ex.Message);
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With