I'm revisiting som old code of mine and have stumbled upon a method for getting the title of a website based on its url. It's not really what you would call a stable method as it often fails to produce a result and sometimes even produces incorrect results. Also, sometimes it fails to show some of the characters from the title as they are of an alternative encoding.
Does anyone have suggestions for improvements over this old version?
public static string SuggestTitle(string url, int timeout)
{
WebResponse response = null;
string line = string.Empty;
try
{
WebRequest request = WebRequest.Create(url);
request.Timeout = timeout;
response = request.GetResponse();
Stream streamReceive = response.GetResponseStream();
Encoding encoding = System.Text.Encoding.GetEncoding("utf-8");
StreamReader streamRead = new System.IO.StreamReader(streamReceive, encoding);
while(streamRead.EndOfStream != true)
{
line = streamRead.ReadLine();
if (line.Contains("<title>"))
{
line = line.Split(new char[] { '<', '>' })[2];
break;
}
}
}
catch (Exception) { }
finally
{
if (response != null)
{
response.Close();
}
}
return line;
}
One final note - I would like the code to run faster as well, as it is blocking until the page as been fetched, so if I can get only the site header and not the entire page, it would be great.
To find information such as title, author, or date on a webpage sometimes you need to do some digging around the website. Most of the information will be found in the header or the footer of the website. The header of a website will include the name of the website, and sub organization links or titles.
A website title, or title tag, is an HTML element that specifies the content of a webpage. A website title is helpful for both users and search engines. An internet user needs a website title so they can see an accurate and concise description of a page's content before clicking on a link in the SERPs.
A simpler way to get the content:
WebClient x = new WebClient();
string source = x.DownloadString("http://www.singingeels.com/");
A simpler, more reliable way to get the title:
string title = Regex.Match(source, @"\<title\b[^>]*\>\s*(?<Title>[\s\S]*?)\</title\>",
RegexOptions.IgnoreCase).Groups["Title"].Value;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With