
How to get a website's title from C#

Tags: c#, webrequest

I'm revisiting some old code of mine and have stumbled upon a method for getting the title of a website based on its URL. It's not what you would call a stable method: it often fails to produce a result and sometimes produces incorrect results. It also sometimes drops characters from the title when the page uses an encoding other than UTF-8.

Does anyone have suggestions for improvements over this old version?

public static string SuggestTitle(string url, int timeout)
{
    WebResponse response = null;
    string line = string.Empty;

    try
    {
        WebRequest request = WebRequest.Create(url);
        request.Timeout = timeout;

        response = request.GetResponse();
        Stream streamReceive = response.GetResponseStream();
        Encoding encoding = System.Text.Encoding.GetEncoding("utf-8");
        StreamReader streamRead = new System.IO.StreamReader(streamReceive, encoding);

        while(streamRead.EndOfStream != true)
        {
            line = streamRead.ReadLine();
            if (line.Contains("<title>"))
            {
                line = line.Split(new char[] { '<', '>' })[2];
                break;
            }
        }
    }
    catch (Exception) { }
    finally
    {
        if (response != null)
        {
            response.Close();
        }
    }

    return line;
}

One final note: I would like the code to run faster as well, since it blocks until the page has been fetched. If I could fetch only the head of the page rather than the entire page, that would be great.

Morten Christiansen asked Nov 30 '08 20:11


1 Answer

A simpler way to get the content:

// WebClient is in System.Net; Regex (used below) is in System.Text.RegularExpressions.
WebClient client = new WebClient();
string source = client.DownloadString("http://www.singingeels.com/");
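
Since the question also asks to avoid fetching the entire page, one possible variation is to read only the first few kilobytes of the response and decode them with the charset the server reports. The following is a rough sketch rather than a definitive implementation: the names PagePrefix, ReadPrefix, ResolveEncoding and DownloadStart are hypothetical, and it assumes an http/https URL and that the title appears within the first maxBytes bytes of the page.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class PagePrefix
{
    // Reads at most maxBytes from the stream, then stops without draining the rest.
    public static byte[] ReadPrefix(Stream stream, int maxBytes)
    {
        byte[] buffer = new byte[maxBytes];
        int total = 0;
        while (total < maxBytes)
        {
            int read = stream.Read(buffer, total, maxBytes - total);
            if (read == 0) break; // end of stream reached early
            total += read;
        }
        byte[] result = new byte[total];
        Array.Copy(buffer, result, total);
        return result;
    }

    // Maps a charset name to an Encoding; falls back to UTF-8 when the
    // name is empty or not recognized (an assumption, not a standard rule).
    public static Encoding ResolveEncoding(string charset)
    {
        if (string.IsNullOrEmpty(charset)) return Encoding.UTF8;
        try { return Encoding.GetEncoding(charset.Trim('"', ' ')); }
        catch (ArgumentException) { return Encoding.UTF8; }
    }

    // Downloads only the beginning of the page (assumes an http/https URL).
    public static string DownloadStart(string url, int timeoutMs, int maxBytes)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Timeout = timeoutMs;
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream stream = response.GetResponseStream())
        {
            byte[] bytes = ReadPrefix(stream, maxBytes);
            return ResolveEncoding(response.CharacterSet).GetString(bytes);
        }
    }
}
```

A call such as PagePrefix.DownloadStart(url, 5000, 16 * 1024) would then return at most the first 16 KB as text. Cutting at a fixed byte count can split a multi-byte character at the very end of the prefix, which should be harmless as long as the title comes earlier. Pages that declare their charset only in a meta tag are not handled by this sketch.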

A simpler, more reliable way to get the title:

string title = Regex.Match(source, @"\<title\b[^>]*\>\s*(?<Title>[\s\S]*?)\</title\>",
    RegexOptions.IgnoreCase).Groups["Title"].Value;
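
The regex approach above can be wrapped in a small helper that also decodes HTML entities and collapses whitespace, so titles spanning several lines or containing sequences like &amp;amp; come out clean. This is a minimal sketch under assumptions: TitleParser and ExtractTitle are hypothetical names, and returning an empty string when no title is found is a choice made here, not something the snippet above prescribes.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class TitleParser
{
    // Same idea as the pattern above: tolerate attributes on <title>,
    // ignore case, and capture lazily up to the closing tag.
    private static readonly Regex TitleRegex = new Regex(
        @"<title\b[^>]*>\s*(?<Title>[\s\S]*?)</title>",
        RegexOptions.IgnoreCase);

    // Returns the page title, or an empty string if none is found.
    public static string ExtractTitle(string html)
    {
        if (string.IsNullOrEmpty(html)) return string.Empty;

        Match match = TitleRegex.Match(html);
        if (!match.Success) return string.Empty;

        // Turn entities such as &amp; into their characters.
        string decoded = WebUtility.HtmlDecode(match.Groups["Title"].Value);

        // Collapse line breaks and runs of spaces into single spaces.
        return Regex.Replace(decoded, @"\s+", " ").Trim();
    }
}
```

Usage would be along the lines of string title = TitleParser.ExtractTitle(source);. A regex is not a full HTML parser, so unusual markup (for example a title element inside a comment or script) may still give wrong results.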
Timothy Khouri answered Sep 21 '22 12:09