I am using HTML Agility Pack to parse HTML, following the question "What is the best way to parse html in C#?", and I am getting great results :) The problem comes when I visit web pages whose results depend on my location. For example, since I am in Spain, I get the results for the Spain region, and I would like to get them as if I were in England. How can this be done? Is it something I have to change in the user agent? (I use "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:x.x.x) Gecko/20041107 Firefox/x.x" as the user agent.)
You could use the WebClient class, which lets you set HTTP request headers, call its DownloadString method to download the contents of the web page, and then feed the result to HTML Agility Pack.
The User-Agent header is not what controls the language; the Accept-Language header is. So for example:
using System.Net;
using HtmlAgilityPack;

using (var client = new WebClient())
{
    // ask for British English content (use "es-ES" for Spanish from Spain)
    client.Headers[HttpRequestHeader.AcceptLanguage] = "en-GB";
    client.Headers[HttpRequestHeader.UserAgent] = "some user agent if you wish";
    string html = client.DownloadString("http://example.com");

    // feed the HTML to HTML Agility Pack
    var doc = new HtmlDocument();
    doc.LoadHtml(html);

    // now do the parsing
}
But if the site uses IP-based recognition to decide which language or regional content to send you, there's not much you can do from the client side to change that.
Location-based searches and pages are generally driven by your IP address, or by the location you give the site when you register. You may want to look into an anonymous proxy located in the country you would like to appear to be in.
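As a rough sketch of the proxy idea, WebClient can route its requests through a proxy via its Proxy property. The helper name GeoClientFactory and the proxy address used below are hypothetical placeholders, not a real service; you would substitute a proxy you actually have access to in the target country.

```csharp
using System;
using System.Net;

public static class GeoClientFactory
{
    // Hypothetical helper: builds a WebClient that sends its requests through
    // the given proxy and advertises the given language preference.
    public static WebClient Create(string proxyAddress, string acceptLanguage)
    {
        var client = new WebClient();

        // Route all requests through the proxy (assumed to sit in the target country).
        client.Proxy = new WebProxy(proxyAddress);

        // Also state the preferred language, in case the site honours it.
        client.Headers[HttpRequestHeader.AcceptLanguage] = acceptLanguage;

        return client;
    }
}
```

You could then call GeoClientFactory.Create("http://uk-proxy.example.com:8080", "en-GB"), download the page with DownloadString as before, and pass the HTML to HTML Agility Pack. Whether the site then serves English results depends on how it detects location, so treat this as something to try rather than a guaranteed fix.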