
Proxy works locally but fails when uploaded to webhost

I have spent a good deal of time configuring my proxy. At the moment I use a service called proxybonanza, which supplies me with a proxy that I use to fetch web pages.

I'm using HtmlAgilityPack.

If I run my code without a proxy, there's no problem, either locally or when uploaded to the webhost server.

If I use the proxy, it takes somewhat longer but it still works locally.

If I publish my solution to my webhost, I get a SocketException (0x274c):

 "A connection attempt failed because the connected party did not properly respond
 after a period of time, or established connection failed because connected host has
 failed to respond 38.69.197.71:45623"

I have been debugging this for a long time.

My app.config has two entries that are relevant for this:

<httpWebRequest useUnsafeHeaderParsing="true" />
<httpRuntime executionTimeout="180" />

That helped me through a couple of problems.

Now this is my C# code.

 HtmlWeb htmlweb = new HtmlWeb();
 htmlweb.PreRequest = new HtmlAgilityPack.HtmlWeb.PreRequestHandler(OnPreRequest);
 HtmlDocument htmldoc = htmlweb.Load(@"http://www.websitetofetch.com",
                                     "IP", port, "username", "password");

 // This is the PreRequest config
 static bool OnPreRequest(HttpWebRequest request)
 {
     request.KeepAlive = false;
     request.Timeout = 100000;            // milliseconds
     request.ReadWriteTimeout = 1000000;  // milliseconds
     request.ProtocolVersion = HttpVersion.Version10;
     return true; // ok, go on
 }

What am I doing wrong? I have enabled tracing in app.config, but I don't get a log on my webhost.

Logging configuration from app.config:

 <system.diagnostics>
   <sources>
     <source name="System.ServiceModel.MessageLogging" switchValue="Warning, ActivityTracing">
       <listeners>
         <add name="ServiceModelTraceListener"/>
       </listeners>
     </source>
     <source name="System.ServiceModel" switchValue="Verbose,ActivityTracing">
       <listeners>
         <add name="ServiceModelTraceListener"/>
       </listeners>
     </source>
     <source name="System.Runtime.Serialization" switchValue="Verbose,ActivityTracing">
       <listeners>
         <add name="ServiceModelTraceListener"/>
       </listeners>
     </source>
   </sources>
   <sharedListeners>
     <add initializeData="App_tracelog.svclog"
          type="System.Diagnostics.XmlWriterTraceListener, System, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089"
          name="ServiceModelTraceListener" traceOutputOptions="Timestamp"/>
   </sharedListeners>
 </system.diagnostics>

Can anyone spot the problem? I have toggled these settings on and off like a thousand times:

  request.KeepAlive = false;
  System.Net.ServicePointManager.Expect100Continue = false;

Carl

8bitcat asked Oct 03 '12 22:10


1 Answer

Try downloading the page as a string first, then passing it to HtmlAgilityPack. This separates errors that happen during the download from those that happen during HTML parsing. If the problem lies with proxybonanza (see the end of this post), you will be able to tell it apart from an HtmlAgilityPack configuration issue.

Download page using WebClient:

// Download page
System.Net.WebClient client = new System.Net.WebClient();
client.Proxy = new System.Net.WebProxy("{proxy address and port}");
string html = client.DownloadString("http://example.com");

// Process result
HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();
htmlDoc.LoadHtml(html);

If you want more control over the request, use System.Net.HttpWebRequest:

// Requires: using System; using System.IO; using System.Net;

// Create request
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://example.com/");

// Apply settings (including proxy)
request.Proxy = new WebProxy("{proxy address and port}");
request.KeepAlive = false;
request.Timeout = 100000;            // milliseconds to wait for the response
request.ReadWriteTimeout = 1000000;  // milliseconds allowed for stream reads/writes
request.ProtocolVersion = HttpVersion.Version10;

// Declared outside the try block so it is still in scope afterwards
string html = null;

// Get response
try
{
    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
    using (Stream stream = response.GetResponseStream())
    using (StreamReader reader = new StreamReader(stream))
    {
        html = reader.ReadToEnd();
    }
}
catch (WebException)
{
    // Handle web exceptions
}
catch (Exception)
{
    // Handle other exceptions
}

// Process result (only if the download succeeded)
if (html != null)
{
    HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();
    htmlDoc.LoadHtml(html);
}
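
Note that the snippets above give WebProxy only an address, whereas the question also passes a username and password to HtmlWeb.Load. If the proxy requires authentication, a minimal sketch along these lines may help. ProxyFactory and Create are hypothetical names introduced here for illustration; they are not part of HtmlAgilityPack or the original code.

```csharp
using System;
using System.Net;

// Hypothetical helper (not from the original post): builds a WebProxy
// that carries the credentials issued by the proxy provider.
public static class ProxyFactory
{
    public static WebProxy Create(string host, int port, string userName, string password)
    {
        // WebProxy(string, int) builds the proxy address http://host:port/
        var proxy = new WebProxy(host, port);

        // Credentials sent to the proxy when it asks for authentication
        proxy.Credentials = new NetworkCredential(userName, password);
        return proxy;
    }
}
```

The result could then be assigned to request.Proxy or client.Proxy in place of the new WebProxy("{proxy address and port}") line.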

Also, check that your proxy provider (proxybonanza) allows access to your proxies from your production environment. Many providers restrict proxy access to specific IP addresses. They may have allowed the external IP of the network where you run locally but NOT the external IP address of your production environment, which would be consistent with a connection that works locally and times out on the webhost.
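
One way to check this is to test, from the production server itself, whether a plain TCP connection to the proxy can be opened at all, before any HTTP or HtmlAgilityPack code runs. The following is only a sketch: ProxyProbe and CanConnect are hypothetical names, and the timeout is an arbitrary choice.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical diagnostic helper (not from the original post).
public static class ProxyProbe
{
    // Returns true if a TCP connection to host:port is established
    // within timeoutMs milliseconds, false on timeout or socket error.
    public static bool CanConnect(string host, int port, int timeoutMs)
    {
        using (var client = new TcpClient())
        {
            try
            {
                // Start the connection attempt without blocking
                IAsyncResult pending = client.BeginConnect(host, port, null, null);

                // Wait up to timeoutMs for the attempt to finish
                if (!pending.AsyncWaitHandle.WaitOne(timeoutMs))
                {
                    return false; // no answer in time
                }

                // Throws SocketException if the connection was refused
                client.EndConnect(pending);
                return true;
            }
            catch (SocketException)
            {
                return false;
            }
        }
    }
}
```

Calling ProxyProbe.CanConnect("38.69.197.71", 45623, 5000) (the address from the error message in the question) from a page on the webhost and again from your local machine should show whether the two environments differ. A false result only on the webhost would suggest an IP restriction at the provider or outbound port blocking by the host, rather than a problem in the request settings.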

Steve Konves answered Oct 12 '22 23:10