Why does Indy Project HttpClient Get() give code 500 on some URLs which work fine in web browsers?

Tags: delphi, indy

I have several URLs that work fine in every browser, but when I try to fetch the page content with Get() of the Indy HTTP client (TIdHTTP), the request fails with code 500, Internal Server Error. This is with the latest Indy SVN build (4981).

Here is my example code. All that is needed for this is Delphi with Indy components and a form with a button and a memo.

procedure TForm1.Button1Click(Sender: TObject);
var
  HTTPCLIENT1: TIdHTTP;
begin
  // Create the client before the try block so that Free in the finally
  // section never runs on an unassigned variable.
  HTTPCLIENT1 := TIdHTTP.Create(nil);
  try
    try
      Memo1.Clear;
      HTTPCLIENT1.HandleRedirects := True;
      HTTPCLIENT1.Request.UserAgent := 'Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.7.3) Gecko/20040924 Epiphany/1.4.4 (Ubuntu)';
      Memo1.Text := HTTPCLIENT1.Get('http://www.laredoute.fr/vente-machine-a-coudre-bernette-20-kit-couture--garantie-2-ans.aspx?productid=401225048&documentid=999999&categoryid=22918417&customertarget=0&offertype=0&prodcolor=1#pos=33_n_n_n_n_n_n&numberpage=2');
      // Show the status line of the last response in the form caption.
      Caption := HTTPCLIENT1.ResponseText;
    except
      on E: Exception do
        Memo1.Lines.Add('Exception: ' + E.Message);
    end;
  finally
    HTTPCLIENT1.Free;
  end;
end;

It's not a connection problem on my side: 99% of URLs return 200 or 404, and only a few return 500, yet every browser opens those same URLs within a second.

Casady asked Apr 10 '13 20:04


1 Answer

That kind of failure usually suggests that the GET request is malformed in some way, causing the server code to fail on its end. But without comparing what the web browser's requests actually look like against TIdHTTP's requests, there is no way to know for sure what the server is objecting to.

Update: when a web browser requests the URL, the server sends back a 200 response immediately. When TIdHTTP requests the same URL, the server instead sends a 301 redirect to a new URL. Requesting that new URL yields a 302 redirect to an error page, and requesting the error page yields the 500 response.

The two differences between a web browser request and the initial TIdHTTP request that would have an effect on a web server are:

  1. The URL you are requesting with TIdHTTP includes an anchor at the end (everything from the # character onward: #pos=33_n_n_n_n_n_n&numberpage=2), which web browsers strip out before sending the request. The anchor is not part of what gets sent to the server. It is meant for web browsers to use when locating spots within the data that is retrieved from a URL.

  2. The user agent. Some web servers are sensitive to the user agent and can send different responses to different types of user agents.

When I remove the anchor from the URL, TIdHTTP.Get() no longer fails:

Memo1.Text := Get('http://www.laredoute.fr/vente-machine-a-coudre-bernette-20-kit-couture--garantie-2-ans.aspx?productid=401225048&documentid=999999&categoryid=22918417&customertarget=0&offertype=0&prodcolor=1');
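The manual edit above could also be done in code, so that URLs copied from a browser's address bar are cleaned before they are requested. The following is only a minimal sketch: StripFragment is a hypothetical helper written for this illustration (it is not part of Indy), and the example.com URL is a placeholder. It assumes the first # in the URL starts the anchor, which holds for well-formed URLs because a literal # elsewhere has to be percent-encoded as %23.

```pascal
program StripFragmentDemo;

{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

uses
  SysUtils;

// Hypothetical helper (not an Indy routine): returns AURL without its
// anchor, i.e. without the first '#' character and everything after it.
function StripFragment(const AURL: string): string;
var
  HashPos: Integer;
begin
  HashPos := Pos('#', AURL);
  if HashPos > 0 then
    Result := Copy(AURL, 1, HashPos - 1)  // keep only the text before '#'
  else
    Result := AURL;                       // no anchor present, return as-is
end;

begin
  // Placeholder URL, used only to demonstrate the helper.
  WriteLn(StripFragment('http://example.com/page.aspx?id=1#pos=33&numberpage=2'));
end.
```

In the code from the question, the call would then look like Memo1.Text := HTTPCLIENT1.Get(StripFragment(SomeURL)), where SomeURL stands for whatever string holds the address.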
Remy Lebeau answered Oct 13 '22 14:10