I'm using Delphi and trying to retrieve the source of web pages.
My problem is that I get different source code when I use Indy (idHttp) or Clever Components (clHttp) than when I view the page source in IE and/or Google Chrome.
Is there any way to retrieve a web page's source with Delphi exactly as the browsers show it?
It's probably because the component sends a User-Agent string that differs from the ones IE or Chrome send. In other words, the server is sending back different source than it does for IE or Chrome.
For example, in TIdHTTP, set:
Request.Accept=*/*
Request.CacheControl=no-cache
Request.Connection=Keep-Alive
Request.ContentType=application/x-www-form-urlencoded
Request.AcceptEncoding=gzip, deflate
Request.UserAgent=Mozilla/4.0 (compatible; MSIE 6.0; Win32)
Request.Host=(web site name)
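As a rough sketch of how those settings might look in code, the console program below sets the same request properties on a TIdHTTP instance and prints the returned source. The URL is a placeholder and the User-Agent is the one from the list above; in practice you would copy the string your own browser sends. Two deviations from the list are assumptions on my part: Request.Host is left alone because Indy fills it in from the URL, and instead of setting Request.AcceptEncoding by hand a TIdCompressorZLib is assigned, since without a compressor Indy would hand back the gzip data undecoded. ContentType is omitted because a GET has no body.

```delphi
program FetchLikeBrowser;

{$APPTYPE CONSOLE}

uses
  SysUtils, IdHTTP, IdCompressorZLib;

const
  // Placeholder URL for illustration only; replace with the page you need.
  TargetUrl = 'http://example.com/';
  // User-Agent taken from the list above; substitute your browser's own string.
  BrowserUserAgent = 'Mozilla/4.0 (compatible; MSIE 6.0; Win32)';

var
  Http: TIdHTTP;
  Html: string;
begin
  Http := TIdHTTP.Create(nil);
  try
    // With a compressor assigned, Indy advertises gzip/deflate itself and
    // decompresses the response. The compressor is owned by Http.
    Http.Compressor := TIdCompressorZLib.Create(Http);
    // Browsers follow redirects silently, so do the same here.
    Http.HandleRedirects := True;

    Http.Request.UserAgent := BrowserUserAgent;
    Http.Request.Accept := '*/*';
    Http.Request.CacheControl := 'no-cache';
    Http.Request.Connection := 'Keep-Alive';

    try
      Html := Http.Get(TargetUrl);
      Writeln(Html);
    except
      on E: Exception do
        Writeln('Request failed: ', E.Message);
    end;
  finally
    Http.Free;
  end;
end.
```

For an https:// URL you would additionally need to assign an SSL IOHandler (for example TIdSSLIOHandlerSocketOpenSSL from the IdSSLOpenSSL unit) and ship the OpenSSL libraries; that part is left out to keep the sketch short.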
Basically, requesting the page from the component does the same thing as the browser's first request. The browser, however, may make several more requests and run other activities, e.g. JavaScript, that change the DOM after the page loads.
Try switching off JavaScript in the browser and compare again. If you're familiar with Chrome Developer Tools, check the raw response to the first HTTP GET and compare that. If it is still different, modify your request until it is identical to the Chrome request.
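To make that comparison easier from the Delphi side, one option is to dump what Indy sent and received and save the body to a file that can be diffed against the response shown in Chrome's Network tab. The sketch below assumes Indy 10; the URL and output file name are placeholders, and reading Request.RawHeaders after the call is assumed to reflect the headers Indy generated for that request.

```delphi
program DumpHttpExchange;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, IdHTTP;

const
  // Placeholder values for illustration only.
  TargetUrl = 'http://example.com/';
  OutputFile = 'indy_response.html';

var
  Http: TIdHTTP;
  Body: TFileStream;
begin
  Http := TIdHTTP.Create(nil);
  try
    Http.HandleRedirects := True;
    // Set the request properties here so they mirror the Chrome request.

    Body := TFileStream.Create(OutputFile, fmCreate);
    try
      try
        // Write the body straight to disk, without any string conversion,
        // so it can be diffed against the browser's copy.
        Http.Get(TargetUrl, Body);
      except
        on E: Exception do
          Writeln('Request failed: ', E.Message);
      end;
    finally
      Body.Free;
    end;

    // Headers Indy generated for the request (assumed to be populated
    // once the request has been sent).
    Writeln('--- Request headers ---');
    Writeln(Http.Request.RawHeaders.Text);

    // Status line and headers the server sent back.
    Writeln('--- Response ---');
    Writeln(Http.Response.ResponseText);
    Writeln(Http.Response.RawHeaders.Text);
  finally
    Http.Free;
  end;
end.
```

Comparing the printed request headers line by line with the ones Chrome lists for the first GET should show which header to adjust next.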