
C: how to read a webpage

Tags: c, curl, sockets

I'm trying to open a connection from my machine to a webpage (e.g. www.google.com) on port 80.

How can I do this programmatically in C? I want to get all the HTTP headers and not just the content ;(

I hope someone can help.

Many thanks in advance,

Eamorr asked Oct 13 '25 07:10



2 Answers

Here is some example code on how to do this with libcurl:

http://curl.haxx.se/libcurl/c/getinmemory.html

There is another example there that shows how to get some header data:

http://curl.haxx.se/libcurl/c/getinfo.html

These examples and many others are available as part of the libcurl distribution. They should be more than enough to get you started.

skorks answered Oct 14 '25 21:10



Summarized process:

  • DNS resolution for the hostname (using getaddrinfo())
  • Open a stream socket (TCP) to the resolved IP address and port
  • Send GET request (see protocol in: http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol)

    GET /index.html HTTP/1.1
    Host: www.example.com

  • Read headers - Terminated by \r\n\r\n

  • Read body
  • Close socket
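
The steps above can be sketched in C roughly as follows. This is a minimal illustration, not a robust HTTP client: it assumes a POSIX system, plain HTTP (no TLS), and a response small enough to fit in the caller's buffer. The function names build_get_request, find_body_offset and http_get are made up for this example, and the "Connection: close" header is an addition that asks the server to close the socket after the response, so that reading until end-of-stream is enough to get the whole body.

```c
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Build a minimal HTTP/1.1 GET request into buf.
 * Returns the request length, or -1 if buf is too small.
 * "Connection: close" is an assumption of this sketch (see above). */
int build_get_request(char *buf, size_t cap, const char *host, const char *path)
{
    int n = snprintf(buf, cap,
                     "GET %s HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "Connection: close\r\n"
                     "\r\n",
                     path, host);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Headers end at the first "\r\n\r\n". Returns the offset of the first
 * body byte, or -1 if the terminator has not been received yet. */
long find_body_offset(const char *data, size_t len)
{
    for (size_t i = 0; i + 4 <= len; i++)
        if (memcmp(data + i, "\r\n\r\n", 4) == 0)
            return (long)(i + 4);
    return -1;
}

/* Fetch http://host:port/path and store the raw response (status line,
 * headers and body) in out, NUL-terminated and truncated to cap - 1 bytes.
 * Returns the number of bytes stored, or -1 on error. */
long http_get(const char *host, const char *port, const char *path,
              char *out, size_t cap)
{
    struct addrinfo hints, *res, *ai;
    char req[1024];
    int fd = -1;
    int reqlen = build_get_request(req, sizeof req, host, path);
    if (reqlen < 0 || cap == 0)
        return -1;

    /* Step 1: DNS resolution. */
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;  /* TCP */
    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;

    /* Step 2: open a stream socket to the first address that accepts. */
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    if (fd < 0)
        return -1;

    /* Step 3: send the request; send() may write only part of it. */
    for (int sent = 0; sent < reqlen; ) {
        ssize_t n = send(fd, req + sent, (size_t)(reqlen - sent), 0);
        if (n <= 0) { close(fd); return -1; }
        sent += (int)n;
    }

    /* Steps 4 and 5: read headers and body until the server closes
     * the connection or the buffer is full. */
    size_t total = 0;
    while (total + 1 < cap) {
        ssize_t n = recv(fd, out + total, cap - 1 - total, 0);
        if (n < 0) { close(fd); return -1; }
        if (n == 0)
            break;
        total += (size_t)n;
    }
    out[total] = '\0';

    /* Step 6: close the socket. */
    close(fd);
    return (long)total;
}
```

A caller could run http_get("www.example.com", "80", "/index.html", buf, sizeof buf) and then pass the result to find_body_offset(): everything before the returned offset is the status line and headers, everything after it is the body. A real client would also need to handle redirects, chunked transfer encoding, and HTTPS, which is where libcurl saves a lot of work.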
eyalm answered Oct 14 '25 21:10
