I'm looking for a C/C++ library that works on both Windows and Linux and lets me asynchronously query multiple web servers (thousands per minute) for page headers and download web pages, in much the same way the WinHTTP library does on Windows.
So far I've come across libcurl, which seems to do what I want, but the asynchronous aspect looks suspect.
How easy do you think it would be to bypass the idea of using a library and write something simple from scratch based on sockets that could achieve this?
Any comments, advice or suggestions would be very welcome.
Addendum: Does anybody have comments about doing this with libcurl? I said the asynchronous aspect looks suspect, but does anyone have any experience with it?
Try libevent's HTTP routines. You create an HTTP connection and provide a callback, which is invoked when a response arrives (or when a timeout event fires).
Updated: I built a distributed HTTP connection-throttling proxy and used both the client and server portions within the same daemon, all on a single thread. It worked great.
If you're writing an HTTP client, libevent should be a good fit. The only limitation I ran into on the server side was a lack of configuration options. The API is a bit sparse if you want to start adding more advanced features, which I expected, since it was never intended to replace general-purpose web servers like Apache or Nginx. For example, I patched it to add a custom subroutine that limits the overall size of an inbound HTTP request (e.g. close the connection after 10MB has been read). The code is very well written and the patch was easy to implement.
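To give a rough idea of what such a size cap involves: it comes down to a running byte counter that is checked after every read. The following is only a minimal sketch of that idea, independent of libevent and not the actual patch. The names `body_limit`, `body_limit_init` and `body_limit_consume` are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of libevent: tracks how many body bytes
 * have been read on one connection against a fixed cap. */
struct body_limit {
    size_t max_bytes;   /* cap, e.g. 10 * 1024 * 1024 for 10MB */
    size_t seen_bytes;  /* running total, never larger than max_bytes */
    bool   exceeded;    /* latched once the cap has been crossed */
};

void body_limit_init(struct body_limit *bl, size_t max_bytes)
{
    bl->max_bytes = max_bytes;
    bl->seen_bytes = 0;
    bl->exceeded = false;
}

/* Call after each successful read of n bytes. Returns true while the
 * request is still within the cap, and false once the caller should
 * close the connection. The comparison is written as a subtraction so
 * that seen_bytes + n cannot overflow. */
bool body_limit_consume(struct body_limit *bl, size_t n)
{
    if (bl->exceeded || n > bl->max_bytes - bl->seen_bytes) {
        bl->exceeded = true;
        return false;
    }
    bl->seen_bytes += n;
    return true;
}
```

In a read callback you would call `body_limit_consume()` with the number of bytes just received and tear the connection down as soon as it returns false. Where exactly that hook belongs inside libevent depends on the version you are patching.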
I was using the 1.3.x branch; the 2.x branch has some serious performance improvements over the older releases.
Code example: Found a few minutes and wrote a quick example. This should get you acquainted with the libevent programming style:
    #include <stdio.h>
    #include <event.h>
    #include <evhttp.h>

    void
    _reqhandler(struct evhttp_request *req, void *state)
    {
        printf("in _reqhandler. state == %s\n", (char *) state);
        if (req == NULL) {
            printf("timed out!\n");
        } else if (req->response_code == 0) {
            printf("connection refused!\n");
        } else if (req->response_code != 200) {
            printf("error: %u %s\n", req->response_code, req->response_code_line);
        } else {
            printf("success: %u %s\n", req->response_code, req->response_code_line);
        }
        event_loopexit(NULL);
    }

    int
    main(int argc, char *argv[])
    {
        const char *state = "misc. state you can pass as argument to your handler";
        const char *addr = "127.0.0.1";
        unsigned int port = 80;
        struct evhttp_connection *conn;
        struct evhttp_request *req;

        printf("initializing libevent subsystem..\n");
        event_init();

        conn = evhttp_connection_new(addr, port);
        evhttp_connection_set_timeout(conn, 5);

        req = evhttp_request_new(_reqhandler, (void *)state);
        evhttp_add_header(req->output_headers, "Host", addr);
        evhttp_add_header(req->output_headers, "Content-Length", "0");
        evhttp_make_request(conn, req, EVHTTP_REQ_GET, "/");

        printf("starting event loop..\n");
        event_dispatch();

        return 0;
    }
Compile and run:
    % gcc -o foo foo.c -levent
    % ./foo
    initializing libevent subsystem..
    starting event loop..
    in _reqhandler. state == misc. state you can pass as argument to your handler
    success: 200 OK