
How does one parse HTTP headers with libcurl?

I've been looking around and am quite surprised that libcurl (which seems to be the canonical C library for HTTP these days) appears to offer no generic way to parse headers.

The closest thing I've found was a mailing list post where someone suggested someone else search through the mailing list archives.

The only facility libcurl provides via setopt is CURLOPT_HEADERFUNCTION, which feeds the response headers to a callback a single line at a time.

This seems entirely too primitive considering headers can span multiple lines. Ideally this should be done once, correctly (preferably by the library itself), and not left for application developers to continually reinvent.

Edit:

As an example of the naïve approach not working, see the following gist, which contains a libcurl code example and a properly formed HTTP response that it can't parse: https://gist.github.com/762954

Dustin asked Jan 02 '11 21:01




2 Answers

Been over a year, so I think I'll close this as "manually." Or:

If you're having cURL problems, I feel bad for you son,

You've got multi-line headers and must parse each one.

Dustin answered Sep 30 '22 13:09


libcurl reads the entire header and sends it as a single complete line to the callback.

"Continued" (folded) HTTP header lines are not allowed in the HTTP/1.1 RFC 7230 family, and they were virtually extinct even before that.
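
For anyone who still has to cope with the occasional folded header, here is a minimal sketch of a callback suitable for CURLOPT_HEADERFUNCTION. It assumes that a continuation line reaches the callback as its own call, starting with a space or tab, and it relies on the documented fact that the buffer is not null-terminated. The names `header_list`, `copy_trimmed` and `header_cb`, and the fixed size limits, are made up for this illustration and are not part of libcurl.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_HEADERS 32   /* arbitrary limits chosen for this sketch */
#define MAX_NAME    64
#define MAX_VALUE   256

struct header { char name[MAX_NAME]; char value[MAX_VALUE]; };
struct header_list { struct header items[MAX_HEADERS]; size_t count; };

/* Copy s[0..len) into dst without surrounding whitespace (including CRLF),
   truncating to cap-1 bytes and always null-terminating. */
static void copy_trimmed(char *dst, size_t cap, const char *s, size_t len)
{
    while (len > 0 && isspace((unsigned char)s[0])) { s++; len--; }
    while (len > 0 && isspace((unsigned char)s[len - 1])) len--;
    if (len >= cap) len = cap - 1;
    memcpy(dst, s, len);
    dst[len] = '\0';
}

/* Same signature libcurl expects for a header callback. userdata must
   point to a zero-initialised struct header_list (hypothetical type). */
size_t header_cb(char *buffer, size_t size, size_t nitems, void *userdata)
{
    struct header_list *list = userdata;
    size_t len = size * nitems;
    assert(list != NULL);

    /* Folded continuation: append to the previous header's value. */
    if (len > 0 && (buffer[0] == ' ' || buffer[0] == '\t') && list->count > 0) {
        struct header *prev = &list->items[list->count - 1];
        char extra[MAX_VALUE];
        size_t used = strlen(prev->value);
        copy_trimmed(extra, sizeof extra, buffer, len);
        if (extra[0] != '\0')
            snprintf(prev->value + used, MAX_VALUE - used, " %s", extra);
        return len;
    }

    /* Lines without a colon (status line, final blank line) are skipped. */
    char *colon = memchr(buffer, ':', len);
    if (colon != NULL && list->count < MAX_HEADERS) {
        struct header *h = &list->items[list->count++];
        size_t name_len = (size_t)(colon - buffer);
        copy_trimmed(h->name, sizeof h->name, buffer, name_len);
        copy_trimmed(h->value, sizeof h->value, colon + 1, len - name_len - 1);
    }

    /* Returning anything other than len makes libcurl abort the transfer. */
    return len;
}
```

It would be hooked up with `curl_easy_setopt(curl, CURLOPT_HEADERFUNCTION, header_cb)` and `curl_easy_setopt(curl, CURLOPT_HEADERDATA, &list)`. Values longer than the fixed buffers are silently truncated, which real code would probably want to handle differently.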

Daniel Stenberg answered Sep 30 '22 12:09