I'm working with an HTTP request tool (similar to cURL) and having an issue with the server response. Either that, or my understanding of the HTTP/1.1 RFC's rules for chunked data is wrong.
From what I can tell, chunked data should be in this format:
4\r\n
Wiki\r\n
5\r\n
pedia\r\n
e\r\n
in\r\n\r\nchunks.\r\n
0\r\n
\r\n
What I'm actually seeing is the following:
4\r\n
Wiki\r\n
5\r\n
pedia\r\n
e\r\n
in\r\n\r\nchunks.\r\n
0
In other words, the few servers I've tested with send no more data after the 0: no CRLF, much less CRLFCRLF.
How are we supposed to know it's the end of the chunked data without the proper format of the chunked tags? My reader times out waiting for the CRLFs after the 0, and that's not a sufficient solution.
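To illustrate, here is a minimal sketch (Ruby, assuming a hypothetical socket that is already connected and has had the response headers consumed) of the kind of strict parse loop I mean; it blocks on the last read, waiting for the CRLF after the 0, which is exactly where the timeout happens:
  # Minimal sketch of a strict chunked-body reader.
  # `socket` is a hypothetical, already-connected IO with the headers consumed.
  def read_chunked_body(socket)
    body = ""
    loop do
      size_line = socket.readline("\r\n")   # e.g. "4\r\n"
      size = size_line.to_i(16)             # chunk size is hexadecimal
      break if size.zero?                   # "0" marks the last chunk
      body << socket.read(size)             # chunk payload
      socket.read(2)                        # CRLF after the payload
    end
    socket.read(2)  # final CRLF after the "0" chunk; blocks (and times out) if the server never sends it
    body
  end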
Yes, it violates the standard. But we want to be compatible with as many HTTP servers and clients as possible, so we have to understand how the standard gets violated.
Chunked encoding is often used for content streaming over HTTP/1.1. The standard requires the content to end with an additional CRLF, so you often see pseudocode like the following:
def stream(endpoint)
  Socket.open(endpoint) do |socket|
    sleep 10
    # more_data is a placeholder that yields each piece of content as it arrives
    more_data do |data|
      print data.bytesize.to_s(16)  # chunk size in hex
      print "\r\n"
      print data                    # chunk payload
      print "\r\n"
    end
  end
  print "\r\n"  # the extra CRLF that terminates the chunked body
end
But the correct code is the following:
def stream(endpoint)
  Socket.open(endpoint) do |socket|
    sleep 10
    # more_data is a placeholder that yields each piece of content as it arrives
    more_data do |data|
      print data.bytesize.to_s(16)  # chunk size in hex
      print "\r\n"
      print data                    # chunk payload
      print "\r\n"
    end
  end
ensure
  # runs even when the input socket is interrupted or any other exception is raised
  print "\r\n"
end
It means that after an input socket interruption or any other exception, the wrong version of the method never gets to print the additional "CRLF" to the output socket, while the version with ensure still does.
As for how we are supposed to know it's the end of the chunked data without the proper terminator:
Many implementations ignore this violation because they don't need to know the size of the content. They just try to receive as much data as possible before the socket is closed.
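As a sketch of that lenient behavior (again Ruby, with the same hypothetical socket parameter as above), a reader can treat either the "0" size line or the server closing the connection as the end of the body, instead of insisting on the trailing CRLFs:
  # Sketch of a lenient chunked-body reader: stops at the "0" chunk or at EOF,
  # without requiring the trailing CRLFs that some servers never send.
  def read_chunked_body_lenient(socket)
    body = ""
    loop do
      size_line = socket.gets("\r\n") or break  # nil at EOF means the server closed the socket
      size = size_line.to_i(16)
      break if size.zero?                       # accept a bare "0" as the terminator
      body << socket.read(size).to_s            # chunk payload (to_s guards against EOF)
      socket.read(2)                            # CRLF after the payload, if present
    end
    body
  end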