Why don't we send binary around instead of text over HTTP?

Tags: http, binary

It seems that binary would be more compact and could be deserialized in a standard way, so why is text used instead? Text seems inefficient, and web frameworks are forced to do little more than manipulate strings. Why isn't there a binary standard? The web would be way faster and browsers would be able to load binary pages very fast.

If I were to start a binary protocol (HBP, Hyper Binary Protocol), what sort of standards would I define?

Pierreten, asked Dec 12 '09 02:12


2 Answers

The HTTP protocol itself is readable as text. This is useful because you can telnet into any server at all and communicate with it.

Being text also lets you watch HTTP traffic easily with a program like Wireshark, which makes it much simpler to diagnose the source of a problem.
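
To make the readability point concrete, here is a minimal Python sketch of what a client would type into a telnet session and what comes back. The host, the path, and the helper names build_request and parse_status_line are made up for illustration, and nothing is actually sent over the network.

```python
from typing import Tuple


def build_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request as plain ASCII text."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # blank line that terminates the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(raw: bytes) -> Tuple[str, int, str]:
    """Split the first line of a response into version, code and reason."""
    status_line = raw.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


# Hypothetical host and path, chosen only for the example.
request = build_request("example.com", "/index.html")
print(request.decode("ascii"))

# A hand-written response head standing in for a real server's reply.
response_head = b"HTTP/1.1 200 OK\r\nServer: Apache/2.0\r\n\r\n"
print(parse_status_line(response_head))
```

Everything on the wire here can be read, typed, and debugged by a person, which is the property being described.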

HTTP defines a way to work with resources. These resources do not need to be text; they can be images or anything else. A text resource can be sent in a binary form (for example, gzip-compressed) by specifying the Content-Encoding header. The resource type is specified via the Content-Type header.

So your question really only applies to the HTTP protocol itself, and not to the payload, which is the resource.

"The web would be way faster and browsers would be able to load binary pages very fast."

I don't think this is true. The slowest part is probably connection establishment and TCP slow start.

Here is an example of how an HTTP response would send a text resource with a binary representation:

HTTP/1.1 200 OK
Server: Apache/2.0
Content-Encoding: gzip
Content-Length: 1533
Content-Type: text/html; charset=ISO-8859-1
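
A rough Python sketch of how such a response could be assembled and then read back, using the standard gzip module. The HTML body and the variable names are invented for illustration; a real server and browser do considerably more than this.

```python
import gzip

# Hypothetical page content; repetitive text compresses well.
html = "<html><body>" + "Hello, world! " * 50 + "</body></html>"
raw_body = html.encode("iso-8859-1")
compressed_body = gzip.compress(raw_body)

# Text headers describe the binary payload that follows them.
headers = {
    "Content-Type": "text/html; charset=ISO-8859-1",
    "Content-Encoding": "gzip",
    "Content-Length": str(len(compressed_body)),
}
head = "HTTP/1.1 200 OK\r\n"
head += "".join(f"{name}: {value}\r\n" for name, value in headers.items())
head += "\r\n"  # blank line separates headers from body
response = head.encode("ascii") + compressed_body

# Client side: split at the first blank line, then undo the encoding.
head_bytes, _, body = response.partition(b"\r\n\r\n")
decoded = gzip.decompress(body).decode("iso-8859-1")

print(head_bytes.decode("ascii"))
print(len(raw_body), "bytes of text sent as", len(body), "bytes of gzip")
```

The headers stay readable text while the body is arbitrary bytes, so the compactness of the payload does not depend on the protocol itself being binary.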

Brian R. Bondy, answered Oct 11 '22 14:10


Text-based protocols have many important advantages:

  • Assuming you're using UTF-8 or another octet-oriented encoding, there are no byte order issues to contend with.
  • Getting everybody to agree on text-based schemas (such as those defined in XML) is difficult enough. Imagine trying to get everybody to agree on how many bits a number should be in the binary protocol.
    • Relatedly, imagine trying to get them to agree on a floating point representation. This isn't much of a hypothetical -- IBM threatened to derail the ECMAScript 5 standardization effort over floating point representation issues.
  • The web is text-based, and I don't just mean on a protocol level. Much of the content is text (at one time, almost ALL of the content was text). As such, modern programming languages have grown up around the idea that they are working with text, and that parsing binary formats is less important.
    • Not too long ago, I had to generate an obscure binary format in Python to interface with a legacy system. It turned out to be much more painful than I would have imagined. Parsing it would have been far, far worse.
  • A developer can't look at a stream of bytes and say "oh, my string length is missing" the way they can look at e.g. an XML document and say "oh, that element didn't get closed". Readable text makes development and troubleshooting far easier.
  • Performance is overrated, and XML parsers are "fast enough" these days. If you're doing things that really have to have every last bit of performance squeezed out of the hardware, you're almost certainly not doing anything web-based, and will probably be constructing your own binary protocol to communicate between two applications you already control.
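
The byte-order point from the list above can be shown with a small Python sketch using the standard struct module. The value and the 32-bit field width are arbitrary choices for the example, not part of any real protocol.

```python
import struct

value = 1

# The same 32-bit unsigned integer, packed in both byte orders.
big = struct.pack(">I", value)     # big-endian: b'\x00\x00\x00\x01'
little = struct.pack("<I", value)  # little-endian: b'\x01\x00\x00\x00'

# A receiver that assumes the wrong byte order silently gets a wrong number.
misread = struct.unpack("<I", big)[0]

# The text form has no byte order to agree on.
text = str(value).encode("utf-8")

print(big, little)
print(misread)   # 16777216
print(text)      # b'1'
```

A binary protocol has to pin down both the width and the byte order of every number, while the text form only needs agreement on the character encoding.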
Nicholas Knight, answered Oct 11 '22 14:10