
Do I need to secure my messages with checksum when using TCP?

I'm using TCP as the network protocol, and I prefix each message with its size (and potentially a checksum?) before sending it over the wire. Does it make sense to calculate and transmit a checksum of the message, to ensure that the message is delivered unchanged (if and when it is delivered), for example in the face of a network error? Currently I send a 4-byte size plus a 2-byte checksum (CRC-16) before the message itself. The other endpoint correctly identifies the expected message length, reads the message, and validates the checksum.
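For readers who want to see the framing described above, here is a minimal sketch in Python rather than C#. Several details are assumptions, since the question does not state them: big-endian byte order, the CRC-CCITT variant of CRC-16 with an initial value of 0 (as computed by the standard library's `binascii.crc_hqx`), and the helper names `encode_frame`, `decode_frame` and `read_exact`, which are hypothetical.

```python
import binascii
import io
import struct

# Assumed layout: 4-byte big-endian length, 2-byte big-endian CRC-16, then the payload.
HEADER = struct.Struct(">IH")


def crc16(data: bytes) -> int:
    # CRC-CCITT (polynomial 0x1021); the initial value 0 is an assumption.
    return binascii.crc_hqx(data, 0)


def encode_frame(payload: bytes) -> bytes:
    return HEADER.pack(len(payload), crc16(payload)) + payload


def read_exact(stream, n: int) -> bytes:
    # A stream read may return fewer bytes than requested, so loop until n bytes arrive.
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("stream closed mid-frame")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def decode_frame(stream) -> bytes:
    length, expected = HEADER.unpack(read_exact(stream, HEADER.size))
    payload = read_exact(stream, length)
    if crc16(payload) != expected:
        raise ValueError("CRC-16 mismatch")
    return payload


# Round trip through an in-memory stream standing in for a socket.
frame = encode_frame(b"hello")
message = decode_frame(io.BytesIO(frame))
```

With a real connection, `sock.makefile("rb")` would give a stream object that can be passed to `decode_frame` in place of the `io.BytesIO` used here.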

I know that TCP has an internal packet validation mechanism, and I have a strong feeling that my message validation at the application level is redundant, but I'm not sure and would like your advice before I make a decision.

I'm developing a client-server application with tens of thousands of potential connections to the server daily. Even a single damaged byte in any message might cause a whole chain of incorrect messages to be exchanged, which is unacceptable (well, almost all client-server applications have the same requirement, don't they?). So I want to be sure: can I safely trust TCP's internal reliability, or is it better to provide my own checksum validation mechanism? I'm talking about small, two-byte checksums (CRC-16), not about digitally signing messages and so on. (The system is developed in .NET (C#) using sockets, if that makes any difference.)

TX_ asked Oct 05 '13 13:10




2 Answers

According to this paper "the checksum will fail to detect errors for roughly 1 in 16 million to 10 billion packets". Assuming a packet size of 1024 bytes, this amounts to one undetected error every 16 GB to 10 TB of network traffic.
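The arithmetic behind that estimate can be checked with a few lines of Python. This is only a worked version of the numbers quoted above; the function name `bytes_per_undetected_error` is made up for illustration, and decimal GB/TB are assumed.

```python
PACKET_SIZE = 1024  # bytes, the packet size assumed in the answer


def bytes_per_undetected_error(packets_per_error: int, packet_size: int = PACKET_SIZE) -> int:
    # Traffic volume that passes, on average, between two undetected errors.
    return packets_per_error * packet_size


low = bytes_per_undetected_error(16_000_000)        # 1 in 16 million packets
high = bytes_per_undetected_error(10_000_000_000)   # 1 in 10 billion packets

summary = f"{low / 1e9:.1f} GB to {high / 1e12:.1f} TB"
print(summary)  # prints "16.4 GB to 10.2 TB"
```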

Many protocols like HTTP, FTP, SMTP and probably many more rely on the checksums in the underlying layers. It is my belief that this practice is questionable given the above numbers.

Sidenote: The same is true for hard drives. Typical desktop drives are specified with an unrecoverable read error rate on the order of 1 bit per 10 TB read. Read your 2 TB disk 5 times and on average you will hit one such error.

To answer your question: if your tolerance for very rare, spurious failures is medium to high, don't bother checksumming. If you can't tolerate any corruption, add a checksum to your protocol.
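One reason an application-level CRC adds real protection is that the TCP checksum is a plain 16-bit ones' complement sum, so it cannot notice 16-bit words arriving in a different order. The sketch below illustrates this with a simplified checksum in the style of RFC 1071 (it omits the pseudo-header that real TCP also covers) and compares it with `zlib.crc32`. CRC-32 is used here only as an example of a stronger check, not as a recommendation from the answer.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement sum of 16-bit words (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


original = b"\x12\x34\xab\xcd"
reordered = b"\xab\xcd\x12\x34"  # same 16-bit words, swapped

# Addition is commutative, so the simple sum cannot tell these apart.
same_sum = internet_checksum(original) == internet_checksum(reordered)
# A CRC depends on bit positions, so the swap changes it.
same_crc = zlib.crc32(original) == zlib.crc32(reordered)
```

Here `same_sum` is `True` while `same_crc` is `False`: the swap slips past the sum but not past the CRC.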

usr answered Oct 12 '22 13:10


As far as TCP is concerned, as others have pointed out, it is not 100% reliable, and some messages can get corrupted during transmission.

To preserve message integrity, you will have to use a checksum such as a CRC at the application level.

However, if you are using SSL/TLS, you do not need a CRC at the application level, because integrity checking is already done for you. Messages exchanged over SSL/TLS are checked for integrity by the TLS library. Almost all SSL/TLS cipher suites perform message authentication. To see how a given suite does it, look at its name. TLS 1.2 style names have three parts. For example,

  "TLS_RSA_WITH_AES_256_GCM_SHA384" has the following three parts:

  TLS_RSA     => Asymmetric algorithm for key exchange during the initial handshake.
  AES_256_GCM => Symmetric algorithm for message encryption. GCM is an authenticated (AEAD) mode, so it also protects message integrity.
  SHA384      => Hash algorithm. In GCM suites it is used for key derivation; in CBC suites such as TLS_RSA_WITH_AES_256_CBC_SHA256 it is the HMAC hash for message integrity.

So in the above cipher suite every record is authenticated (by GCM here, by an HMAC in CBC suites), and that is why you do not have to do a CRC in your application.
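The three-part split of a suite name can be sketched in a few lines of Python. This is an illustration only: `split_suite_name` and `SuiteParts` are hypothetical names, and the sketch handles only TLS 1.2 style names of the form `TLS_<key exchange>_WITH_<cipher>_<hash>`. TLS 1.3 names and suites without a hash suffix are not covered. The `aead` flag is an addition of this sketch; it marks suites whose cipher (such as GCM) authenticates each record itself instead of relying on a separate HMAC.

```python
from typing import NamedTuple


class SuiteParts(NamedTuple):
    key_exchange: str
    cipher: str
    hash_name: str
    aead: bool


def split_suite_name(name: str) -> SuiteParts:
    # Split "TLS_<kx>_WITH_<cipher>_<hash>" at the "_WITH_" marker.
    prefix, sep, rest = name.partition("_WITH_")
    if not sep or not prefix.startswith("TLS_"):
        raise ValueError(f"unsupported suite name: {name}")
    # The last underscore-separated token is the hash; the rest is the cipher.
    cipher, _, hash_name = rest.rpartition("_")
    if not cipher:
        raise ValueError(f"unsupported suite name: {name}")
    # GCM and ChaCha20-Poly1305 are AEAD constructions.
    aead = cipher.endswith("_GCM") or cipher == "CHACHA20_POLY1305"
    return SuiteParts(prefix, cipher, hash_name, aead)


gcm = split_suite_name("TLS_RSA_WITH_AES_256_GCM_SHA384")
cbc = split_suite_name("TLS_RSA_WITH_AES_256_CBC_SHA256")
```

For the GCM suite the result is `("TLS_RSA", "AES_256_GCM", "SHA384", True)`, and for the CBC suite the `aead` field is `False`.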

Ahmed answered Oct 12 '22 12:10