Why is the pseudo header prepended to the UDP datagram for the computation of the UDP checksum? What's the rationale behind this?
"The purpose of using a pseudo-header is to verify that the UDP datagram has reached its correct destination. The key to understanding the pseudo-header lies in realizing that the correct destination consists of a specific machine and a specific protocol port within that machine."
The UDP pseudo header consists of the Source IP Address field, the Destination IP Address field, an Unused field set to 0, the Protocol field for UDP (17 or 0x11), and the UDP Length field. When sending a UDP message, UDP is aware of all of these values.
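To make the layout concrete, here is a minimal Python sketch of the IPv4 case: it builds the 12-byte pseudo header from the fields listed above and runs the usual 16-bit one's-complement sum over pseudo header, UDP header and payload. The function names (`ones_complement_sum16`, `ipv4_pseudo_header`, `udp_checksum`) are made up for illustration and are not taken from any library; treat it as a sketch, not a reference implementation.

```python
import socket
import struct

UDP_PROTOCOL = 17  # IP protocol number for UDP (0x11)


def ones_complement_sum16(data: bytes) -> int:
    """Sum 16-bit big-endian words, folding carries back in (end-around carry)."""
    if len(data) % 2:
        data += b"\x00"  # odd-length data is padded with a zero byte for the sum only
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def ipv4_pseudo_header(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    """Source address, destination address, zero byte, protocol, UDP length."""
    return struct.pack(
        "!4s4sBBH",
        socket.inet_aton(src_ip),
        socket.inet_aton(dst_ip),
        0,             # unused field, always 0
        UDP_PROTOCOL,
        udp_length,
    )


def udp_checksum(src_ip: str, dst_ip: str, src_port: int, dst_port: int,
                 payload: bytes) -> int:
    """Checksum over pseudo header + UDP header (checksum field zeroed) + payload."""
    udp_length = 8 + len(payload)  # 8-byte UDP header plus data
    udp_header = struct.pack("!HHHH", src_port, dst_port, udp_length, 0)
    total = ones_complement_sum16(
        ipv4_pseudo_header(src_ip, dst_ip, udp_length) + udp_header + payload
    )
    checksum = ~total & 0xFFFF
    # Over IPv4 a transmitted 0 means "no checksum", so a computed 0 is sent as 0xFFFF.
    return checksum or 0xFFFF
```

Note that the pseudo header only feeds the sum; it is never transmitted. Calling `udp_checksum` with the same ports and payload but a different destination address gives a different result, which is the whole point of including the addresses.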
The pseudo-header was introduced in TCP to preserve TCP's end-to-end property without duplicating data already available in the IP header. It consists of parts of the IP header: the relevant fields that are static, that is, fields that do not change as packets are routed.
What is a pseudo header, and when is one used? UDP's pseudo header contains the source and destination IP addresses and the protocol (type) field from the IP header, together with the UDP length. UDP uses it when computing the checksum, which enables it to discard packets that have been misdelivered due to an error.
The nearest you will get to an answer "straight from the horse's mouth" is from David P. Reed, at the following link.
http://www.postel.org/pipermail/end2end-interest/2005-February/004616.html
The short version of the answer is, "the pseudo header exists for historical reasons".
Originally, TCP/IP was a single monolithic protocol (called just TCP). When they decided to split it up into TCP and IP (and others), they didn't separate the two all that cleanly: the IP addresses were still thought of as part of TCP, but they were just "inherited" from the IP layer rather than repeated in the TCP header. The reason why the TCP checksum operates over parts of the IP header (including the IP addresses) is because they intended to use cryptography to encrypt and authenticate the TCP payload, and they wanted the IP addresses and other TCP parameters in the pseudo header to be protected by the authentication code. That would make it infeasible for a man in the middle to tamper with the IP source and destination addresses: intermediate routers wouldn't notice the tampering, but the TCP end-point would when it attempted to verify the signature.
For various reasons, none of that grand cryptographic plan came to pass, but the TCP checksum which took its place still operates over the pseudo header as though it were a useful thing to do. Yes, it gives you a teensy bit of extra protection against random errors, but that's not why it exists. Frankly, we'd be better off without it: the coupling between TCP and IP means that you have to redefine TCP when you change IP. Thus, the definition of IPv6 includes a new definition for the TCP and UDP pseudo header (see RFC 2460, section 8.1). Why the IPv6 designers chose to perpetuate this coupling rather than take the chance to abolish it is beyond me.
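To illustrate that coupling, here is a rough Python sketch of the IPv6 pseudo header as laid out in RFC 2460, section 8.1: two 16-byte addresses, a 32-bit upper-layer packet length, three zero bytes, and the next-header value. The function name `ipv6_pseudo_header` is hypothetical and exists only for this example.

```python
import ipaddress
import struct


def ipv6_pseudo_header(src_ip: str, dst_ip: str, upper_layer_length: int,
                       next_header: int = 17) -> bytes:
    """40-byte IPv6 pseudo header; next_header 17 is UDP, 6 would be TCP."""
    return struct.pack(
        "!16s16sI3xB",                        # "3x" emits three zero padding bytes
        ipaddress.IPv6Address(src_ip).packed,
        ipaddress.IPv6Address(dst_ip).packed,
        upper_layer_length,                   # 32 bits here, versus 16 bits in IPv4
        next_header,
    )
```

The result is 40 bytes rather than the 12 of the IPv4 version, with a wider length field, so every transport that checksums over the pseudo header had to be respecified for IPv6.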
From the TCP or UDP point of view, the packet does not contain IP addresses. (IP being the layer beneath them.)
Thus, to do a proper checksum, a "pseudo header" is included. It's "pseudo" because it is not actually part of the UDP datagram. It contains the most important parts of the IP header, that is, source and destination address, protocol number and data length.
This is to ensure that the UDP checksum takes into account these fields.
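As a rough illustration of what that buys the receiver, the Python sketch below builds a datagram for one destination address and then verifies it the way a receiving host might, using the addresses from the IP header of the packet as it arrived. All names (`sum16`, `pseudo_header`, `build_datagram`, `checksum_ok`) are invented for this example, and the check ignores the IPv4 special case where a checksum field of 0 means "no checksum".

```python
import ipaddress
import struct


def sum16(data: bytes) -> int:
    """16-bit one's-complement sum with end-around carry."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += int.from_bytes(data[i:i + 2], "big")
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    """IPv4 pseudo header: addresses, zero byte, protocol 17, UDP length."""
    return (ipaddress.IPv4Address(src_ip).packed
            + ipaddress.IPv4Address(dst_ip).packed
            + struct.pack("!BBH", 0, 17, udp_length))


def build_datagram(src_ip: str, dst_ip: str, src_port: int, dst_port: int,
                   payload: bytes) -> bytes:
    """Sender side: UDP header with checksum filled in, followed by the payload."""
    length = 8 + len(payload)
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    checksum = ~sum16(pseudo_header(src_ip, dst_ip, length) + header + payload) & 0xFFFF
    checksum = checksum or 0xFFFF  # a computed 0 is transmitted as 0xFFFF
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload


def checksum_ok(src_ip: str, dst_ip: str, datagram: bytes) -> bool:
    """Receiver side: the sum including the stored checksum must be all ones."""
    (length,) = struct.unpack("!H", datagram[4:6])  # UDP length field
    return sum16(pseudo_header(src_ip, dst_ip, length) + datagram[:length]) == 0xFFFF


datagram = build_datagram("192.0.2.1", "192.0.2.2", 5000, 53, b"hello")
delivered_correctly = checksum_ok("192.0.2.1", "192.0.2.2", datagram)
misdelivered = checksum_ok("192.0.2.1", "192.0.2.3", datagram)
```

Here `delivered_correctly` comes out true and `misdelivered` false: the datagram bytes are identical in both cases, and only the address the receiver feeds into the pseudo header differs.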