Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does PPP or Ethernet recover from errors?

Looking at the data-link level standards, such as PPP general frame format or Ethernet, it's not clear what happens if the checksum is invalid. How does the protocol know where the next frame begins?

Does it just scan for the next occurrence of "flag" (in the case of PPP)? If so, what happens if the packet payload just so happens to contain "flag" itself? My point is that, whether packet-framing or "length" fields are used, it's not clear how to recover from invalid packets where the "length" field could be corrupt or the "framing" bytes could just so happen to be part of the packet payload.

UPDATE: I found what I was looking for (which isn't strictly what I asked about) by looking up "GFP CRC-based framing". According to Communication networks

The GFP receiver synchronizes to the GFP frame boundary through a three-state process. The receiver is initially in the hunt state where it examines four bytes at a time to see if the CRC computed over the first two bytes equals the contents of the next two bytes. If no match is found the GFP moves forward by one byte as GFP assumes octet synchronous transmission given by the physical layer. When the receiver finds a match it moves to the pre-sync state. While in this intermediate state the receiver uses the tentative PLI (payload length indicator) field to determine the location of the next frame boundary. If a target number N of successful frame detection has been achieved, then the receiver moves into the sync state. The sync state is the normal state where the receiver examines each PLI, validates it using cHEC (core header error checking), extracts the payload, and proceeds to the next frame.

In short, each packet begins with "length" and "CRC(length)". There is no need to escape any characters and the packet length is known ahead of time.

There seems to be two major approaches to packet framing:

  • encoding schemes (bit/byte stuffing, Manchester encoding, 4b5b, 8b10b, etc)
  • unmodified data + checksum (GFP)

The former is safer, the latter is more efficient. Both are prone to errors if the payload just happens to contain a valid packet and line corruption causes the proceeding bytes to contain the "start of frame" byte sequence but that sounds highly improbable. It's difficult to find hard numbers for GFP's robustness, but a lot of modern protocols seem to use it so one can assume that they know what they're doing.

like image 311
Gili Avatar asked Jun 13 '09 13:06

Gili


People also ask

Does Ethernet do error correction?

Ethernet uses a sliding window to resend lost frames (error correction). It uses a sequence number to account for unordered frames (error correction). It has a frame check sequence to discard corrupted frames (error detection).

When CRC error is detected at Ethernet layer the frame is?

It will be retransmitted.

What is the use of frame check sequence in the Ethernet?

Frame Check Sequence (FCS) refers to extra bits added to the frame for error detection. It is used for HDLC error detection. It is 2 byte or 4-byte field that is used to detect errors in the address field, control field, and information field of frames transmitted across the network.

What is the overhead to send an IP packet using PPP?

To transmit 5000 bytes, this means 4 packets must be transmitted, which means an overhead of 72 bytes (18 * 4). Note that the overhead becomes bigger when you include things like the IP frame which contains a TCP frame. For PPP, as you've already shown, you can send 492 bytes per frame.


1 Answers

Both PPP and Ethernet have mechanisms for framing - that is, for breaking a stream of bits up into frames, in such a way that if a receiver loses track of what's what, it can pick up at the start of the next frame. These sit right at the bottom of the protocol stack; all the other details of the protocol are built on the idea of frames. In particular, the preamble, LCP, and FCS are at a higher level, and are not used to control framing.

PPP, over serial links like dialup, is framed using HDLC-like framing. A byte value of 0x7e, called a flag sequence, indicates the start of the frame. The frame continues until the next flag byte. Any occurrence of the flag byte in the content of the frame is escaped. Escaping is done by writing 0x7d, known as the control escape byte, followed by the byte to be escaped xor'd with 0x20. The flag sequence is escaped to 0x5e; the control escape itself also has to be escaped, to 0x5d. Other values can also be escaped if their presence would upset the modem. As a result, if a receiver loses synchronisation, it can just read and discard bytes until it sees a 0x7e, at which points it knows it's at the start of a frame again. The contents of the frame are then structured, containing some odd little fields that aren't really important, but are retained from an earlier IBM protocol, along with the PPP packet (called a protocol data unit, PDU), and also the frame check sequence (FCS).

Ethernet uses a logically similar approach, of having symbols which are recognisable as frame start and end markers rather than data, but rather than having reserved bytes plus an escape mechanism, it uses a coding scheme which is able to express special control symbols that are distinct from data bytes - a bit like using punctuation to break up a sequence of letters. The details of the system used vary with the speed.

Standard (10 Mb/s) ethernet is encoded using a thing called Manchester encoding, in which each bit to be transmitted is represented as two successive levels on the line, in such a way that there is always a transition between levels in every bit, which helps the receiver to stay synchronised. Frame boundaries are indicated by violating the encoding rule, leading to there being a bit with no transition (i read this in a book years ago, but can't find a citation online - i might be wrong about this). In effect, this system expands the binary code to three symbols - 0, 1, and violation.

Fast (100 Mb/s) ethernet uses a different coding scheme, based on a 5b/4b code, where groups of four data bits (nybbles) are represented as groups of five bits on the wire, and transmitted directly, without the Manchester scheme. The expansion to five bits lets the sixteen necessary patterns used be chosen to fulfil the requirement for frequent level transitions, again to help the receiver stay synchronised. However, there's still room to choose some extra symbols, which can be transmitted but don't correspond to data value, in effect, expanding the set of nybbles to twenty-four symbols - the nybbles 0 to F, and symbols called Q, I, J, K, T, R, S and H. Ethernet uses a JK pair to mark frame starts, and TR to mark frame ends.

Gigabit ethernet is similar to fast ethernet, but with a different coding scheme - the optical fibre versions use an 8b/10b code instead of the 5b/4b code, and the twisted-pair version uses some very complex quinary code arrangement which i don't really understand. Both approaches yield the same result, which is the ability to transmit either data bytes or one of a small set of additional special symbols, and those special symbols are used for framing.

On top of this basic framing structure, there is then a fixed preamble, followed by a frame delimiter, and some control fields of varying pointlessness (hello, LLC/SNAP!). Validity of these fields can be used to validate the frame, but they can't be used to define frames on their own.

like image 81
Tom Anderson Avatar answered Jan 04 '23 05:01

Tom Anderson