Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What techniques can you use to encode data on a lossy one-way channel?

Imagine you have a channel of communication that is inherently lossy and one-way. That is, there is some inherent noise that is impossible to remove that causes, say, random bits to be toggled. Also imagine that it is one way - you cannot request retransmission.

But you need to send data over it regardless. What techniques can you use to send numbers and text over that channel?

  1. Is it possible to encode numbers so that even with random bit twiddling they can still be interpreted as values close to the original (lossy transmittion)?

  2. Is there a way to send a string of characters (ASCII, say) in a lossless fashion?

This is just for fun. I know you can use morse code or any very low frequency binary communication. I know about parity bits and checksums to detect errors and retrying. I know that you might as well use an analog signal. I'm just curious if there are any interesting computer-sciency techniques to send this stuff over a lossy channel.

like image 608
Frank Krueger Avatar asked Aug 05 '09 06:08

Frank Krueger


3 Answers

Depending on some details that you don't supply about your lossy channel, I would recommend, first using a Gray code to ensure that single-bit errors result in small differences (to cover your desire for loss mitigation in lossy transmission), and then possibly also encoding the resulting stream with some "lossless" (==tries to be loss-less;-) encoding.

Reed-Solomon and variants thereof are particularly good if your noise episodes are prone to occur in small bursts (several bit mistakes within, say, a single byte), which should interoperate well with Gray coding (since multi-bit mistakes are the killers for the "loss mitigation" aspect of Gray, designed to degrade gracefully for single-bit errors on the wire). That's because R-S is intrinsically a block scheme, and multiple errors within one block are basically the same as a single error in it, from R-S's point of view;-).

R-S is particularly awesome if many of the errors are erasures -- to put it simply, an erasure is a symbol that has most probably been mangled in transmission, BUT for which you DO know the crucial fact that it HAS been mangled. The physical layer, depending on how it's designed, can often have hints about that fact, and if there's a way for it to inform the higher layers, that can be of crucial help. Let me explain erasures a bit...:

Say for a simplified example that a 0 is sent as a level of -1 volt and a 1 is send as a level of +1 volt (wrt some reference wave), but there's noise (physical noise can often be well-modeled, ask any competent communication engineer;-); depending on the noise model the decoding might be that anything -0.7 V and down is considered a 0 bit, anything +0.7 V and up is considered a 1 bit, anything in-between is considered an erasure, i.e., the higher layer is told that the bit in question was probably mangled in transmission and should therefore be disregarded. (I sometimes give this as one example of my thesis that sometimes abstractions SHOULD "leak" -- in a controlled and architected way: the Martelli corollary to Spolsky's Law of Leaky Abstractions!-).

A R-S code with any given redundancy ratio can be about twice as effective at correcting erasures (errors the decoder is told about) as it can be at correcting otherwise-unknown errors -- it's also possible to mix both aspects, correcting both some erasures AND some otherwise-unknown errors.

As the cherry on top, custom R-S codes can be (reasonably easily) designed and tailored to reduce the probability of uncorrected errors to below any required threshold θ given a precise model of the physical channel's characteristics in terms of both erasures and undetected errors (including both probability and burstiness).

I wouldn't call this whole area a "computer-sciency" one, actually: back when I graduated (MSEE, 30 years ago), I was mostly trying to avoid "CS" stuff in favor of chip design, system design, advanced radio systems, &c -- yet I was taught this stuff (well, the subset that was already within the realm of practical engineering use;-) pretty well.

And, just to confirm that things haven't changed all that much in one generation: my daughter just got her MS in telecom engineering (strictly focusing on advanced radio systems) -- she can't design just about any serious program, algorithm, or data structure (though she did just fine in the mandatory courses on C and Java, there was absolutely no CS depth in those courses, nor elsewhere in her curriculum -- her daily working language is matlab...!-) -- yet she knows more about information and coding theory than I ever learned, and that's before any PhD level study (she's staying for her PhD, but that hasn't yet begun).

So, I claim these fields are more EE-y than CS-y (though of course the boundaries are ever fuzzy -- witness the fact that after a few years designing chips I ended up as a SW guy more or less by accident, and so did a lot of my contemporaries;-).

like image 153
Alex Martelli Avatar answered Oct 02 '22 22:10

Alex Martelli


This question is the subject of coding theory.

like image 43
starblue Avatar answered Oct 02 '22 23:10

starblue


Probably one of the better-known methods is to use Hamming code. It might not be the best way of correcting errors on large scales, but it's incredibly simple to understand.

like image 45
Ryan Fox Avatar answered Oct 02 '22 23:10

Ryan Fox