Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Compressing a sinewave table

I have a large array with 1024 entries that have 7 bit values in range(14, 86)

This means there are multiple range of indices that have the same value.

For example,

consider the index range 741 to 795. It maps to 14
consider the index range 721 to 740. It maps to 15
consider the index range 796 to 815. It maps to 15

I want to feed this map to a python program that would spew out the following:

if((index >= 741) and (index <= 795)) return 14;
if((index >= 721) and (index <= 740)) return 15;
if((index >= 796) and (index <= 815)) return 15;

Some code to groupby mapped value is ready but I am having difficulty coding up an expression using pairwise.

Anyone has done something similar before?

I have uploaded the dataset in two forms:

Usual, ordered by index.

Grouped by mapped value.

like image 525
PoorLuzer Avatar asked Aug 06 '11 03:08

PoorLuzer


2 Answers

If you don't mind slightly different values due to rounding, I can compress that really well for you.

from math import pi, sin
interval=2*pi/1024
sinval=lambda i:int(round(sin(i*interval)*36))+50

Here is code to actually do what you want; it works with

vals = sorted((sinval(i), i) for i in range(1024))

as test data. You would need to switch the order of val and index in the for loop here if you've got indexes in the first column.

ranges, oldval, oldidx = [[0, 0]], 0, 0
for val, index in vals:
    if not (val == oldval and index == oldidx + 1):
        ranges[-1].append(oldidx)
        ranges.append([val, index])
    oldval, oldidx = val, index
ranges[-1].append(oldidx)
ranges.pop(0)
ifs = ('if((index >= {1}) and (index <= {2})) return {0};\n'.format(val, start, end)
            for val, start, end in ranges)
print ''.join(ifs)

Edit: Whoops, I was missing a line. Fixed. Also, you're multiplier was actually 36 not 35, I must have rounded (14, 86) to (15, 85) in my head.

Edit 2: To show you how to store only a quarter of the table.

from math import pi, sin

full = 1024
half = 512
quarter = 256
mag = 72
offset = 50

interval = 2 * pi / full

def sinval(i):
    return int(round(sin(i * interval) * (mag // 2))) + offset

vals = [sinval(i) for i in range(quarter)]

def sintable(i):
    if  i >= half + quarter:
        return 2 * offset - vals[full - i - 1]
    elif  i >= half:
        return 2 * offset - vals[i - half]
    elif i >= quarter:
        return vals[half - i - 1]
    else:
        return vals[i]

for i in range(full):
    assert -1 <= sinval(i) - sintable(i) <= 1

If you subtract the offset out of the table just make the first two -vals[...] instead.

Also, the compare at the bottom is fuzzy because I get 72 off-by-one errors for this. This is simply because your values are rounded to integers; they're all places that you're halfway in between two values so there is very little decrease in accuracy.

like image 188
agf Avatar answered Oct 13 '22 15:10

agf


After closing, I belatedly found this solution "What's the most Pythonic way to identify consecutive duplicates in a list?".


NB: with a periodic fn like sine, you can get by by only storing a quarter (i.e. 256 values) or half of the table, then perform a little (fixed-point) arithmetic on the index at lookup time. As I commented, if you further don't store the offset of +50, you need one bit less, at the cost of one integer addition after lookup time. Hence, 79% compression easily achievable. RLE will give you more. Even if the fn has noise, you can still get decent compression with this general approach.

As agf pointed out, your f(n) = 50 + 36*sin(72*pi*n/1024) = 50 + g(n), say.

So tabulate the 256 values of g(n) = 36*sin(72*pi*n/1024), only for the range n=0..255

Then f(n) is easily computed by:

if 0 <= n < 256, f(n) = 50 + g(n)
if 256 <= n < 512, f(n) = 50 + g(511-n)
if 512 <= n < 768, f(n) = 50 - g(n-512)
if 768 <= n < 1024, f(n) = 50 - g(1023-n)

Anyway here's a general table compressor solution which will generate (istart,iend,value) triples.

I knocked my head off how to do this more Pythonically using list comprehensions and itertools.takewhile() ; needs polishing.

#import itertools

table_="""
    0       50
    1       50
    ...
    1021    49
    1022    50
    1023    50""".split()

# Convert values to int. Throw away the indices - will recover them with enumerate()
table = [int(x) for x in table_[1::2]]

compressed_table = []
istart = 0
for i,v in enumerate(table):
    if v != table[i-1]:
        iend = i-1
        compressed_table.append((istart,iend,table[i-1]))
        istart = i
    else:
        continue # skip identical values
# Slightly ugly: append the last value, when the iterator was exhausted
compressed_table.append((istart,i,table[i]))

(NB I started the table-compressor approach before agf changed his approach... was trying to get an itertools or list-comprehension solution)

like image 34
smci Avatar answered Oct 13 '22 16:10

smci