For my assignment, I am to do a encode and decode for huffman trees. I have a problem creating my tree, and I am stuck.
Don't mind the print statements - they are just for me to test and see what the output is when my function runs.
For the first for loop, I got all the values and index from the text file I used in my main block for testing.
In the second for loop I inserted all the stuff into the priority queue.
I am so stuck about where to go next - I'm trying to make nodes, but I am confused about how to progress. Can someone tell me if I'm doing this right?
def _create_code(self, frequencies):
'''(HuffmanCoder, sequence(int)) -> NoneType
iterate over index into the sequence keeping it 256 elements long, '''
#fix docstring
p = PriorityQueue()
print frequencies
index = 0
for value in frequencies:
if value != 0:
print value #priority
print index #elm
print '-----------'
index = index + 1
for i in range(len(frequencies)):
if frequencies[i] != 0:
p.insert(i, frequencies[i])
print i,frequencies[i]
if p.is_empty():
a = p.get_min()
b = p.get_min()
n1 = self.HuffmanNode(None, None, a)
n2 = self.HuffmanNode(None, None, b)
print a, b, n1, n2
while not p.is_empty():
p.get_min()
I manually inserted the first two to start my tree, is that correct?
How do I keep going? I know the idea of it, just code-wise I am very stuck.
This is using python by the way. I tried looking at Wikipedia, I know the steps, I just need help on code and how I should keep going, thanks!
The HuffmanNode comes from this nested class:
class HuffmanNode(object):
def __init__(self, left=None, right=None, root=None):
self.left = left
self.right = right
self.root = root
The Huffman code for each letter is derived from a full binary tree called the Huffman coding tree, or simply the Huffman tree. Each leaf of the Huffman tree corresponds to a letter, and we define the weight of the leaf node to be the weight (frequency) of its associated letter.
Huffman Decoding AlgorithmStart from the root node. If the current bit in the given data is 0,then move to the left node of the tree. If the current bit in the given data is 1,then move to the right node of the tree. During the traversal if leaf node is encountered then print character of that leaf node.
A Huffman coding tree or Huffman tree is a full binary tree in which each leaf of the tree corresponds to a letter in the given alphabet. Define: the weighted path length of a leaf to be its weight times its depth.
The Huffman algorithm in Wikipedia tells you exactly how to create the node tree, so your program can be based on that algorithm, or another like it. Here is a Python program with comments showing the corresponding wikipedia algorithm step. The test data is frequencies of the letters of the alphabet in English text.
Once the node tree is created, you need to walk it down to assign Huffman codes to each symbol in your dataset. Since this is homework, that step is up to you, but a recursive algorithm is the simplest and most natural way to handle it. It's only six more lines of code.
import queue
class HuffmanNode(object):
def __init__(self, left=None, right=None, root=None):
self.left = left
self.right = right
self.root = root # Why? Not needed for anything.
def children(self):
return((self.left, self.right))
freq = [
(8.167, 'a'), (1.492, 'b'), (2.782, 'c'), (4.253, 'd'),
(12.702, 'e'),(2.228, 'f'), (2.015, 'g'), (6.094, 'h'),
(6.966, 'i'), (0.153, 'j'), (0.747, 'k'), (4.025, 'l'),
(2.406, 'm'), (6.749, 'n'), (7.507, 'o'), (1.929, 'p'),
(0.095, 'q'), (5.987, 'r'), (6.327, 's'), (9.056, 't'),
(2.758, 'u'), (1.037, 'v'), (2.365, 'w'), (0.150, 'x'),
(1.974, 'y'), (0.074, 'z') ]
def create_tree(frequencies):
p = queue.PriorityQueue()
for value in frequencies: # 1. Create a leaf node for each symbol
p.put(value) # and add it to the priority queue
while p.qsize() > 1: # 2. While there is more than one node
l, r = p.get(), p.get() # 2a. remove two highest nodes
node = HuffmanNode(l, r) # 2b. create internal node with children
p.put((l[0]+r[0], node)) # 2c. add new node to queue
return p.get() # 3. tree is complete - return root node
node = create_tree(freq)
print(node)
# Recursively walk the tree down to the leaves,
# assigning a code value to each symbol
def walk_tree(node, prefix="", code={}):
return(code)
code = walk_tree(node)
for i in sorted(freq, reverse=True):
print(i[1], '{:6.2f}'.format(i[0]), code[i[1]])
When run on the alphabet data, the resulting Huffman codes are:
e 12.70 100
t 9.06 000
a 8.17 1110
o 7.51 1101
i 6.97 1011
n 6.75 1010
s 6.33 0111
h 6.09 0110
r 5.99 0101
d 4.25 11111
l 4.03 11110
c 2.78 01001
u 2.76 01000
m 2.41 00111
w 2.37 00110
f 2.23 00100
g 2.02 110011
y 1.97 110010
p 1.93 110001
b 1.49 110000
v 1.04 001010
k 0.75 0010111
j 0.15 001011011
x 0.15 001011010
q 0.10 001011001
z 0.07 001011000
One more solution returning a dictionary {label:code}
and a recursive dictionary tree
containing the resulting graph. The input vals
is in form of dictionary {label:freq}
:
def assign_code(nodes, label, result, prefix = ''):
childs = nodes[label]
tree = {}
if len(childs) == 2:
tree['0'] = assign_code(nodes, childs[0], result, prefix+'0')
tree['1'] = assign_code(nodes, childs[1], result, prefix+'1')
return tree
else:
result[label] = prefix
return label
def Huffman_code(_vals):
vals = _vals.copy()
nodes = {}
for n in vals.keys(): # leafs initialization
nodes[n] = []
while len(vals) > 1: # binary tree creation
s_vals = sorted(vals.items(), key=lambda x:x[1])
a1 = s_vals[0][0]
a2 = s_vals[1][0]
vals[a1+a2] = vals.pop(a1) + vals.pop(a2)
nodes[a1+a2] = [a1, a2]
code = {}
root = a1+a2
tree = {}
tree = assign_code(nodes, root, code) # assignment of the code for the given binary tree
return code, tree
This can be used as:
freq = [
(8.167, 'a'), (1.492, 'b'), (2.782, 'c'), (4.253, 'd'),
(12.702, 'e'),(2.228, 'f'), (2.015, 'g'), (6.094, 'h'),
(6.966, 'i'), (0.153, 'j'), (0.747, 'k'), (4.025, 'l'),
(2.406, 'm'), (6.749, 'n'), (7.507, 'o'), (1.929, 'p'),
(0.095, 'q'), (5.987, 'r'), (6.327, 's'), (9.056, 't'),
(2.758, 'u'), (1.037, 'v'), (2.365, 'w'), (0.150, 'x'),
(1.974, 'y'), (0.074, 'z') ]
vals = {l:v for (v,l) in freq}
code, tree = Huffman_code(vals)
text = 'hello' # text to encode
encoded = ''.join([code[t] for t in text])
print('Encoded text:',encoded)
decoded = []
i = 0
while i < len(encoded): # decoding using the binary graph
ch = encoded[i]
act = tree[ch]
while not isinstance(act, str):
i += 1
ch = encoded[i]
act = act[ch]
decoded.append(act)
i += 1
print('Decoded text:',''.join(decoded))
One can visualize the tree with Graphviz as:
The figure was generated by the following script as (Graphviz is needed):
def draw_tree(tree, prefix = ''):
if isinstance(tree, str):
descr = 'N%s [label="%s:%s", fontcolor=blue, fontsize=16, width=2, shape=box];\n'%(prefix, tree, prefix)
else: # Node description
descr = 'N%s [label="%s"];\n'%(prefix, prefix)
for child in tree.keys():
descr += draw_tree(tree[child], prefix = prefix+child)
descr += 'N%s -> N%s;\n'%(prefix,prefix+child)
return descr
import subprocess
with open('graph.dot','w') as f:
f.write('digraph G {\n')
f.write(draw_tree(tree))
f.write('}')
subprocess.call('dot -Tpng graph.dot -o graph.png', shell=True)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With