Binary trees (and hence ordered forests) can be represented as binary strings. The binary string is obtained by traversing a binary tree in preorder, recording a 1 for every node and a 0 for every empty subtree (null link).
This means that if I'm given a binary tree, I can do a preorder traversal and produce a binary sequence representation.
Is the opposite also possible? If I'm given this binary sequence 11011000101101010001
, can I draw the binary tree?
Yes you can.
Label the internal nodes as a
and the external ones as b
; of course you can assume a
as 0
and b
as 1
(and vice-versa). But I think is easier to distinguish letters than numbers (although this matter of "taste").
If you don't know what are the "external nodes", then you can assume that they are the NULL
pointers.
Now, the preorder traversal of any tree labeled as I have said forms a word belonging to Lukasiewicz language. It could be shown that this word is unique. That is, for any pair of trees t1
and t2
, code(t1) != code(t2)
; always! In addition (and this should have been the reason for which you are concerned to Huffman encoding), the Lukasiewicz language is prefix.
For example, for the tree
Its preorder traversal is aaaabbabbaabbbababb
or 000011011001110111
I leave to you the code for generating a code; it is a preorder traversal. If you are interested in reversing it, that is, to get the tree given the code, then this pseudo-code, which tries to answer your question, would do it:
Node * code_to_tree(char *& code)
{
if (*code++ == 'b')
return nullptr;
Node * p = new Node;
LLINK(p) = code_to_tree(code);
RLINK(p) = code_to_tree(code);
return p;
}
In a real implementation, you would read bits.
The routine above assumes that the code is right; that is, it was generated from a binary tree. Note that not all words composed by a
's and b
's belong to the Lukasiewicz language. But some showable rules could be applied on them.
First, the number of b
's must be exactly the number of a
's plus one. Second, if each a
weights one (1) and each b
weights minus one (-1), then, at the exception of last b
, the addition through all the word for each letter never can be smaller than zero. Only at the end of the word the addition will be -1
. If the word does not satisfy this condition, then you can assume that it is invalid.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With