chess:programming:huffman:example
Differences
This shows you the differences between two versions of the page.
Both sides previous revisionPrevious revisionNext revision | Previous revision | ||
chess:programming:huffman:example [2021/10/29 13:28] – peter | chess:programming:huffman:example [2021/10/29 13:56] (current) – [Stepping through the code] peter | ||
---|---|---|---|
Line 172: | Line 172: | ||
---- | ---- | ||
- | Now we just need an algorithm | + | ===== Build the low-cost tree ===== |
+ | |||
+ | < | ||
+ | Huffman (C) | ||
+ | n = the size of C | ||
+ | |||
+ | // Insert all the elements of C into Q, using the value of the node as the priority. | ||
+ | for i in 1..n-1 do | ||
+ | z = a new tree node | ||
+ | x = Extract-Minimum (Q) | ||
+ | y = Extract-Minimum (Q) | ||
+ | left node of z = x | ||
+ | right node of z = y | ||
+ | f[z] = f[x] + f[y] | ||
+ | Insert (Q, z) | ||
+ | end for | ||
+ | |||
+ | return Extract-Minimum (Q) as the complete tree | ||
+ | </ | ||
+ | |||
+ | <WRAP info> | ||
+ | **NOTE: | ||
+ | |||
+ | * **B(T)** is defined as above; | ||
+ | * it can be stored efficiently in an array indexed by characters. | ||
+ | * It is extended as needed to accommodate the values of internal tree nodes. | ||
+ | * **C** is the set of characters represented as leaf tree nodes. | ||
+ | * **Q** is a priority queue of tree nodes where we can quickly extract the minimum element; | ||
+ | * this can be done with a heap where the heap property is reversed. | ||
+ | * We build the tree in a bottom up manner, starting with the individual characters and ending up with the root of the tree as the only element of the queue. | ||
+ | |||
+ | At first, the queue contains all the leaf nodes as a " | ||
+ | |||
+ | * Then the two nodes with least value are grabbed out of the queue, joined by a new tree node, and put back on the queue. | ||
+ | * After n-1 iterations, there is only one node left in the queue: the root of the tree. | ||
+ | |||
+ | </ | ||
+ | |||
+ | ---- | ||
+ | |||
+ | ==== Stepping through the code using the Huffman algorithm ==== | ||
+ | |||
+ | Here are the contents of **Q** after each step through the for loop: | ||
+ | |||
+ | Initially, all nodes are leaf nodes. | ||
+ | |||
+ | We stick all 8 in Q: | ||
+ | |||
+ | < | ||
+ | (space) | ||
+ | 17 12 4 5 | ||
+ | </ | ||
+ | |||
+ | We join two of the nodes with least value; now there are 7 things in Q: | ||
+ | |||
+ | < | ||
+ | 8 | ||
+ | / \ | ||
+ | | ||
+ | | ||
+ | </ | ||
+ | |||
+ | ---- | ||
+ | |||
+ | Then the next two with least value, Q has 6 elements: | ||
+ | |||
+ | < | ||
+ | | ||
+ | / \ | ||
+ | | ||
+ | | ||
+ | </ | ||
+ | |||
+ | ---- | ||
+ | |||
+ | Now the two nodes with least values are the two trees we just made, and Q has 5 elements: | ||
+ | |||
+ | < | ||
+ | 17 | ||
+ | __/ \__ | ||
+ | | ||
+ | / \ | ||
+ | | ||
+ | / \ / \ | ||
+ | | ||
+ | | ||
+ | < | ||
+ | |||
+ | ---- | ||
+ | |||
+ | The next two with least value, Q has 4 elements: | ||
+ | |||
+ | < | ||
+ | 17 | ||
+ | __/ \__ | ||
+ | | ||
+ | / | ||
+ | | ||
+ | / \ / \ | ||
+ | | ||
+ | | ||
+ | </ | ||
+ | |||
+ | ---- | ||
+ | |||
+ | The next two with least value, three items left: | ||
+ | |||
+ | < | ||
+ | 34 | ||
+ | ______/ | ||
+ | | ||
+ | / (space) | ||
+ | 17 17 | ||
+ | __/ \__ | ||
+ | | ||
+ | / | ||
+ | | ||
+ | / \ / \ e a d | ||
+ | | ||
+ | | ||
+ | </ | ||
+ | |||
+ | ---- | ||
+ | |||
+ | The next two with least value, two big trees left: | ||
+ | |||
+ | < | ||
+ | 34 | ||
+ | ______/ | ||
+ | | ||
+ | / (space) | ||
+ | 17 17 | ||
+ | __/ | ||
+ | | ||
+ | / | ||
+ | | ||
+ | / \ / \ e a | ||
+ | | ||
+ | | ||
+ | </ | ||
+ | |||
+ | Finally, we join the whole thing up: | ||
+ | |||
+ | < | ||
+ | 77 | ||
+ | | ||
+ | / \ | ||
+ | 34 43 | ||
+ | ______/ | ||
+ | | ||
+ | / (space) | ||
+ | 17 | ||
+ | __/ | ||
+ | | ||
+ | / \ | ||
+ | | ||
+ | / \ / \ | ||
+ | | ||
+ | | ||
+ | </ | ||
+ | |||
+ | <WRAP info> | ||
+ | **NOTE:** At each point, we chose the joining that would force less frequent characters to be deeper in the tree. | ||
+ | </ | ||
+ | |||
+ | ---- | ||
+ | |||
+ | So an optimal prefix code is: | ||
+ | |||
+ | < | ||
+ | 01 (space) | ||
+ | 101 a | ||
+ | 0010 b | ||
+ | 0011 c | ||
+ | 11 d | ||
+ | 100 e | ||
+ | 0000 f | ||
+ | 0001 . | ||
+ | </ | ||
+ | |||
+ | And | ||
+ | |||
+ | < | ||
+ | B(T) = 17 * 2 + 12 * 3 + 4 * 4 + 5 * 4 + 19 * 2 + 12 * 3 + 4 * 4 = 196 bits | ||
+ | </ | ||
+ | |||
+ | <WRAP info> | ||
+ | **NOTE: | ||
+ | </ | ||
chess/programming/huffman/example.1635514139.txt.gz · Last modified: 2021/10/29 13:28 by peter