Huffman Code ADA


#1

Suppose the letters a, b, c, d, e, f have probabilities 1/2, 1/4, 1/8, 1/16, 1/32, 1/32
respectively. What is the average length of the Huffman code?
(i) 3
(ii) 2.1875
(iii) 2.25
(iv) 1.9375


#2

Convert the given probabilities into integers by multiplying each by the LCM of the denominators, i.e. 32.

[a:16]    [b:8]    [c:4]    [d:2]    [e:1]     [f:1]

Now the Huffman tree will look like this:

                                                                   32 (root)
                                                                 /    \
                                                               16     [a:16]
                                                              /  \
                                                            8     [b:8]
                                                          /  \
                                                         4    [c:4]
                                                       /  \
                                                      2    [d:2]
                                                     / \
                                                 [e:1]  [f:1]
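The tree above is built bottom-up: at every step the two lowest-weight nodes are merged into a new node whose weight is their sum. A minimal Python sketch of that merge order, tracking only the weights (symbol names and weights taken from the table above):

```python
import heapq

# Scaled frequencies from the table: [a:16] [b:8] [c:4] [d:2] [e:1] [f:1]
freqs = {'a': 16, 'b': 8, 'c': 4, 'd': 2, 'e': 1, 'f': 1}

# Min-heap of weights; each step merges the two smallest.
heap = list(freqs.values())
heapq.heapify(heap)

merges = []
while len(heap) > 1:
    x = heapq.heappop(heap)
    y = heapq.heappop(heap)
    merges.append(x + y)          # weight of the new internal node
    heapq.heappush(heap, x + y)

print(merges)  # [2, 4, 8, 16, 32] -- internal node weights, root last
```

These are exactly the internal-node weights 2, 4, 8, 16, 32 appearing in the tree drawn above.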

Now start from the root and count how many edges you traverse to reach each letter. This gives the number of bits required to represent that letter.

a -> 1
b -> 2
c -> 3
d -> 4
e -> 5
f -> 5
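These depths can also be computed without drawing the tree: whenever two subtrees are merged, every leaf inside them moves one level deeper. A small sketch using Python's `heapq` (the counter is only a tiebreaker so the heap never compares dicts; this is one possible implementation, not the only one):

```python
import heapq
from itertools import count

freqs = {'a': 16, 'b': 8, 'c': 4, 'd': 2, 'e': 1, 'f': 1}

tiebreak = count()
# Each heap entry: (weight, tiebreaker, {symbol: current depth})
heap = [(w, next(tiebreak), {sym: 0}) for sym, w in freqs.items()]
heapq.heapify(heap)

while len(heap) > 1:
    w1, _, d1 = heapq.heappop(heap)
    w2, _, d2 = heapq.heappop(heap)
    # Merging two subtrees pushes every leaf in them one level deeper.
    merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
    heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))

_, _, lengths = heap[0]
print(lengths)  # {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 5}
```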

Now the average length = ∑ (probability of each letter) × (bits required for that letter)

= (1/2)(1) + (1/4)(2) + (1/8)(3) + (1/16)(4) + (1/32)(5) + (1/32)(5)

= 0.5 + 0.5 + 0.375 + 0.25 + 0.15625 + 0.15625

= 1.9375

So option (iv) is correct.
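The arithmetic can be double-checked numerically (a quick sketch, with the code lengths read off the tree above):

```python
# Probabilities from the question and code lengths from the tree.
probs = {'a': 1/2, 'b': 1/4, 'c': 1/8, 'd': 1/16, 'e': 1/32, 'f': 1/32}
lengths = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 5}

avg = sum(probs[s] * lengths[s] for s in probs)
print(avg)  # 1.9375
```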

Points to Note:

  • Huffman coding is a variable-length encoding scheme: each symbol is assigned a number of bits based on its frequency of occurrence, with more frequent symbols getting shorter codes.

  • Time complexity of this algorithm = O(n log n), where n is the number of symbols (using a min-heap).

  • It is a greedy algorithm.

  • It is used for lossless data compression, e.g. to reduce the size of a file.

  • Its average code length is never worse than that of fixed-length encoding; in fact, Huffman coding is optimal among all prefix codes for a given set of symbol frequencies.
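The points above can be tied together with a self-contained sketch of the full greedy algorithm. The function name `huffman_codes` and the sample message are illustrative choices of mine; tie-breaking order (and hence the exact bit patterns) may differ from the tree drawn earlier, but the code lengths come out the same:

```python
import heapq
from itertools import count

def huffman_codes(freqs):
    """Build prefix codes by greedily merging the two lightest subtrees.
    Assumes at least two symbols. Runs in O(n log n) with a min-heap."""
    tie = count()  # tiebreaker so the heap never compares dicts
    heap = [(w, next(tie), {s: ''}) for s, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Prepend a branch bit: 0 for the left subtree, 1 for the right.
        merged = {s: '0' + c for s, c in left.items()}
        merged.update({s: '1' + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next(tie), merged))
    return heap[0][2]

codes = huffman_codes({'a': 16, 'b': 8, 'c': 4, 'd': 2, 'e': 1, 'f': 1})

# A 32-symbol message matching the given distribution:
# fixed-length needs 3 bits/symbol = 96 bits; Huffman needs 62 bits,
# i.e. 62/32 = 1.9375 bits/symbol on average -- the answer above.
message = 'a' * 16 + 'b' * 8 + 'c' * 4 + 'd' * 2 + 'e' + 'f'
huffman_bits = sum(len(codes[s]) for s in message)
print(huffman_bits, 'vs', 3 * len(message))
```

The prefix property (no code is a prefix of another) is what lets a decoder read the bitstream unambiguously without separators.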