 #1

Suppose the letters a, b, c, d, e, f have probabilities 1/2, 1/4, 1/8, 1/16, 1/32, 1/32
respectively. What is the average length of the Huffman code?
(i) 3
(ii) 2.1875
(iii) 2.25
(iv) 1.9375

#2

Convert the given probabilities into integers by multiplying each by the LCM of the denominators, i.e. 32.

``````
[a:16]    [b:8]    [c:4]    [d:2]    [e:1]    [f:1]
``````

Now the Huffman tree will look like this (at each step the two smallest weights are merged):

``````
                  32 (root)
                 /        \
               16         [a:16]
              /  \
             8   [b:8]
            / \
           4  [c:4]
          / \
         2  [d:2]
        / \
    [e:1] [f:1]
``````
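The greedy merge above can be sketched in Python using a heap. This is a minimal illustration, not the only way to build the tree; the function name `huffman_code_lengths` is my own, and the weights are the scaled integers from the previous step:

```python
import heapq

def huffman_code_lengths(freqs):
    """Return the Huffman code length (depth in the tree) of each symbol.

    freqs: dict mapping symbol -> weight (here, probability * 32).
    """
    # Heap entries: (weight, tiebreak id, {symbol: depth so far}).
    # The tiebreak id keeps entries with equal weights comparable
    # without ever comparing the dicts.
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Greedy step: merge the two lightest subtrees; every symbol
        # inside them sinks one level deeper, costing one more bit.
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

lengths = huffman_code_lengths({'a': 16, 'b': 8, 'c': 4, 'd': 2, 'e': 1, 'f': 1})
print(lengths)  # {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 5}
```

The code lengths it produces match the edge counts read off the tree above.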

Now start from the root and count how many edges you travel to reach each letter. This gives the number of bits required to represent that letter:

a -> 1
b -> 2
c -> 3
d -> 4
e -> 5
f -> 5

Now the average length = ∑ (probability of each letter) × (bits required to represent it)

= (1/2)(1) + (1/4)(2) + (1/8)(3) + (1/16)(4) + (1/32)(5) + (1/32)(5)

= 0.5 + 0.5 + 0.375 + 0.25 + 0.15625 + 0.15625

= 1.9375
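The weighted sum can be checked in a few lines of Python, with the probabilities and code lengths taken directly from above:

```python
probs   = {'a': 1/2, 'b': 1/4, 'c': 1/8, 'd': 1/16, 'e': 1/32, 'f': 1/32}
lengths = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 5}

# Expected code length: sum of probability * code length per symbol.
avg = sum(probs[s] * lengths[s] for s in probs)
print(avg)  # 1.9375 (exact, since every term is a dyadic fraction)
```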

So option (iv) is correct.

Points to Note:

• Huffman coding is a variable-length encoding method: each symbol is assigned a non-uniform number of bits based on its frequency of occurrence (frequent symbols get shorter codes).

• Time complexity of this algorithm = O(n log n), where n is the number of symbols (using a min-heap).

• It is a greedy algorithm.

• It is used to compress the size of a file.

• Its average code length is never worse than that of fixed-length encoding; in fact, Huffman coding is optimal among all prefix codes for a given symbol distribution.