Shannon-Fano coding is another compression technique used to reduce the number of bits needed to store a message, however unlike Huffman coding, it does not use a bottom-up approach and neither does it generate minimal codes.

The following section summarizes the simple encoding process.

**Example sentence**: `This_is_an_example`

## Step 1: Preparing

Count the frequencies of existing letters

```
A E H I L M N P S T X _
2 2 1 2 1 1 1 1 2 1 1 3
```

Sort symbols based on their frequencies

```
H L M N P T X A E I S _
1 1 1 1 1 1 1 2 2 2 2 3
```

## Step 2: Combining (Top-Down)

Recursively divide the set of symbols into two parts, each with approximately same number of frequencies. The process of splitting two symbols has been omitted in the following section.

### Step 2.1

Total AEISHLMNPTX: 18 => ~9 each.

```
AEISHLMNPTX_
/ \
AEISH LMNPTX_
```

### Step 2.2

Total AEISH: 9 => ~4.5 each.

Total LMNPTX_: 9 => ~4.5 each.

```
AEISHLMNPTX_
/ \
AEISH LMNPTX_
/ \ / \
AE ISH LMNP TX_
```

### Step 2.3

Total ISH: 5 => 2.5 each.

Total LMNP: 4 => 2 each.

Total TX_: 5 => 2.5 each.

```
AEISHLMNPTX_
/ \
AEISH LMNPTX_
/ \ / \
AE ISH LMNP TX_
/ \ / \ / \ / \
A E I SH LM NP TX _
```

### Step 2.4

And we finally get the desired tree…

```
AEISHLMNPTX_
/ \
AEISH LMNPTX_
/ \ / \
AE ISH LMNP TX_
/ \ / \ / \ / \
A E I SH LM NP TX _
/\ /\ /\ /\
S H L M N P T X
```

## Step 3: Assigning Codes

Now arbitrary declare that the left path of every parent node is indicating the number 0 and the right path respectivly 1.

## Analysis

The above encoding process leads to the following result:

```
Symbol Code Frequency Code Total
Length Length
-----------------------------------------
A 000 2 3 6
E 001 2 3 6
I 010 2 3 6
S 0110 2 4 8
H 0111 1 4 4
L 1000 1 4 4
M 1001 1 4 4
N 1010 1 4 4
P 1011 1 4 4
T 1100 1 4 4
X 1101 1 4 4
_ 111 3 3 9
```

Total symbols: 18 (sum of frequencies)

Bits needed with Shannon-Fano coding: 63 Bit (sum of total length)

Bits needed without Shannon-Fano coding: 64 Bit (see below)

Calculating how many Bits we need when not using Shannon-Fano coding:

```
12 (2^4) unique symbols => 4 Bit/Symbol needed
Total Bits needed: 4 Bit * 18 = 64 Bit
```

## Leave a Reply