467px-ArbolCodigoHuffman.svgA Huffman compressor is a compressor based on the Huffman Codification. This algorithm examines the text counting the frequences of each character. Then, it makes a new codification using that frequences. For example, in order to reduce the space, it`s easy to see that if we store 'C' letter 5 times as 0, 'A' letter 2 times as 10 and 'B' letter 3 times as 11 we achieve reducing the size compared with binary codification:

 

 

 

  Huffman codification Binary codification
Text C   C   C    A   B   C   A   B  C  C   C   C   A   B   C   A  B  C
Codification 0    0    0  10  11   0  10  11 0 10  10  10  00 01 10  00 01
Total symbols 13 18

Note that the characters with greater frequency have a codification with less symbols than the characters with lesser frequency. That's the key. As you can guess, the way to make a codification is building a binary tree, each character is a leaf node that is placed in n-level based on its frequency.

Following this idea I have developed a simple application to demonstrate how we can reduce the size of any text, although the implementation is quite more difficult. Furthermore, We do not forget we have a custom codification for each text compressed so we need to store that codification too. You can get more information about Huffman codification here.

Huffman Compressor application is able to read a file text, to make a Huffman codification and to compress the text. Also, for academic purposes, the text can be compressed in separately blocks.

First, download the file here and then, follow the next instructions:

Step 1. Filling the form.

main

Step 2. Making a codification.

create codification

Step 3. Compressing.

compress

 Step 4. Watching results.

results

 Step 5. Decompressing.

deco