
File compression with C++

I want to make my own text file compression program. I don't know much about C++ programming, but I have learned the basics, including reading and writing files. I have searched Google a lot about compression and found many different methods for compressing a file, such as LZW and Huffman. The problem is that most of them come without source code, or the source code is very complicated. Do you know any good web pages where I can learn enough to write a compression program myself?

EDIT: I will leave this topic open a little longer, since I plan to study this over the next few days, and if I have any questions, I'll ask them here.

Janman asked Apr 06 '11 18:04


4 Answers

Most of the algorithms are pretty complex, but they all have one thing in common: they take data that repeats, store it only once, and keep enough information to decompress it again (putting the repeated segments back in place).

Here is a simple example you can try to implement.

We have this data file

XXXXFGGGJJ

DDDDDDDDAA

XXXXFGGGJJ

Here we have chars that repeat and two lines that repeat. So you could start by finding a way to reduce the file size.

Here's a simple compression algorithm.

4XF3G2J

8D2A

4XF3G2J

So we have 4 of X, 1 of F, 3 of G, and so on.

Ólafur Waage answered Nov 03 '22 13:11


You can try this page; it contains a clear walkthrough of the basics of compression and the first principles.

Willem van Doesburg answered Nov 03 '22 15:11


Compression is not the easiest task. I took a college class to learn about compression algorithms like LZW and Huffman, and I can tell you that they're not that easy. If C++ is your first language and you're just starting out with this sort of thing, I wouldn't recommend trying to write your own compression algorithm just yet. If you are more experienced, then I would try writing the source without any code being provided to you. Being able to do that shows that you truly understand the compression algorithm.

This is how I was taught - the professor explained the algorithm in very broad terms, and then either we would implement it (in Java, mind you) or we would answer questions about how the algorithm would behave under certain circumstances. If we could do either of those, then we really knew the algorithm - without him showing us any source at all - it's a good skill to develop ;)

adam_0 answered Nov 03 '22 13:11


Huffman encoding trees are not too complicated; I'd start with them. Here's a link: Example: Huffman Encoding Trees

toby answered Nov 03 '22 14:11