I'm trying to implement the Huffman's encoding algorithm in c++.
my question is : after i got the equivalent binary string for each character , how can i write those zeros and ones as binary on a file not as string 0 or string 1 ?
thanks in advance ...
I hope this code can help you.
char byte
)else
branch can be removed if byte
is set to 0
after each file writing operation (or, more generically, every time it has been totally filled and flushed somewhere else), so only 1s
must be written.void writeBinary(char *huffmanEncoding, int sequenceLength)
{
char byte = 0;
// For each bit of the sequence
for (int i = 0; i < sequenceLength; i++) {
char bit = huffmanEncoding[i];
// Add a single bit to byte
if (bit == 1) {
// MSB of the sequence to msb of the file
byte |= (1 << (7 - (i % 8)));
// equivalent form: byte |= (1 << (-(i + 1) % 8);
}
else {
// MSB of the sequence to msb of the file
byte &= ~(1 << (7 - (i % 8)));
// equivalent form: byte &= ~(1 << (-(i + 1) % 8);
}
if ((i % 8) == 0 && i > 0) {
//writeByteToFile(byte);
}
}
// Fill the last incomplete byte, if any, and write to file
}
Obtaining individually the encoding of each character in a different data structure is a broken solution, because you need to juxtapose the encoding of each character in the resulting binary file: storing them individually makes that as hard as directly storing them contiguously in a vector of bits.
This consideration suggests using a std::vector<bool>
to perform your task, but it is a broken solution because it can't be treated as a c-style array, and you really need that at output time.
This question asks precisely which are the valid alternatives to std::vector<bool>
, so I think answers to that question fits perfectly your question.
BTW, what I would do is to just wrap a std::vector<uint8_t>
under a class which suits yout needs, like the code attached:
#include <iostream>
#include <vector>
#include <cstdint>
#include <algorithm>
class bitstream {
private:
std::vector<std::uint8_t> storage;
unsigned int bits_used:3;
void alloc_space();
public:
bitstream() : bits_used(0) { }
void push_bit(bool bit);
template <typename T>
void push(T t);
std::uint8_t *get_array();
size_t size() const;
// beware: no reference!
bool operator[](size_t pos) const;
};
void bitstream::alloc_space()
{
if (bits_used == 0) {
std::uint8_t push = 0;
storage.push_back(push);
}
}
void bitstream::push_bit(bool bit)
{
alloc_space();
storage.back() |= bit << 7 - bits_used++;
}
template <typename T>
void bitstream::push(T t)
{
std::uint8_t *t_byte = reinterpret_cast<std::uint8_t*>(&t);
for (size_t i = 0; i < sizeof(t); i++) {
uint8_t byte = t_byte[i];
if (bits_used > 0) {
storage.back() |= byte >> bits_used;
std::uint8_t to_push = (byte & ((1 << (8 - bits_used)) - 1)) << bits_used;
storage.push_back(to_push);
} else {
storage.push_back(byte);
}
}
}
std::uint8_t *bitstream::get_array()
{
return &storage.front();
}
size_t bitstream::size() const
{
const unsigned int m = 0;
return std::max(m, (storage.size() - 1) * 8 + bits_used);
}
bool bitstream::operator[](size_t size) const
{
// No range checking
return static_cast<bool>((storage[size / 8] >> 7 - (size % 8)) & 0x1);
}
int main(int argc, char **argv)
{
bitstream bs;
bs.push_bit(true);
std::cout << bs[0] << std::endl;
bs.push_bit(false);
std::cout << bs[0] << "," << bs[1] << std::endl;
bs.push_bit(true);
bs.push_bit(true);
std::uint8_t to_push = 0xF0;
bs.push_byte(to_push);
for (size_t i = 0; i < bs.size(); i++)
std::cout << bs[i] << ",";
std::cout << std::endl;
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With