I wrote this Hamming Encoding code for class. Why is it so slow?

I wrote this for my OS class:

#include <iostream>
#include <fstream>

//encodes a file using the (8,4) Hamming Code.
//usage : HammingEncode.out < inputFile > outputFile 
int main() {
    unsigned char const codebook[] = {0x00, 0x1E, 0x2D, 0x33, 0x4B, 0x55, 0x66, 0x78, 0x87, 0x99, 0xAA, 0xB4, 0xCC, 0xD2, 0xE1, 0xFF};
    unsigned char in, nextByte;
    unsigned char const leftMask = 0xF0, rightMask = 0x0F;

    in = std::cin.get();
    while (!std::cin.eof()) {
        nextByte = (in & leftMask) >> 4;
        std::cout << codebook[nextByte];
        nextByte = in & rightMask;
        std::cout << codebook[nextByte];
        in = std::cin.get();
    }
}

I then decided to test its speed (just to see) on the Old Testament from the King James Bible. This was our standard test file for a Data Structures class, which was taught in Java, and we could sort it and Huffman-encode it in basically no time, yet this takes quite a long time to encode. What is going on?

asked Mar 24 '14 by Jon Cohen

3 Answers

std::cin is opened in text mode, and as such it is constantly on the lookout for all kinds of things (newline translation, etc.).
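(For what it's worth, text-mode translation only actually changes anything on platforms like Windows; on POSIX systems text and binary modes are identical. If you want to rule it out on Windows, a minimal sketch using the CRT's _setmode follows; I have not timed this, so treat it as an aside:)

#include <cstdio>
#include <io.h>      // _setmode, _fileno (Windows CRT)
#include <fcntl.h>   // _O_BINARY

int main()
{
    // Windows-only: switch both standard streams to binary so the CRT
    // performs no newline translation. On POSIX the two modes are identical.
    _setmode(_fileno(stdin),  _O_BINARY);
    _setmode(_fileno(stdout), _O_BINARY);

    // plain byte-for-byte copy, to measure the raw stream cost
    for (int c = std::fgetc(stdin); c != EOF; c = std::fgetc(stdin))
        std::fputc(c, stdout);
    return 0;
}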

Given the constant character sniffing by the std::cin input stream, I'm not surprised it takes longer, but it does seem a little excessive. The following, bypassing iostream and using FILE streams directly, will probably do what you were expecting:

#include <cstdlib>
#include <cstdio>

int main(int argc, char *argv[])
{
    static unsigned char const codebook[] =
    {
        0x00, 0x1E, 0x2D, 0x33, 0x4B, 0x55, 0x66, 0x78,
        0x87, 0x99, 0xAA, 0xB4, 0xCC, 0xD2, 0xE1, 0xFF
    };

    for (int c = std::fgetc(stdin); c != EOF; c = std::fgetc(stdin))
    {
        std::fputc(codebook[c >> 4], stdout);
        std::fputc(codebook[c & 0x0F], stdout);
    }

    return EXIT_SUCCESS;
}

I tested the exact code above on a 10 MB random file loaded with chars ranging from a to z, and the run time was ridiculously long when using std::cin and std::cout. Using the FILE streams directly, the difference was enormous. All of the code in this answer was tested with Apple LLVM version 5.1 (clang-503.0.38) (based on LLVM 3.4svn) using -O3 optimization.

Using FILE streams

time ./hamming < bigfile.txt > bigfile.ham 
real 0m1.855s
user 0m1.812s
sys  0m0.041s

Using std::cin and std::cout

time ./hamming < bigfile.txt > bigfile.ham
real 0m23.819s
user 0m7.416s
sys  0m16.377s

Using std::cin and std::cout with std::cout.sync_with_stdio(false);

time ./hamming < bigfile.txt > bigfile.ham
real 0m24.867s
user 0m7.705s
sys  0m17.118s

In summary, ouch. Of note is the time spent in system. If I get a chance to update this with the std::istream::get() and put() methods I will, but honestly I don't expect any miracles there. Unless there is some magic (to me, not to others) way of turning off the I/O translation (xlat) on std::cin, the FILE streams may be a reasonable alternative. I haven't yet investigated whether slurping std::cin's rdbuf() is a viable option either, but it may also have promise.
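For reference, a minimal sketch of what that rdbuf() slurping idea might look like (untested here, so no timings for it): drain the whole input through std::cin's streambuf in one shot, encode in memory, and write the result once.

#include <iostream>
#include <sstream>
#include <string>

int main()
{
    static unsigned char const codebook[] =
    {
        0x00, 0x1E, 0x2D, 0x33, 0x4B, 0x55, 0x66, 0x78,
        0x87, 0x99, 0xAA, 0xB4, 0xCC, 0xD2, 0xE1, 0xFF
    };

    // Slurp the entire input through std::cin's streambuf in one shot.
    std::ostringstream slurp;
    slurp << std::cin.rdbuf();
    std::string const input = slurp.str();

    // Encode in memory: each input byte expands to two codebook bytes.
    std::string output;
    output.reserve(input.size() * 2);
    for (unsigned char c : input)
    {
        output.push_back(static_cast<char>(codebook[c >> 4]));
        output.push_back(static_cast<char>(codebook[c & 0x0F]));
    }

    // One bulk write of the encoded result.
    std::cout.write(output.data(), output.size());
    return 0;
}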


Edit: Using std::istreambuf_iterator<char>

Using the streambuf iterator class brings a significant improvement, as it essentially bypasses all the inline xlat junk, but it still isn't as efficient as the FILE streams:

#include <algorithm>   // std::for_each
#include <iterator>    // std::istreambuf_iterator
#include <iostream>
#include <cstdlib>

int main(int argc, char *argv[])
{
    static unsigned char const codebook[] =
    {
        0x00, 0x1E, 0x2D, 0x33, 0x4B, 0x55, 0x66, 0x78,
        0x87, 0x99, 0xAA, 0xB4, 0xCC, 0xD2, 0xE1, 0xFF
    };

    std::istreambuf_iterator<char> cin_it(std::cin), cin_eof;
    std::for_each(cin_it, cin_eof, [](char c)
        {
          std::cout.put(static_cast<char>(codebook[static_cast<unsigned char>(c) >> 4]));
          std::cout.put(static_cast<char>(codebook[static_cast<unsigned char>(c) & 0x0F]));
        });

    return EXIT_SUCCESS;
}

Results:

time ./hamming < bigfile.txt > bigfile.ham

real 0m6.062s
user 0m5.795s
sys  0m0.053s

Note that system time is now comparable to the FILE stream results, but the overhead from the rest of the iostream templates in user time seems a sore point (though still better than the other iostream attempts). You win some, you lose some =P
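An untested variant on the same theme: the output side can get similar treatment with std::ostreambuf_iterator, which skips std::cout's formatted-output path as well. A sketch, under the assumption that it behaves symmetrically to the input iterator:

#include <algorithm>   // std::for_each
#include <iterator>    // std::istreambuf_iterator, std::ostreambuf_iterator
#include <iostream>
#include <cstdlib>

int main()
{
    static unsigned char const codebook[] =
    {
        0x00, 0x1E, 0x2D, 0x33, 0x4B, 0x55, 0x66, 0x78,
        0x87, 0x99, 0xAA, 0xB4, 0xCC, 0xD2, 0xE1, 0xFF
    };

    // Mirror the input-side trick on the output side: write straight
    // into std::cout's streambuf, bypassing formatted output entirely.
    std::istreambuf_iterator<char> cin_it(std::cin), cin_eof;
    std::ostreambuf_iterator<char> cout_it(std::cout);
    std::for_each(cin_it, cin_eof, [&cout_it](char c)
        {
            *cout_it++ = static_cast<char>(codebook[static_cast<unsigned char>(c) >> 4]);
            *cout_it++ = static_cast<char>(codebook[static_cast<unsigned char>(c) & 0x0F]);
        });

    return EXIT_SUCCESS;
}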


Edit: Unbuffered System IO

In an effort to be completely fair, bypassing all runtime buffering and relying solely on the system calls to do this madness, the following is also worth noting:

#include <cstdlib>
#include <cstdio>
#include <unistd.h>

int main(int argc, char *argv[])
{
    static unsigned char const codebook[] =
    {
        0x00, 0x1E, 0x2D, 0x33, 0x4B, 0x55, 0x66, 0x78,
        0x87, 0x99, 0xAA, 0xB4, 0xCC, 0xD2, 0xE1, 0xFF
    };

    unsigned char c;
    while (read(STDIN_FILENO, &c, 1) > 0)
    {
        unsigned char duo[2] =
        {
            codebook[ c >> 4 ],
            codebook[ c & 0x0F ]
        };
        write(STDOUT_FILENO, duo, sizeof(duo));
    }

    return EXIT_SUCCESS;
}

The results, as you would expect from issuing a read() and a write() system call for every input byte, were dreadful:

time ./hamming < bigfile.txt > bigfile.ham

real 0m26.509s
user 0m2.370s
sys  0m24.087s

answered Sep 20 '22 by WhozCraig

I got close to an order of magnitude improvement by making two small changes:

  1. Adding std::ios_base::sync_with_stdio(false) (no noticeable difference, though the impact is often implementation specific)
  2. Buffering the output before writing (this made the most difference)

The updated code looks like this:

#include <iostream>
#include <sstream>

//encodes a file using the (8,4) Hamming Code.
//usage : HammingEncode.out < inputFile > outputFile
int main()
{
    unsigned char const codebook[] = { 0x00, 0x1E, 0x2D, 0x33, 0x4B, 0x55, 0x66, 0x78, 0x87, 0x99, 0xAA, 0xB4, 0xCC, 0xD2, 0xE1, 0xFF };
    unsigned char in, nextByte;
    unsigned char const leftMask = 0xF0, rightMask = 0x0F;
    std::stringstream os;

    std::ios_base::sync_with_stdio(false);
    in = std::cin.get();
    while (std::cin) {
        nextByte = (in & leftMask) >> 4;
        os.put(codebook[nextByte]);
        nextByte = in & rightMask;
        os.put(codebook[nextByte]);
        in = std::cin.get();
    }
    std::cout << os.rdbuf();
}

Update

I tried one more implementation - using the underlying std::streambuf.

On my system the original code was taking around 14 seconds to process the full King James Bible - around 4.3 MB

The code in my original attempt took around 2.1 seconds to process.

This new implementation took 0.71 seconds to process the same document.

#include <iostream>
#include <sstream>

//encodes a file using the (8,4) Hamming Code.
//usage : HammingEncode.out < inputFile > outputFile
int main()
{
    unsigned char const codebook[] = { 0x00, 0x1E, 0x2D, 0x33, 0x4B, 0x55, 0x66, 0x78, 0x87, 0x99, 0xAA, 0xB4, 0xCC, 0xD2, 0xE1, 0xFF };
    unsigned char in, nextByte;
    unsigned char const leftMask = 0xF0, rightMask = 0x0F;
    std::stringstream os;

    std::ios_base::sync_with_stdio(false);

    // read directly from std::cin's underlying streambuf
    std::streambuf * pbuf = std::cin.rdbuf();
    int c = pbuf->sgetc();
    while (c != EOF) {
        in = static_cast<unsigned char>(c);
        nextByte = (in & leftMask) >> 4;
        os << codebook[nextByte];
        nextByte = in & rightMask;
        os << codebook[nextByte];
        c = pbuf->snextc();    // advance, then peek the next character
    }
    std::cout << os.rdbuf();
}
answered Sep 22 '22 by Peter R

C++ iostreams have a bad reputation for being rather inefficient, although various numbers suggest that it is a quality-of-implementation issue rather than an inherent iostream disadvantage.

Anyway, just to be sure that it's not something like a slow HDD, you might compare the execution time to, e.g.:

cat file1 > file2

Of course cat will be a bit quicker, as it doesn't double the size of your data.

Then try comparing the efficiency of your code with the following:

#include <stdio.h>
#include <unistd.h>

int main()
{
    unsigned char buffer[1024*1024]; // 1MB buffer should be enough

    while (!eof(STDIN_FILENO)){ // BUG: there is no eof() for file descriptors; see EDIT below
        size_t len = read(STDIN_FILENO, &buffer[0], 1024*1024);
        write(STDOUT_FILENO, &buffer[0], len);
        write(STDOUT_FILENO, &buffer[0], len);
    }

    return 0;
}

EDIT:

Sorry, my bad; there is no eof() for file descriptors. Try:

#include <stdio.h>
#include <unistd.h>

int main()
{
    static unsigned char buffer[1024*1024]; // 1MB buffer; static keeps it off the stack

    ssize_t len = read(STDIN_FILENO, &buffer[0], 1024*1024); // ssize_t, so errors (-1) end the loop

    while (len > 0) {
        // written twice so the output doubles in size, like the encoder's
        write(STDOUT_FILENO, &buffer[0], len);
        write(STDOUT_FILENO, &buffer[0], len);
        len = read(STDIN_FILENO, &buffer[0], 1024*1024);
    }

    return 0;
}
answered Sep 21 '22 by j_kubik