SHA256 performance optimization in C

Question

I need to hash a big database of values quite often. Thus, a fast implementation of a SHA-2 hasher is needed. I'm currently using the SHA256.

The sha256_transform algorithm I'm using right now is this: http://bradconte.com/sha256_c (code below)

I have profiled my code and this snippet is taking exactly 96% of computing time per hash, making this function critical to my goals.

It operates on a 64-byte long binary string named data[] and outputs the result in ctx->state.

I ask for a faster version of this function. Keep in mind that even slight modifications can impact speed negatively.

#define uchar unsigned char
#define uint unsigned int

#define ROTLEFT(a,b) (((a) << (b)) | ((a) >> (32-(b))))
#define ROTRIGHT(a,b) (((a) >> (b)) | ((a) << (32-(b))))

#define CH(x,y,z) (((x) & (y)) ^ (~(x) & (z)))
#define MAJ(x,y,z) (((x) & (y)) ^ ((x) & (z)) ^ ((y) & (z)))
#define EP0(x) (ROTRIGHT(x,2) ^ ROTRIGHT(x,13) ^ ROTRIGHT(x,22))
#define EP1(x) (ROTRIGHT(x,6) ^ ROTRIGHT(x,11) ^ ROTRIGHT(x,25))
#define SIG0(x) (ROTRIGHT(x,7) ^ ROTRIGHT(x,18) ^ ((x) >> 3))
#define SIG1(x) (ROTRIGHT(x,17) ^ ROTRIGHT(x,19) ^ ((x) >> 10))

void sha256_transform(SHA256_CTX *ctx, uchar data[]) {
    uint a,b,c,d,e,f,g,h,i,j,t1,t2,m[64];

    a = ctx->state[0];
    b = ctx->state[1];
    c = ctx->state[2];
    d = ctx->state[3];
    e = ctx->state[4];
    f = ctx->state[5];
    g = ctx->state[6];
    h = ctx->state[7];

    for (i=0,j=0; i < 16; i++, j += 4)
        m[i] = (data[j] << 24) | (data[j+1] << 16) | (data[j+2] << 8) | (data[j+3]);

    for ( ; i < 64; i++)
        m[i] = SIG1(m[i-2]) + m[i-7] + SIG0(m[i-15]) + m[i-16];

    for (i = 0; i < 64; ++i) {
        t1 = h + EP1(e) + CH(e,f,g) + k[i] + m[i];
        t2 = EP0(a) + MAJ(a,b,c);
        h = g;
        g = f;
        f = e;
        e = d + t1;
        d = c;
        c = b;
        b = a;
        a = t1 + t2;
    }

    ctx->state[0] += a;
    ctx->state[1] += b;
    ctx->state[2] += c;
    ctx->state[3] += d;
    ctx->state[4] += e;
    ctx->state[5] += f;
    ctx->state[6] += g;
    ctx->state[7] += h;
}

TheCodeArtist · Accepted Answer

You may want to checkout/profile this implementation of SHA256.

Being used in cgminer (a popular bitcoin mining software), it is written specifically keeping performance in mind. It includes 4-way SIMD implementations using SSE2. It follows the same approach as the bradconte sha256_transform algorithm mentioned in the question. The code is too long to reproduce here.

Also the license is fairly permissive, allowing re-use/distribution as long as the original authors are accredited.

SHA256 performance optimization in C

Tags:

c

optimization

sha256

user2464424

1 Answers

TheCodeArtist

Recent Activity

Donate For Us

SHA256 performance optimization in C

Tags:

c

optimization

sha256

user2464424

1 Answers

TheCodeArtist

Related questions

Recent Activity

Donate For Us