Memset an int (16 bit) array to short's max value

Tags: c++, c, memset

I can't seem to find the answer to this anywhere: how do I memset an array to the maximum value of the array's type? I would have thought memset(ZBUFFER,0xFFFF,size) would work, where ZBUFFER is a 16-bit integer array. Instead I get -1s throughout.

Also, the idea is to have this work as fast as possible (it's a zbuffer that needs to initialize every frame) so if there is a better way (and still as fast or faster), let me know.

edit: as clarification, I do need a signed int array.

DanielST asked Apr 11 '13 11:04


4 Answers

In C++, you would use std::fill and std::numeric_limits.

#include <algorithm>
#include <cstddef>
#include <iterator>
#include <limits>

// Fills [first, last) with the largest value of the range's element type.
template <typename IT>
void FillWithMax( IT first, IT last )
{
    typedef typename std::iterator_traits<IT>::value_type T;
    T const maxval = std::numeric_limits<T>::max();
    std::fill( first, last, maxval );
}

int main()
{
    std::size_t const size = 32;
    short ZBUFFER[size];
    FillWithMax( ZBUFFER, ZBUFFER + size );
}

This will work with any element type for which std::numeric_limits is specialized.

In C, you'd better keep off memset, because it sets the value of individual bytes. To initialize an array of any type other than char (or unsigned char), you have to resort to a manual for loop.
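A minimal sketch of the manual loop described above, assuming the buffer holds short elements. The function name fill_short_max is hypothetical, chosen only for illustration:

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: assigns SHRT_MAX to each of the n elements.
   It works on whole elements, so it does not depend on byte layout. */
void fill_short_max(short *buf, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        buf[i] = SHRT_MAX;
    }
}
```

Calling fill_short_max(ZBUFFER, size), with size given as an element count rather than a byte count, sets every element to SHRT_MAX. Optimising compilers commonly vectorise a loop of this shape, though that depends on the implementation.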

Didier Trosset answered Oct 31 '22 03:10


-1 and 0xFFFF are the same thing in a 16-bit integer using a two's complement representation. You are only getting -1 either because you have declared your array as short instead of unsigned short, or because you are converting the values to signed when you output them.

BTW your assumption that you can set anything other than bytes using memset is wrong. memset(ZBUFFER, 0xFF, size) would have done the same thing.
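To illustrate the point about bytes, here is a minimal sketch, assuming 8-bit bytes and using int16_t so that each element is exactly 16 bits wide. The helper name memset_fill_first is hypothetical. memset converts its value argument to unsigned char, so 0xFFFF and 0xFF write the same bytes, and whatever byte is written is repeated in both halves of each element:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: memset a small int16_t array with `value`
   and return the first element as it reads back afterwards. */
int16_t memset_fill_first(int value)
{
    int16_t buf[4];
    memset(buf, value, sizeof buf);   /* value is converted to unsigned char */
    return buf[0];
}
```

Under those assumptions, memset_fill_first(0xFFFF) and memset_fill_first(0xFF) both return -1, while memset_fill_first(0x7F) returns 0x7F7F (32639) rather than INT16_MAX (0x7FFF). No single repeated byte produces 0x7FFF, which is why memset cannot set a signed 16-bit array to its maximum value.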

john answered Oct 31 '22 01:10


In C++ you can fill an array with some value with the std::fill algorithm.

std::fill(ZBUFFER, ZBUFFER+size, std::numeric_limits<short>::max());

This is neither faster nor slower than your current approach. It does have the benefit of working, though.

R. Martinho Fernandes answered Oct 31 '22 03:10


Don't attribute speed to a language; speed is a property of implementations. There are C compilers that produce fast, optimal machine code and C compilers that produce slow, suboptimal machine code. Likewise for C++. A "fast, optimal" implementation might be able to optimise code that seems slow. Hence, it doesn't make sense to call one solution faster than another without measuring. I'll talk about correctness first, and then I'll talk about performance, however insignificant it is. It'd be a better idea to profile your code, to be sure that this is in fact the bottleneck, but let's continue.

Let us consider the most sensible option first: a loop that assigns int values. It is clear just by reading the code that the loop will correctly assign SHRT_MAX to each int item. You can see a test case for this loop below, which attempts to use the largest array that malloc can allocate at the time.

#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void) {
    size_t size = SIZE_MAX;
    volatile int *array = malloc(size);

    /* Allocate largest array */
    while (array == NULL && size > 0) {
        size >>= 1;
        array = malloc(size);
    }

    printf("Copying into %zu bytes\n", size);

    for (size_t n = 0; n < size / sizeof *array; n++) {
        array[n] = SHRT_MAX;
    }

    puts("Done!");
    return 0;
}

I ran this on my system, compiled with various optimisations enabled (-O3 -march=core2 -funroll-loops). Here's the output:

Copying into 1073741823 bytes
Done!

Process returned 0 (0x0)   execution time : 1.094 s
Press any key to continue.

Note the "execution time"... That's pretty fast! If anything, the bottleneck here is the cache locality of such a large array, which is why a good programmer will try to design systems that don't use so much memory... Well, then let us consider the memset option. Here's a quote from the memset manual:

The memset() function copies c (converted to an unsigned char) into each of the first n bytes of the object pointed to by s.

Hence, it'll convert 0xFFFF to an unsigned char (truncating it to 0xFF where bytes are 8 bits wide), then assign the converted value to each of the first size bytes. This results in incorrect behaviour: on a two's complement system every 16-bit element reads back as -1, not SHRT_MAX. I don't like relying upon a value happening to be represented as a sequence of identical bytes (as USHRT_MAX is, with every byte storing (unsigned char) 0xFF), because that's relying upon coincidence. In other words, the main problem here is that memset isn't suitable for your task. Don't use it. Having said that, here's a test, derived from the test above, which will be used to test the speed of memset:

#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>  /* memset */
#include <time.h>

int main(void) {
    size_t size = SIZE_MAX;
    volatile int *array = malloc(size);

    /* Allocate largest array */
    while (array == NULL && size > 0) {
        size >>= 1;
        array = malloc(size);
    }

    printf("Copying into %zu bytes\n", size);

    memset((void *)array, 0xFFFF, size);  /* cast drops volatile; 0xFFFF is converted to unsigned char */

    puts("Done!");
    return 0;
}

A trivial byte-copying memset loop will iterate sizeof (int) times more than the loop in my first example. Considering that my implementation uses a fairly optimal memset, here's the output:

Copying into 1073741823 bytes
Done!

Process returned 0 (0x0)   execution time : 1.060 s
Press any key to continue.

Results from these tests are likely to vary, possibly significantly. I only ran them once each to get a rough idea. Hopefully you've come to the same conclusion that I have: common compilers are pretty good at optimising simple loops, and it's not worth speculating about micro-optimisations here.
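Since a single run only gives a rough idea, one way to get steadier numbers is to repeat the fill several times and keep the best time. This is a minimal sketch using the standard clock() function, which measures processor time rather than wall-clock time. The name time_fill_loop and its runs parameter are hypothetical, and the numbers it produces will depend on your compiler, flags and hardware:

```c
#include <limits.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: fills buf with SHRT_MAX `runs` times and returns
   the shortest processor time observed for one fill, in seconds. */
double time_fill_loop(volatile short *buf, size_t n, int runs)
{
    double best = -1.0;

    for (int r = 0; r < runs; r++) {
        clock_t start = clock();
        for (size_t i = 0; i < n; i++) {
            buf[i] = SHRT_MAX;   /* volatile keeps the stores from being elided */
        }
        double elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;
        if (best < 0.0 || elapsed < best) {
            best = elapsed;
        }
    }
    return best;   /* -1.0 if runs <= 0 */
}
```

Taking the minimum over several runs tends to filter out interference from other processes, but it is still no substitute for profiling the real program.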

In summary:

  1. Don't use memset to fill ints with values (with an exception for the value 0), because it's not suitable.
  2. Don't speculate about optimisations prior to running tests. Don't run tests until you have a working solution. By working solution I mean "A program that solves an actual problem". Once you have that, use your profiler to identify more significant opportunities to optimise!
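
The first point can be sketched as follows: memset is fine for clearing an integer array to zero, because all-bits-zero is a valid representation of 0 for every integer type, while any other value goes through an element-wise loop. The name reset_buffer is hypothetical:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: sets all n elements of buf to value. */
void reset_buffer(short *buf, size_t n, short value)
{
    if (value == 0) {
        /* All-bits-zero represents 0 for any integer type. */
        memset(buf, 0, n * sizeof *buf);
    } else {
        /* Any other value is assigned element by element. */
        for (size_t i = 0; i < n; i++) {
            buf[i] = value;
        }
    }
}
```

Note that the memset branch takes a byte count (n * sizeof *buf), while the loop counts elements.
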
autistic answered Oct 31 '22 02:10