
clearing a small integer array: memset vs. for loop

People also ask

Is memset faster than a loop?

You can assume that memset will be at least as fast as a naive implementation such as a loop. That said, it depends on what the compiler does for you: an optimizing build may replace the loop with something equivalent to memset, whereas under a debug build you will notice that the loop is not replaced. Looking at the disassembly is always a good way to know exactly what is going on.
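As a minimal sketch of the two approaches being compared, the helpers below (hypothetical names, not taken from the original question) clear the same int array once with a loop and once with memset. Both leave every element zero; which one is faster depends on the compiler and optimization level, so check the disassembly or measure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: clear n ints with an explicit loop. */
void clear_with_loop(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        a[i] = 0;
}

/* Hypothetical helper: clear the same n ints with memset. */
void clear_with_memset(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    if (n != 0)
        memset(a, 0, n * sizeof *a);  /* length is in bytes, not elements */
}
```

Calling clear_with_loop(a, 4) and clear_with_memset(b, 4) on two four-element int arrays leaves them byte-for-byte identical.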

How do you clear an array in memset?

memset takes three arguments: the first is a void pointer to the memory region, the second is the byte value to fill with (passed as an int and converted to unsigned char), and the last is the number of bytes to fill starting at that address.

Why do we need memset?

memset() is used to fill a block of memory with a particular value. The syntax of the memset() function is as follows:

// ptr ==> Starting address of memory to be filled
// x   ==> Value to be filled
// n   ==> Number of bytes to be filled, starting from ptr
void *memset(void *ptr, int x, size_t n);

Why does memset only work with 0 and -1?

memset fills individual bytes of memory, whereas you are trying to set integer values (typically 4 or more bytes each). Your approach only works for 0 and -1, because every byte of those values is identical: 00000000 for 0 and 11111111 for -1 (in two's complement). With any other value, such as 1, memset sets every byte to 1, so each int ends up with 1 in every byte rather than holding the value 1.
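To illustrate the byte-wise behaviour, here is a small sketch built around a hypothetical helper, fill_ints_bytewise, that passes a byte value straight to memset. It assumes a typical platform with a 4-byte, two's-complement int.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: set every BYTE of n ints to byte_value.
 * This is what memset does; it does not assign byte_value to each int. */
void fill_ints_bytewise(int *a, size_t n, int byte_value)
{
    assert(a != NULL);
    memset(a, byte_value, n * sizeof *a);
}

/* Assuming a 4-byte int:
 *   fill_ints_bytewise(a, n, 0)   -> each element is 0
 *   fill_ints_bytewise(a, n, -1)  -> each element is -1 (all bits set)
 *   fill_ints_bytewise(a, n, 1)   -> each element is 0x01010101 (16843009), not 1
 */
```

To fill an int array with a value other than 0 or -1, an explicit loop that assigns each element is the straightforward choice.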


In all likelihood, memset() will be inlined by your compiler (most compilers treat it as an 'intrinsic', which basically means it's inlined, except maybe at the lowest optimization levels or unless explicitly disabled).

For example, here are some release notes from GCC 4.3:

Code generation of block move (memcpy) and block set (memset) was rewritten. GCC can now pick the best algorithm (loop, unrolled loop, instruction with rep prefix or a library call) based on the size of the block being copied and the CPU being optimized for. A new option -minline-stringops-dynamically has been added. With this option string operations of unknown size are expanded such that small blocks are copied by in-line code, while for large blocks a library call is used. This results in faster code than -minline-all-stringops when the library implementation is capable of using cache hierarchy hints. The heuristic choosing the particular algorithm can be overwritten via -mstringop-strategy. Newly also memset of values different from 0 is inlined.

It might be possible for the compiler to do something similar with the alternative examples you gave, but I'd bet it's less likely to.

And it's grep-able and more immediately obvious at a glance what the intent is to boot (not that the loop is particularly difficult to grok either).


As Michael already noted, gcc and I guess most other compilers optimize this already very well. For example gcc turns this

char arr[5];
memset(arr, 0, sizeof arr);

into

movl  $0x0, <arr+0x0>
movb  $0x0, <arr+0x4>

It doesn't get any better than that...


There's no way of answering the question without measuring. It will depend entirely on the compiler, cpu and runtime library implementations.

memset() can be a bit of a "code smell", because it is prone to buffer overflows and parameter reversals, and it can only fill memory byte-wise. However, it's a safe bet that it will be 'fastest' in all but extreme cases.

I tend to use a macro to wrap this to avoid some of the issues:

#define CLEAR(s) memset(&(s), 0, sizeof(s))

This sidesteps the size calculations and removes the problem of swapping the length and value parameters.
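A minimal usage sketch of the macro follows. The struct and function names are hypothetical, chosen only for illustration. One caveat worth noting: the argument must be the object itself, so a pointer has to be dereferenced first; otherwise sizeof yields the size of the pointer and only the pointer variable is zeroed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CLEAR(s) memset(&(s), 0, sizeof(s))

/* Hypothetical struct used only for this example. */
struct counters {
    int hits;
    int misses;
};

void reset_counters(struct counters *c)
{
    assert(c != NULL);
    CLEAR(*c);  /* sizeof(*c) is the struct size; CLEAR(c) would zero only the pointer */
}

/* Taking a pointer to the whole array keeps its size visible to sizeof. */
void reset_table(int (*table)[8])
{
    assert(table != NULL);
    CLEAR(*table);  /* clears all 8 ints */
}
```

For an array declared in the current scope, CLEAR(arr) works directly, since sizeof(arr) is then the size of the whole array.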

In short, use memset() "under the hood". Write what you intend, and let the compiler worry about optimizations. Most are incredibly good at it.


Regarding this code in isolation, everything has already been said. But if you consider it in the context of its program, about which I know nothing, something else can be done. For example, if this code runs periodically to clear an array, you could run a thread that continually allocates a new zero-filled array and assigns it to a global variable; when your code needs the array cleared, it simply points to the new one.

This is a third option. Of course, it only makes sense if you plan to run your code on a processor with at least two cores. The code must also run more than once for you to see any benefit. For a one-time run, you could declare an array filled with zeros and then point to it when needed.

Hope this may help someone