If I have an unsigned char *data
pointer and I want to check whether size_t length
of the data at that pointer is NULL, what would be the fastest way to do that? In other words, what's the fastest way to make sure a region of memory is blank?
I am implementing in iOS, so you can assume iOS frameworks are available, if that helps. On the other hand, simple C approaches (memcmp
and the like) are also OK.
Note, I am not trying to clear the memory, but rather trying to confirm that it is already clear (I am trying to find out whether there is anything at all in some bitmap data, if that helps). For example, I think the following would work, though I have not tried it yet:
- BOOL data:(unsigned char *)data isNullToLength:(size_t)length {
unsigned char tester[length] = {};
memset(tester, 0, length);
if (memcmp(tester, data, length) != 0) {
return NO;
}
return YES;
}
I would rather not create a tester array, though, because the source data may be quite large and I'd rather avoid allocating memory for the test, even temporarily. But I may just being too conservative there.
Thanks to everyone for the great responses below. I decided to create a test app to see how these performed, the answers surprised me, so I thought I'd share them. First I'll show you the version of the algorithms I used (in some cases they differ slightly from those proposed) and then I'll share some results from the field.
First I created some sample data:
size_t length = 1024 * 768;
unsigned char *data = (unsigned char *)calloc(sizeof(unsigned char), (unsigned long)length);
int i;
int count;
long check;
int loop = 5000;
Each test consisted of a loop run loop
times. During the loop some random data was added to and removed from the data
byte stream. Note that half the time there was actually no data added, so half the time the test should not find any non-zero data. Note the testZeros
call is a placeholder for calls to the test routines below. A timer was started before the loop and stopped after the loop.
count = 0;
for (i=0; i<loop; i++) {
int r = random() % length;
if (random() % 2) { data[r] = 1; }
if (! testZeros(data, length)) {
count++;
}
data[r] = 0;
}
Test A: nullToLength. This was more or less my original formulation above, debugged and simplified a bit.
- (BOOL)data:(void *)data isNullToLength:(size_t)length {
void *tester = (void *)calloc(sizeof(void), (unsigned long)length);
int test = memcmp(tester, data, length);
free(tester);
return (! test);
}
Test B: allZero. Proposal by Carrotman.
BOOL allZero (unsigned char *data, size_t length) {
bool allZero = true;
for (int i = 0; i < length; i++){
if (*data++){
allZero = false;
break;
}
}
return allZero;
}
Test C: is_all_zero. Proposed by Lundin.
BOOL is_all_zero (unsigned char *data, size_t length)
{
BOOL result = TRUE;
unsigned char* end = data + length;
unsigned char* i;
for(i=data; i<end; i++) {
if(*i > 0) {
result = FALSE;
break;
}
}
return result;
}
Test D: sumArray. This is the top answer from the nearly duplicate question, proposed by vladr.
BOOL sumArray (unsigned char *data, size_t length) {
int sum = 0;
for (int i = 0; i < length; ++i) {
sum |= data[i];
}
return (sum == 0);
}
Test E: lulz. Proposed by Steve Jessop.
BOOL lulz (unsigned char *data, size_t length) {
if (length == 0) return 1;
if (*data) return 0;
return memcmp(data, data+1, length-1) == 0;
}
Test F: NSData. This is a test using NSData object I discovered in the iOS SDK while working on all of these. It turns out Apple does have an idea of how to compare byte streams that is designed to be hardware independent.
- (BOOL)nsdTestData: (NSData *)nsdData length: (NSUInteger)length {
void *tester = (void *)calloc(sizeof(void), (unsigned long)length);
NSData *nsdTester = [NSData dataWithBytesNoCopy:tester length:(NSUInteger)length freeWhenDone:NO];
int test = [nsdData isEqualToData:nsdTester];
free(tester);
return (test);
}
So how did these approaches compare? Here are two sets of data, each representing 5000 loops through the check. First I tried this on the iPhone Simulator running on a relatively old iMac, then I tried this running on a first generation iPad.
On the iPhone 4.3 Simulator running on an iMac:
// Test A, nullToLength: 0.727 seconds
// Test F, NSData: 0.727
// Test E, lulz: 0.735
// Test C, is_all_zero: 7.340
// Test B, allZero: 8.736
// Test D, sumArray: 13.995
On a first generation iPad:
// Test A, nullToLength: 21.770 seconds
// Test F, NSData: 22.184
// Test E, lulz: 26.036
// Test C, is_all_zero: 54.747
// Test B, allZero: 63.185
// Test D, sumArray: 84.014
These are just two samples, I ran the test many times with only slightly varying results. The order of performance was always the same: A & F very close, E just behind, C, B, and D. I'd say that A, F, and E are virtual ties, on iOS I'd prefer F because it takes advantage of Apple's protection from processor change issues, but A & E are very close. The memcmp approach clearly wins over the simple loop approach, close to ten times faster in the simulator and twice as fast on the device itself. Oddly enough, D, the winning answer from the other thread performed very poorly in this test, probably because it does not break out of the loop when it hits the first difference.
Memory address 0 is called the null pointer. Your program is never allowed to look at or store anything into memory address 0, so the null pointer is a way of saying "a pointer to nothing".
The free function causes the space pointed to by ptr to be deallocated, that is, made available for further allocation. If ptr is a null pointer, no action occurs.
I think you should do it with an explicit loop, but just for lulz:
if (length == 0) return 1;
if (*pdata) return 0;
return memcmp(pdata, pdata+1, length-1) == 0;
Unlike memcpy
, memcmp
does not require that the two data sections don't overlap.
It may well be slower than the loop, though, because the un-alignedness of the input pointers means there probably isn't much the implementation of memcmp
can do to optimize, plus it's comparing memory with memory rather than memory with a constant. Easy enough to profile it and find out.
Not sure if it's the best, but I probably would do something like this:
bool allZero = true;
for (int i = 0; i < size_t; i++){
if (*data++){
//Roll back so data points to the non-zero char
data--;
//Do whatever is needed if it isn't zero.
allZero = false;
break;
}
}
If you've just allocated this memory, you can always call calloc rather than malloc (calloc requires that all the data is zeroed out). (Edit: reading your comment on the first post, you don't really need this. I'll just leave it just in case)
If you're allocating the memory yourself, I'd suggest using the calloc()
function. It's just like malloc()
, except it zeros out the buffer first. It's what's used to allocate memory for Objective-C objects and is the reason that all ivars default to 0.
On the other hand, if this is a statically declared buffer, or a buffer you're not allocating yourself, memset()
is the easy way to do this.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With