Speed of memcpy() greatly influenced by different ways of malloc()

Question

I wrote a program to test the speed of memcpy(). However, the way the memory is allocated greatly influences the speed.

CODE

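#include <stdio.h>
#include <stdlib.h>
#include <string.h>   /* memcpy() */

int main(int argc, char *argv[]) {
    unsigned char *pbuff_1;
    unsigned char *pbuff_2;
    unsigned long iters = 1000 * 1000;

    if (argc < 3) {
        fprintf(stderr, "usage: %s <type: 0|1> <size in KB>\n", argv[0]);
        return 1;
    }

    int type = atoi(argv[1]);
    size_t buff_size = (size_t)atoi(argv[2]) * 1024;

    if (type == 1) {
        /* One allocation; the second buffer starts right after the first. */
        pbuff_1 = malloc(2 * buff_size);
        pbuff_2 = pbuff_1 ? pbuff_1 + buff_size : NULL;
    } else {
        /* Two independent allocations. */
        pbuff_1 = malloc(buff_size);
        pbuff_2 = malloc(buff_size);
    }
    if (pbuff_1 == NULL || pbuff_2 == NULL) {
        fprintf(stderr, "malloc failed\n");
        return 1;
    }

    for (unsigned long i = 0; i < iters; ++i) {
        memcpy(pbuff_2, pbuff_1, buff_size);
    }

    free(pbuff_1);
    if (type != 1) {
        free(pbuff_2);
    }
    return 0;
}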

The OS is Linux 2.6.35 and the compiler is GCC 4.4.5 with the options "-std=c99 -O3".

Results on my computer (memcpy of 4KB, 1 million iterations):

time ./test.test 1 4

real    0m0.128s
user    0m0.120s
sys     0m0.000s

time ./test.test 0 4

real    0m0.422s
user    0m0.420s
sys     0m0.000s

This question is related to a previous question:

Why does the speed of memcpy() drop dramatically every 4KB?

UPDATE

The reason is related to the GCC compiler. I compiled and ran this program with different versions of GCC:

GCC version      4.1.3       4.4.5       4.6.3
Time used (1)    0m0.183s    0m0.128s    0m0.110s
Time used (0)    0m1.788s    0m0.422s    0m0.108s

It seems GCC is getting smarter.

Peter G. · Accepted Answer

The specific addresses returned by malloc are chosen by the implementation and are not always optimal for the calling code. You already know that the speed of moving memory around depends greatly on cache and page effects.

Here, the specific pointers returned by malloc are not known. You could print them out using printf("%p", (void *)ptr). What is known, however, is that using a single malloc for both blocks avoids page and cache waste between the two blocks. That may already be the reason for the speed difference.
