Is there a standard, strided version of memcpy?

Tags: c, memcpy, stride

I have a column vector A which is 10 elements long. I have a matrix B which is 10 by 10. The memory storage for B is column major. I would like to overwrite the first row in B with the column vector A.

Clearly, I can do:

for ( int i=0; i < 10; i++ )
{
    B[0 + 10 * i] = A[i];
}

where I've left the zero in 0 + 10 * i to highlight that B uses column-major storage (zero is the row-index).

After some shenanigans in CUDA-land tonight, it occurred to me that there might be a CPU function that performs a strided memcpy. Is there one? I guess that, at a low level, performance would depend on the existence of a strided load/store instruction, which I don't recall existing in x86 assembly.

asked May 16 '11 06:05

M. Tibbits

1 Answer

Short answer: The code you have written is as fast as it's going to get.

Long answer: The memcpy function is written using some complicated intrinsics or assembly because it operates on memory operands that have arbitrary size and alignment. If you are overwriting a row of a column-major matrix, as in your loop, then each operand is a single, naturally aligned element, and you won't need to resort to the same tricks to get decent speed.
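If you want the loop wrapped up in something reusable, here is a minimal sketch. Both strided_copy and set_row_colmajor are hypothetical names made up for illustration, not standard library functions, and the element type is assumed to be double since the question doesn't say. Strides are given in bytes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the C standard library): copy `count`
   elements of `elem_size` bytes each. After every element the destination
   advances by `dst_stride` bytes and the source by `src_stride` bytes. */
static void strided_copy(void *dst, size_t dst_stride,
                         const void *src, size_t src_stride,
                         size_t elem_size, size_t count)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < count; i++) {
        memcpy(d + i * dst_stride, s + i * src_stride, elem_size);
    }
}

/* Overwrite row `row` of a column-major matrix B (rows x cols) with the
   contiguous vector A of `cols` elements. Element type double is assumed. */
static void set_row_colmajor(double *B, size_t rows, size_t cols,
                             size_t row, const double *A)
{
    /* Consecutive elements of one row are `rows` elements apart in B. */
    strided_copy(B + row, rows * sizeof *B, A, sizeof *A, sizeof *A, cols);
}
```

For the case in the question this would be called as set_row_colmajor(B, 10, 10, 0, A). Once the helper is inlined with a fixed element size, a compiler will typically reduce the inner memcpy to a plain load and store, so I would not expect it to beat or lose to the hand-written loop. If you already link against a BLAS library, cblas_dcopy takes a source and a destination increment and does the same job for doubles.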

answered Sep 22 '22 10:09

Dietrich Epp
