What is the best way to move data from an array of CameraSpacePoint to an array of PointXYZ?
struct CameraSpacePoint
{
    float X;
    float Y;
    float Z;
};
__declspec(align(16))
struct PointXYZ
{
    float x;
    float y;
    float z;
};
constexpr int BIG_VAL = 1920 * 1080;
CameraSpacePoint camera_space_points[BIG_VAL];
PointXYZ points_xyz[BIG_VAL];
My solution:
CameraSpacePoint* camera_space_points_ptr = &camera_space_points[0];
PointXYZ* points_xyz_ptr = &points_xyz[0];
for (int i = 0; i < BIG_VAL; ++i)
{
    memcpy(points_xyz_ptr++, camera_space_points_ptr++, sizeof(CameraSpacePoint));
}
Is this the most efficient way?
As always, readability and maintainability trump other concerns. Write what you mean, and don't fix what isn't a problem: measure before you optimize.
std::transform(camera_space_points, std::end(camera_space_points), points_xyz,
    [](auto c) {
        return PointXYZ{c.X, c.Y, c.Z};
    });
This is what you should write by default. Judging by the assembly output and a quick benchmark, it is pretty much equivalent to the memcpy version.
On a more hand-wavy note, optimizers are really good at micro-optimizing simple code such as copying a large chunk of memory; manual optimizations are rarely better.
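To act on the "measure before you optimize" advice, here is a minimal sketch of how the two versions could be compared. It assumes a C++17 compiler and uses `alignas(16)` as a portable stand-in for `__declspec(align(16))`; the helpers `time_us` and `same_result` are hypothetical names introduced only for this illustration, and a single wall-clock run like this is a rough sanity check rather than a rigorous benchmark.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

struct CameraSpacePoint { float X, Y, Z; };
// Assumption: alignas(16) replaces the MSVC-specific __declspec(align(16)).
struct alignas(16) PointXYZ { float x, y, z; };

// Element-wise conversion, as in the std::transform version.
void copy_with_transform(const CameraSpacePoint* src, PointXYZ* dst, std::size_t n) {
    assert(n == 0 || (src != nullptr && dst != nullptr));
    std::transform(src, src + n, dst,
                   [](const CameraSpacePoint& c) { return PointXYZ{c.X, c.Y, c.Z}; });
}

// One 12-byte memcpy per element, as in the question.
void copy_with_memcpy(const CameraSpacePoint* src, PointXYZ* dst, std::size_t n) {
    assert(n == 0 || (src != nullptr && dst != nullptr));
    for (std::size_t i = 0; i < n; ++i)
        std::memcpy(dst + i, src + i, sizeof(CameraSpacePoint));
}

// Hypothetical helper: wall-clock duration of one call to f, in microseconds.
template <class F>
long long time_us(F&& f) {
    const auto t0 = std::chrono::steady_clock::now();
    f();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();
}

// Hypothetical helper: true if both versions produce identical coordinates.
bool same_result(std::size_t n) {
    std::vector<CameraSpacePoint> src(n);
    for (std::size_t i = 0; i < n; ++i)
        src[i] = {float(i), float(i) + 0.5f, -float(i)};
    std::vector<PointXYZ> a(n), b(n);
    copy_with_transform(src.data(), a.data(), n);
    copy_with_memcpy(src.data(), b.data(), n);
    for (std::size_t i = 0; i < n; ++i)
        if (a[i].x != b[i].x || a[i].y != b[i].y || a[i].z != b[i].z)
            return false;
    return true;
}
```

A caller would wrap each copy in a lambda, for example `time_us([&] { copy_with_transform(src.data(), dst.data(), n); })`, repeat the measurement several times in an optimized build, and compare the results before deciding whether any manual tuning is worth it.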
An alternative is to make sure you copy chunks of 16 bytes. That way the copy can be compiled to instructions that move 16 bytes at once (without any surrounding shuffles or other unnecessary complications), if they exist (they exist on x64, so on e.g. Xbox One and PC this can help). Each PointXYZ is 16 bytes, so writing 16 bytes to it is fine. The source elements are 12 bytes each, so every 16-byte copy also picks up 4 bytes from the next element; these land in the padding of the target PointXYZ and will be ignored. The last CameraSpacePoint does not necessarily have 4 readable bytes after it, since it might end just before an unmapped/unreadable memory region, so there we need to be careful not to read further, unless that array can be extended a little to guarantee that the memory exists.
For example:
PointXYZ* dst = points_xyz;
const CameraSpacePoint* src = camera_space_points;
for (int i = 0; i + 1 < BIG_VAL; ++i)
    std::memcpy(dst++, src++, 16);  // 12 bytes of data plus 4 that land in padding
// last point is special, since the src may not have 16 bytes left to read
std::memcpy(dst, src, sizeof(CameraSpacePoint));
(on godbolt)
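The other option mentioned, extending the source array a little so the memory is guaranteed to exist, could look roughly like the sketch below. It assumes the caller controls the source allocation and can reserve one extra element; `copy_padded` and `padded_copy_matches` are hypothetical names, and `alignas(16)` again stands in for `__declspec(align(16))`. The `static_assert`s make the size assumptions the trick depends on explicit.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

struct CameraSpacePoint { float X, Y, Z; };
// Assumption: alignas(16) replaces the MSVC-specific __declspec(align(16)).
struct alignas(16) PointXYZ { float x, y, z; };

// The 16-byte copy only works if these sizes hold.
static_assert(sizeof(CameraSpacePoint) == 12, "source elements must be 12 bytes");
static_assert(sizeof(PointXYZ) == 16, "target elements must be 16 bytes");

// Copies n points with 16-byte memcpy calls and no special case for the last one.
// Precondition (assumption): src points to storage for at least n + 1 elements,
// so the final 16-byte read stays inside the allocation.
void copy_padded(const CameraSpacePoint* src, PointXYZ* dst, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        std::memcpy(dst + i, src + i, sizeof(PointXYZ));
}

// Hypothetical helper: allocates a source with one spare element and checks
// that all n points arrive intact.
bool padded_copy_matches(std::size_t n) {
    std::vector<CameraSpacePoint> src(n + 1);  // the last element is padding only
    for (std::size_t i = 0; i < n; ++i)
        src[i] = {float(i), float(i) * 2.0f, float(i) * 3.0f};
    std::vector<PointXYZ> dst(n);
    copy_padded(src.data(), dst.data(), n);
    for (std::size_t i = 0; i < n; ++i)
        if (dst[i].x != src[i].X || dst[i].y != src[i].Y || dst[i].z != src[i].Z)
            return false;
    return true;
}
```

The trade-off is that every caller has to honor the extra-element precondition, and the padding bytes of each PointXYZ end up holding part of the next source point, so nothing should ever read them as meaningful data.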