I use vImageConvert_RGB888toPlanar8 and vImageConvert_Planar8toRGB888 from Accelerate.framework to convert RGB24 to BGR24, but when the data to convert is very big, such as 3M or 4M, the conversion takes about 10 ms. Does anyone know a faster approach? My code looks like this:
- (void)transformRGBToBGR:(const UInt8 *)pict {
    rgb.data = (void *)pict;
    // Split the interleaved RGB888 buffer into three planar channels.
    vImage_Error error = vImageConvert_RGB888toPlanar8(&rgb, &red, &green, &blue, kvImageNoFlags);
    if (error != kvImageNoError) {
        NSLog(@"vImageConvert_RGB888toPlanar8 error");
    }
    // Re-interleave the planes with red and blue exchanged to get BGR888.
    error = vImageConvert_Planar8toRGB888(&blue, &green, &red, &bgr, kvImageNoFlags);
    if (error != kvImageNoError) {
        NSLog(@"vImageConvert_Planar8toRGB888 error");
    }
    free((void *)pict);
}
The vImageConvert_RGB888toPlanar8 call scatters the data into three planes, and vImageConvert_Planar8toRGB888 then gathers it again, so the whole image is read and written twice. That is very inefficient. If a memory overhead of 33% is affordable, try using the RGBA format and permuting the B/R bytes in place.
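As a rough illustration of the in-place B/R permutation on 4-byte pixels, here is a minimal portable C sketch. The function name `swap_rb_rgba_inplace` and its parameters are made up for this example, and it assumes a tightly packed buffer with no row padding. On Apple platforms, vImagePermuteChannels_ARGB8888 with a permute map of {2, 1, 0, 3} should do the same job and is likely faster.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: swap the first and third byte (R and B) of every
   4-byte pixel in place. It works byte by byte, so it does not depend on
   endianness or pointer alignment. */
void swap_rb_rgba_inplace(uint8_t *pixels, size_t num_pixels)
{
    assert(pixels != NULL || num_pixels == 0);
    for (size_t i = 0; i < num_pixels; i++) {
        uint8_t *p = pixels + 4 * i;
        uint8_t first = p[0];
        p[0] = p[2];   /* B moves to the first byte */
        p[2] = first;  /* R moves to the third byte */
        /* p[1] (G) and p[3] (A) stay where they are */
    }
}
```

Because every pixel is exactly one 32-bit word, a compiler has a good chance of vectorizing this loop on its own.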
If you want to save that 33%, I would suggest the following: iterate over all the pixels, but read and write whole 32-bit words. Since lcm(3,4) is 12, each iteration handles 12 bytes, that is, 3 dwords or 4 pixels.
uint8_t* src_image;
uint8_t* dst_image;
uint32_t* src = (uint32_t*)src_image;
uint32_t* dst = (uint32_t*)dst_image;
uint32_t v1, v2, v3;
uint32_t nv1, nv2, nv3;
// each iteration handles 4 pixels (12 bytes)
for(int i = 0 ; i < num_pixels / 4 ; i++)
{
    // read 12 bytes
    v1 = *src++;
    v2 = *src++;
    v3 = *src++;
    // shuffle the bytes in the pixels (the masks assume a little-endian CPU)
    // [R1 G1 B1 R2 | G2 B2 R3 G3 | B3 R4 G4 B4]
    nv1 = // [B1 G1 R1 B2]
        ((v1 >> 16) & 0xFF) | (v1 & 0x0000FF00) | ((v1 << 16) & 0x00FF0000) | ((v2 << 16) & 0xFF000000);
    nv2 = // [G2 R2 B3 G3]
        ...
    nv3 = // [R3 B4 G4 R4]
        ...
    // write 12 bytes
    *dst++ = nv1;
    *dst++ = nv2;
    *dst++ = nv3;
}
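For reference, here is one way the 12-byte shuffle could be written out in full. This is only a sketch under stated assumptions: the function name `rgb24_to_bgr24` is invented for the example, the shift/mask constants assume a little-endian CPU, and `memcpy` is used for the word loads and stores to avoid unaligned or aliasing pointer casts. Pixels left over when the count is not a multiple of 4 are handled byte by byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: RGB24 -> BGR24, four pixels (12 bytes) per iteration.
   The masks below assume a little-endian CPU. */
void rgb24_to_bgr24(const uint8_t *src, uint8_t *dst, size_t num_pixels)
{
    assert((src != NULL && dst != NULL) || num_pixels == 0);
    size_t blocks = num_pixels / 4;
    for (size_t i = 0; i < blocks; i++) {
        uint32_t v1, v2, v3;
        /* read 12 bytes: [R1 G1 B1 R2 | G2 B2 R3 G3 | B3 R4 G4 B4] */
        memcpy(&v1, src, 4);
        memcpy(&v2, src + 4, 4);
        memcpy(&v3, src + 8, 4);
        /* [B1 G1 R1 B2] */
        uint32_t nv1 = ((v1 >> 16) & 0x000000FFu) | (v1 & 0x0000FF00u)
                     | ((v1 << 16) & 0x00FF0000u) | ((v2 << 16) & 0xFF000000u);
        /* [G2 R2 B3 G3] */
        uint32_t nv2 = (v2 & 0x000000FFu) | ((v1 >> 16) & 0x0000FF00u)
                     | ((v3 << 16) & 0x00FF0000u) | (v2 & 0xFF000000u);
        /* [R3 B4 G4 R4] */
        uint32_t nv3 = ((v2 >> 16) & 0x000000FFu) | ((v3 >> 16) & 0x0000FF00u)
                     | (v3 & 0x00FF0000u) | ((v3 << 16) & 0xFF000000u);
        /* write 12 bytes */
        memcpy(dst, &nv1, 4);
        memcpy(dst + 4, &nv2, 4);
        memcpy(dst + 8, &nv3, 4);
        src += 12;
        dst += 12;
    }
    /* remaining 0..3 pixels, one at a time */
    for (size_t i = 0; i < num_pixels % 4; i++) {
        uint8_t r = src[0], g = src[1], b = src[2];
        dst[0] = b;
        dst[1] = g;
        dst[2] = r;
        src += 3;
        dst += 3;
    }
}
```

Each iteration reads all of its input before writing anything, so passing the same buffer as both `src` and `dst` should also work for an in-place conversion.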
You can do even better with NEON intrinsics or inline assembly.
See this link from ARM's website for how the 24-bit swapping is done.
The BGR-to-RGB conversion can be done in place like this:
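void neon_asm_convert_BGR_TO_RGB(uint8_t* img, int numPixels24)
{
    // numPixels24 is the number of 24-byte blocks (8 pixels each),
    // i.e. the pixel count divided by 8; it must be greater than 0.
    // 32-bit ARM (ARMv7) NEON only.
    __asm__ volatile(
        "0: \n"
        // load 3 64-bit regs with de-interleave (8 pixels)
        "vld3.8 {d0,d1,d2}, [%0] \n"
        // swap d0 and d2 - R and B
        "vswp d0, d2 \n"
        // store 3 64-bit regs with interleave and advance the pointer
        "vst3.8 {d0,d1,d2}, [%0]! \n"
        "subs %1, %1, #1 \n"
        "bne 0b \n"
        // both operands are modified by the loop, so they are read-write
        : "+r"(img), "+r"(numPixels24)
        :
        : "d0", "d1", "d2", "cc", "memory"
    );
}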
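The same load, swap and store pattern can also be expressed with NEON intrinsics instead of inline assembly. The sketch below is an illustration rather than a tuned implementation: the function name `swap_rb_rgb24_inplace` is invented for the example, and the NEON path is guarded by `__ARM_NEON` so that the code falls back to a plain byte loop on other targets and for the pixels left over after the 8-pixel blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: swap the first and third channel of packed 24-bit
   pixels in place (RGB <-> BGR). */
void swap_rb_rgb24_inplace(uint8_t *img, size_t num_pixels)
{
    assert(img != NULL || num_pixels == 0);
    size_t i = 0;
#if defined(__ARM_NEON)
    /* 8 pixels (24 bytes) per iteration */
    for (; i + 8 <= num_pixels; i += 8) {
        /* de-interleaving load: val[0], val[1], val[2] each hold one channel */
        uint8x8x3_t px = vld3_u8(img + 3 * i);
        uint8x8_t first = px.val[0];
        px.val[0] = px.val[2];
        px.val[2] = first;
        /* interleaving store back to the same location */
        vst3_u8(img + 3 * i, px);
    }
#endif
    /* scalar path: leftover pixels, or everything on non-NEON targets */
    for (; i < num_pixels; i++) {
        uint8_t first = img[3 * i];
        img[3 * i] = img[3 * i + 2];
        img[3 * i + 2] = first;
    }
}
```

Unlike the inline assembly, this version takes the pixel count directly and accepts any count, including zero.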