Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to speed up libjpeg decompression

We are using libjpeg for JPEG decoding on our small embedded platform. We have problems with speed when we decode large images. For example, image which is 20 MB large and has dimensions of 5000x3000 pixels needs 10 seconds to load.

I need some tips on how to improve decoding speed. On other platform with similar performance, I have the same image load in two seconds.

Best decrease from 14 seconds to 10 seconds we got from using larger read buffer (64 kB instead of default 4 kB). But nothing else helped.

We do not need to display image in full resolution, so we use scale_num and scale_denom to display it in smaller size. But I would like to have more performance. Is it possible to use some kind of multithreading etc.? Different decoding settings? Anything, I ran of ideas.

like image 384
Bogi Avatar asked Feb 14 '23 17:02

Bogi


1 Answers

First - profile the code. You're left with little more than speculation if you cannot definitively identify the bottlenecks.

Next, scour the documentation for libjpeg speedup opportunities. You mentioned scale_num and scale_denom. What about the decompressor's dct_method? I've found that the DCT_FASTEST option is good. There are other options to check: do_fancy_upsampling, do_block_smoothing, dither_mode, two_pass_quantize, etc. Some of all of these may be useful to you, depending on your system, libjpeg version, etc.

If profiling tools are unavailable, there are still some things to try. First, I suspect your bottlenecks are non-CPU related. To confirm, load the uncompressed image into a RAM buffer, then decompress it from there as you have been. Did that significantly improve the decompression time? If so, the culprit would appear to be the read operation from your image storage media. Depending on your system, reading from USB (or SD, etc) can be slow. (Note that I'm assuming a read from external media - although hardware details are scant.) Be sure to optimize relevant bus parameters, as well (SPI clocks, configurations, etc).

If you are reading from something like internal flash (i.e. NAND), there are some other things to inspect. How is your NAND controller configured? Have you ensured that the controller is configured for the fastest operation? Check wait states, timings, etc. Note that bus and/or memory contention can be an issue, too - so inspect their respective configurations, as well.

Finally, if you believe your system is actually CPU bound, this stackoverflow question may be of interest: Can a high-performance jpeglib-turbo implmentation decompress/compress in <100ms?

like image 156
Throwback1986 Avatar answered Feb 20 '23 00:02

Throwback1986