
Image loading performance problems with Python and GObject

I have a script with a GTK (GObject) interface that I use for posting to my photo blog.

I'm trying to improve its responsiveness by loading the images in a background thread.

I've had no luck trying to populate GdkPixbuf objects from a background thread; everything I've tried just jams solid.

So as an alternative I thought I'd read the files in the background thread and then push them into GdkPixbufs on demand. This approach has yielded some surprising and rather depressing performance results, which make me wonder if I'm doing something grossly wrong.
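For reference, the "read in the background, decode on demand" idea described above might look roughly like the sketch below. It is written for Python 3 and deliberately contains no GTK calls; the names `read_files_in_background` and `start_reader` are made up for illustration and are not part of the script in question.

```python
import queue
import threading


def read_files_in_background(paths, results):
    """Read each file's raw bytes and put (path, data) on the results queue."""
    for path in paths:
        try:
            with open(path, "rb") as f:  # binary mode: JPEG data is not text
                data = f.read()
        except OSError as exc:
            data = exc  # pass the error to the consumer instead of losing it
        results.put((path, data))
    results.put(None)  # sentinel: no more files will follow


def start_reader(paths):
    """Start a daemon thread that reads the files; return the queue it fills."""
    results = queue.Queue()
    worker = threading.Thread(
        target=read_files_in_background,
        args=(list(paths), results),
        daemon=True,
    )
    worker.start()
    return results
```

The GTK main thread would then take `(path, data)` pairs off the queue and hand the bytes to a PixbufLoader itself, so that no GdkPixbuf object is ever touched from the worker thread.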

I'm playing with lightly compressed JPEGs off my camera; they tend to be around 3.8 MB.

Here's the original blocking image load:

pb = GdkPixbuf.Pixbuf.new_from_file(image_file)

This averages about 550 ms. That's not huge, but it gets rather tedious if you want to flick through a dozen images.

Then I split it up, here's the file read:

data = bytearray(open(self.image_file, 'rb').read())

This averages 15 ms. That's really nice, but also kind of worrying: if we can read the file in 15 ms, what are the other 535 ms being spent on?

Incidentally, the bytearray call exists because the PixbufLoader wouldn't accept the data otherwise.

And then the Pixbuf load:

pbl = GdkPixbuf.PixbufLoader()
pbl.write(data, len(data))
pbl.close()
pb = pbl.get_pixbuf()

This averages around 1400 ms, which is nearly 3 times as long as letting Gtk do it all.
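To compare the stages on an equal footing, a small timing helper along these lines could be used. This is only a sketch for Python 3; `average_ms` and `read_bytes` are hypothetical names, not functions from the script above. Note that repeated reads of the same file are probably served from the operating system's file cache, which may be part of why the raw read looks so cheap.

```python
import time


def average_ms(func, repeats=10):
    """Return the mean wall-clock time of func() in milliseconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        func()
    return (time.perf_counter() - start) * 1000.0 / repeats


def read_bytes(path):
    """Read a whole file as bytes (the 'file read' stage on its own)."""
    with open(path, "rb") as f:
        return f.read()
```

For example, `average_ms(lambda: read_bytes(image_file))` times the read alone, and wrapping the `Pixbuf.new_from_file` call or the PixbufLoader sequence in a function lets each of them be timed the same way.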

Am I doing something wrong here?

Gordon Wrigley asked Apr 30 '11 05:04


2 Answers

My guess: you are doing something wrong. I've just compared libjpeg-turbo with gdk.PixbufLoader and found virtually no speed differences. The code I used is below.

For libjpeg-turbo (jpegload.c):

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

#include <jpeglib.h>

void decompress(FILE* fd)
{
  JSAMPARRAY buffer;
  int row_stride;
  struct jpeg_decompress_struct cinfo;
  struct jpeg_error_mgr jerr;
  cinfo.err = jpeg_std_error(&jerr);
  jpeg_create_decompress(&cinfo);
  jpeg_stdio_src(&cinfo, fd);
  jpeg_read_header(&cinfo, TRUE);
  jpeg_start_decompress(&cinfo);
  row_stride = cinfo.output_width * cinfo.output_components;
  buffer = (*cinfo.mem->alloc_sarray)
                ((j_common_ptr) &cinfo, JPOOL_IMAGE, row_stride, 1);
  while (cinfo.output_scanline < cinfo.output_height) {
    (void) jpeg_read_scanlines(&cinfo, buffer, 1);
  }
  jpeg_finish_decompress(&cinfo);
  jpeg_destroy_decompress(&cinfo);
}

int main(int argc, char** argv)
{
  long len;
  FILE *fd;
  unsigned char *buf;
  struct timeval start, end;
  int i;
  const int N = 100;
  int delta;

  /* read file to cache it in memory */
  assert(argc == 2);
  fd = fopen(argv[1], "rb");
  fseek(fd, 0, SEEK_END);
  len = ftell(fd);
  rewind(fd);
  buf = malloc(len);
  assert(buf != NULL);
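  /* do the read outside assert() so it still runs when NDEBUG is defined */
  if (fread(buf, 1, len, fd) != (size_t) len) {
    perror("fread");
    return 1;
  }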

  gettimeofday(&start, NULL);
  for(i = 0; i < N; i++) {
    rewind(fd);
    decompress(fd);
  }
  gettimeofday(&end, NULL);
  /* elapsed wall-clock time in milliseconds; assigning here (rather than
     using +=) avoids reading delta uninitialized when both timestamps
     fall within the same second */
  delta = (end.tv_sec - start.tv_sec) * 1000
        + (end.tv_usec - start.tv_usec) / 1000;
  printf("time spent in decompression: %d msec\n",
         delta/N);
}

For Python with gtk.gdk (gdk_load.py):

import sys
import gtk
import time

def decompress(data):
    pbl = gtk.gdk.PixbufLoader()
    pbl.write(data, len(data))
    pbl.close()
    return pbl.get_pixbuf()

data = open(sys.argv[1], 'rb').read()

N = 100
start = time.time()
for i in xrange(N):
    decompress(data)
end = time.time()
print "time spent in decompression: %d msec" % int((end - start) * 1000 / N)

Test run results:

$ gcc jpegload.c -ljpeg
$ ./a.out DSC_8450.JPG 
time spent in decompression: 75 msec
$ python gdk_load.py DSC_8450.JPG 
time spent in decompression: 75 msec
$ identify DSC_8450.JPG 
DSC_8450.JPG JPEG 3008x2000 3008x2000+0+0 8-bit DirectClass 2.626MB 0.000u 0:00.019

EDIT: and another test, using gi.repository this time (gi_load.py):

import sys
import time
from gi.repository import GdkPixbuf

def decompress(filename):
    pb = GdkPixbuf.Pixbuf.new_from_file(filename)
    return pb

N = 100
start = time.time()
for i in xrange(N):
    decompress(sys.argv[1])
end = time.time()
print "time spent in decompression: %d msec" % int((end - start) * 1000 / N)

And results:

$ python gi_load.py DSC_8450.JPG 
time spent in decompression: 74 msec

GdkPixbuf.PixbufLoader using gi.repository really is much, much slower than "pure" gtk.gdk. Code:

import sys
import time
from gi.repository import GdkPixbuf

def decompress(data):
    pbl = GdkPixbuf.PixbufLoader()
    pbl.write(data, len(data))
    pbl.close()
    return pbl.get_pixbuf()

data = bytearray(open(sys.argv[1], 'rb').read())

N = 100
start = time.time()
for i in xrange(N):
    decompress(data)
end = time.time()
print "time spent in decompression: %d msec" % int((end - start) * 1000 / N)

Results:

$ python gi_load.py DSC_8450.JPG 
time spent in decompression: 412 msec

But GdkPixbuf.Pixbuf.new_from_file is as fast as the pure C version even when using gi.repository, so you are still either doing something wrong or expecting too much.

abbot answered Oct 01 '22 11:10


I have developed a small image viewer with pygtk. I use PixbufLoader, but I feed only N bytes per write(). In combination with idle_add(), I can load an image in the background while the application still responds to user input.

Here is the source: http://guettli.sourceforge.net/gthumpy/src/ImageCache.py
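The chunked approach can be sketched as below. This is not the code from ImageCache.py, just an illustration in Python 3: `ChunkedFeeder` is a made-up name, and `loader` is assumed to be any object with `write(chunk)` and `close()` methods. With older bindings, such as the ones used in the question, `write` may also need a length argument.

```python
class ChunkedFeeder:
    """Feed `data` to `loader` one chunk per step() call."""

    def __init__(self, loader, data, chunk_size=64 * 1024, on_done=None):
        self.loader = loader
        self.data = data
        self.chunk_size = chunk_size
        self.on_done = on_done  # optional callback, called with the loader
        self.offset = 0

    def step(self):
        """Write one chunk; return True while more data remains."""
        chunk = self.data[self.offset:self.offset + self.chunk_size]
        if chunk:
            self.loader.write(chunk)
            self.offset += len(chunk)
        if self.offset < len(self.data):
            return True  # tells the idle source to call step() again
        self.loader.close()
        if self.on_done is not None:
            self.on_done(self.loader)
        return False  # tells the idle source to stop calling step()
```

Registering `feeder.step` with idle_add() (for example `GLib.idle_add(feeder.step)` when using gi.repository) lets the main loop handle user input between chunks, because the callback is repeated only for as long as it returns True. A smaller chunk size keeps the interface more responsive at the cost of more callback overhead.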

guettli answered Oct 01 '22 13:10