 

Efficient TIFF tile extraction in C++

I am working with large TIFF images (about 1 GB each) of around 20000 x 20000 pixels. I need to extract several tiles (about 300 x 300 pixels each) from random positions in these images.

I tried the following solutions:

  • Libtiff (the only low-level library I could find) offers TIFFReadScanline(), but that means reading around 19700 unnecessary pixels per scanline.

  • I implemented my own TIFF reader, which extracts a tile from the image without reading unnecessary pixels. I expected it to be faster, but doing a seekg for every line of the tile makes it very slow. I also tried reading all the lines of the file that contain my tile into a buffer and then extracting the tile from that buffer, but the results are more or less the same.

I'd like to receive suggestions that would improve my tile extraction tool!

Everything is welcome: a more efficient library I could use, tips about C/C++ I/O, a higher-level strategy for my needs, etc.

Regards, Juan

Juan asked Oct 30 '09 17:10



2 Answers

[Major edit 14 Jan 10]

I was a bit confused by your mention of tiles, since the TIFF is not tiled.

I do use tiled/pyramidal TIFF images, which I created with VIPS:

vips im_vips2tiff source_image output_image.tif:none,tile:256x256,pyramid

I think you can create a flat (non-pyramidal) tiled TIFF with:

vips im_vips2tiff source_image output_image.tif:none,tile:256x256,flat

You may want to experiment with the tile size. You can then read individual tiles using TIFFReadEncodedTile.
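To make the tile lookup concrete, here is a minimal sketch of how one might work out which tiles a requested region touches. It assumes a single-plane (contiguous) image whose tiles are numbered left to right, top to bottom. `TileGrid` and `tiles_for_region` are hypothetical names for illustration and are not part of libtiff; each returned index is what you would then hand to TIFFReadEncodedTile.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a libtiff type): the layout of a tiled image.
struct TileGrid {
    std::uint32_t image_width;
    std::uint32_t image_height;
    std::uint32_t tile_width;
    std::uint32_t tile_height;

    // Number of tile columns, rounding up for a partial tile at the right edge.
    std::uint32_t tiles_across() const {
        return (image_width + tile_width - 1) / tile_width;
    }
};

// Returns the row-major indices of every tile that overlaps the region
// [x, x + w) x [y, y + h). Assumes one plane and row-major tile numbering.
std::vector<std::uint32_t> tiles_for_region(const TileGrid& g,
                                            std::uint32_t x, std::uint32_t y,
                                            std::uint32_t w, std::uint32_t h) {
    assert(w > 0 && h > 0);
    assert(x + w <= g.image_width && y + h <= g.image_height);

    const std::uint32_t first_col = x / g.tile_width;
    const std::uint32_t last_col  = (x + w - 1) / g.tile_width;
    const std::uint32_t first_row = y / g.tile_height;
    const std::uint32_t last_row  = (y + h - 1) / g.tile_height;

    std::vector<std::uint32_t> indices;
    for (std::uint32_t row = first_row; row <= last_row; ++row) {
        for (std::uint32_t col = first_col; col <= last_col; ++col) {
            indices.push_back(row * g.tiles_across() + col);
        }
    }
    return indices;
}
```

With 256x256 tiles, a 300x300 region touches at most 3x3 = 9 tiles (589,824 pixels), whereas reading whole scanlines of a 20000-pixel-wide image means decoding 300 x 20000 = 6,000,000 pixels for the same region.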

Multi-resolution storage using pyramidal TIFFs is much faster if you need to zoom in and out. You can also use it to show a coarse image almost immediately, followed by the detailed picture.

After switching to (appropriately sized) tiled storage (which will bring you MASSIVE performance improvements for random access!), your bottleneck will be disk I/O. A file reads much faster when it is read sequentially. Here, mmapping may be the solution.

Some useful links:

  • VIPS

  • IIPImage

  • LibTiff.NET

  • stackoverflow

VIPS is an image handling library that can do much more than just read and write. It has its own, very efficient internal format, and it has good documentation on its algorithms. For one, it decouples processing from the filesystem, thereby allowing tiles to be cached.

IIPImage is a multi-zoom webserver/browser library. I found its documentation a very good source of information on multi-resolution imaging (like Google Maps).

The other solution on this page, using mmap, is efficient only for 'small' files. I've often hit the 32-bit boundaries. Generally, allocating a 1 GByte chunk of memory will fail on a 32-bit OS (even with 4 GBytes of RAM installed) because even virtual memory gets fragmented after one or two application runs. Still, there is sufficient memory to cache parts or the whole of the image. More memory = more performance.
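One way around the address-space limit is to map only the byte range you need rather than the whole file. The sketch below is POSIX-only and assumes you already know the byte range. `MappedWindow` is a hypothetical name. mmap requires the file offset to be a multiple of the page size, so the start is rounded down and the extra bytes are skipped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical RAII wrapper: read-only mapping of the byte range [begin, end).
// On 32-bit glibc systems, large offsets would also need a 64-bit off_t
// (for example by building with -D_FILE_OFFSET_BITS=64).
class MappedWindow {
public:
    MappedWindow(const std::string& path, std::uint64_t begin, std::uint64_t end) {
        assert(begin < end);
        const int fd = ::open(path.c_str(), O_RDONLY);
        if (fd < 0) throw std::runtime_error("open failed: " + path);

        struct stat st {};
        if (::fstat(fd, &st) != 0 || end > static_cast<std::uint64_t>(st.st_size)) {
            ::close(fd);
            throw std::runtime_error("window lies outside file: " + path);
        }

        // Round the start down to a page boundary, as mmap requires.
        const auto page = static_cast<std::uint64_t>(::sysconf(_SC_PAGESIZE));
        const std::uint64_t aligned = begin - begin % page;
        slack_ = static_cast<std::size_t>(begin - aligned);
        length_ = static_cast<std::size_t>(end - aligned);

        void* p = ::mmap(nullptr, length_, PROT_READ, MAP_PRIVATE, fd,
                         static_cast<off_t>(aligned));
        ::close(fd);  // the mapping stays valid after the descriptor is closed
        if (p == MAP_FAILED) throw std::runtime_error("mmap failed: " + path);
        base_ = static_cast<const std::uint8_t*>(p);
    }

    ~MappedWindow() { ::munmap(const_cast<std::uint8_t*>(base_), length_); }

    MappedWindow(const MappedWindow&) = delete;
    MappedWindow& operator=(const MappedWindow&) = delete;

    // Pointer to the byte at file offset 'begin'.
    const std::uint8_t* data() const { return base_ + slack_; }
    // Number of requested bytes, i.e. end - begin.
    std::size_t size() const { return length_ - slack_; }

private:
    const std::uint8_t* base_ = nullptr;
    std::size_t slack_ = 0;   // bytes between the page boundary and 'begin'
    std::size_t length_ = 0;  // total mapped length, including the slack
};
```

For an uncompressed, row-major raster, the window covering a 300-row tile of a 20000-pixel-wide image is about 6 MB at one byte per pixel, which is far easier to fit into a fragmented 32-bit address space than 1 GB.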

Adriaan answered Sep 21 '22 01:09



Just mmap your file.

http://www.kernel.org/doc/man-pages/online/pages/man2/mmap.2.html
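As a rough illustration of that suggestion, the sketch below maps a whole file read-only and copies a rectangular region out of it. It is POSIX-only and assumes the pixel data is an uncompressed, row-major raster starting at a known byte offset. `MappedFile`, `copy_region`, and the `pixel_offset` parameter are hypothetical names, and a real TIFF would need its header and strip offsets parsed first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical RAII wrapper: read-only mapping of an entire file.
class MappedFile {
public:
    explicit MappedFile(const std::string& path) {
        const int fd = ::open(path.c_str(), O_RDONLY);
        if (fd < 0) throw std::runtime_error("open failed: " + path);

        struct stat st {};
        if (::fstat(fd, &st) != 0 || st.st_size == 0) {
            ::close(fd);
            throw std::runtime_error("cannot map empty or unreadable file: " + path);
        }
        size_ = static_cast<std::size_t>(st.st_size);

        void* p = ::mmap(nullptr, size_, PROT_READ, MAP_PRIVATE, fd, 0);
        ::close(fd);  // the mapping stays valid after the descriptor is closed
        if (p == MAP_FAILED) throw std::runtime_error("mmap failed: " + path);
        data_ = static_cast<const std::uint8_t*>(p);
    }

    ~MappedFile() { ::munmap(const_cast<std::uint8_t*>(data_), size_); }

    MappedFile(const MappedFile&) = delete;
    MappedFile& operator=(const MappedFile&) = delete;

    const std::uint8_t* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    const std::uint8_t* data_ = nullptr;
    std::size_t size_ = 0;
};

// Copies the w x h region at (x, y) out of an uncompressed row-major raster.
// Assumption: pixel data begins at byte 'pixel_offset' and rows are unpadded.
std::vector<std::uint8_t> copy_region(const MappedFile& f,
                                      std::size_t pixel_offset,
                                      std::size_t image_width,
                                      std::size_t bytes_per_pixel,
                                      std::size_t x, std::size_t y,
                                      std::size_t w, std::size_t h) {
    assert(w > 0 && h > 0 && x + w <= image_width);
    const std::size_t row_bytes = image_width * bytes_per_pixel;
    const std::size_t out_row_bytes = w * bytes_per_pixel;
    // The last byte touched must lie inside the mapping.
    assert(pixel_offset + (y + h - 1) * row_bytes + (x + w) * bytes_per_pixel
           <= f.size());

    std::vector<std::uint8_t> out(out_row_bytes * h);
    for (std::size_t r = 0; r < h; ++r) {
        const std::uint8_t* src =
            f.data() + pixel_offset + (y + r) * row_bytes + x * bytes_per_pixel;
        std::memcpy(out.data() + r * out_row_bytes, src, out_row_bytes);
    }
    return out;
}
```

Only the pages that the row copies actually touch are read from disk, so the per-row seek calls disappear, although each row of the tile still lives on a different part of the file.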

Gaetano Mendola answered Sep 18 '22 01:09
