
Self-describing file format for gigapixel images?

Tags: image, tiles, tiff

In medical imaging, there appear to be two ways of storing huge gigapixel images:

  1. Use lots of JPEG images (either packed into files or individually) and cook up some bizarre index format to describe what goes where. Tack on some metadata in some other format.

  2. Use TIFF's tile and multi-image support to cleanly store the images as a single file, and provide downsampled versions for zooming speed. Then abuse various TIFF tags to store metadata in non-standard ways. Also, store tiles with overlapping boundaries that must be individually translated later.

In both cases, the reader must understand the format well enough to understand how to draw things and read the metadata.

Is there a better way to store these images? Is TIFF (or BigTIFF) still the right format for this? Does XMP solve the problem of metadata?

The main issues are:

  • Storing images in a way that allows for rapid random access (tiling)
  • Storing downsampled images for rapid zooming (pyramid)
  • Handling cases where tiles are overlapping or sparse (scanners often work by moving a camera over a slide in 2D and capturing only where there is something to image)
  • Storing important metadata, including associated images like a slide's label and thumbnail
  • Support for lossy storage
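The tiling and pyramid requirements above come down to simple arithmetic. Here is a minimal sketch, assuming 256-pixel square tiles and a pyramid that halves each dimension per level. Both choices are illustrative assumptions, not taken from any particular format, and the function names `pyramid_levels` and `tiles_for_region` are hypothetical:

```python
def pyramid_levels(width, height, tile=256):
    """List (level, width, height, tiles_x, tiles_y) for each pyramid level.

    Assumption: each level halves the previous one (rounding up), and the
    pyramid stops once the whole image fits in a single tile.
    """
    levels = []
    level, w, h = 0, width, height
    while True:
        tiles_x = -(-w // tile)  # integer ceiling division
        tiles_y = -(-h // tile)
        levels.append((level, w, h, tiles_x, tiles_y))
        if tiles_x == 1 and tiles_y == 1:
            break
        w = max(1, -(-w // 2))
        h = max(1, -(-h // 2))
        level += 1
    return levels


def tiles_for_region(x, y, w, h, tile=256):
    """Return the (col, row) indices of tiles intersecting a pixel rectangle.

    This is what makes random access cheap: a viewer only has to decode
    the tiles that overlap the visible region.
    """
    first_col, first_row = x // tile, y // tile
    last_col, last_row = (x + w - 1) // tile, (y + h - 1) // tile
    return [(c, r)
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]
```

For a 1000 x 600 image this gives three levels (4x3, 2x2 and 1x1 tiles). Sparse coverage could be modelled by storing only the tiles that exist, for example in a dict keyed by `(col, row)`, and treating missing keys as background.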

What kind of (hopefully non-proprietary) formats do people use to store large aerial photographs or maps? These images have similar properties.

Adam Goode asked Dec 15 '09 05:12



4 Answers

It seems like starting with TIFF or BigTIFF and defining a useful subset of tags + XMP metadata might be the way to go. FITS is no good since it is basically for lossless data and doesn't have a very appropriate metadata mechanism.

The problem with TIFF is that it just allows too much flexibility, but a subset of TIFF should be acceptable.
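A reader that enforces such a subset would probably start by checking which flavor of TIFF it has been handed. A minimal sketch, assuming only the first four bytes of the file are available. The function name `tiff_flavor` is hypothetical. The byte-order marks (`II`, `MM`) and the version numbers (42 for classic TIFF, 43 for BigTIFF) come from the respective specifications:

```python
import struct


def tiff_flavor(header: bytes) -> str:
    """Classify the first bytes of a file as 'tiff', 'bigtiff' or 'unknown'."""
    if len(header) < 4:
        return "unknown"
    if header[:2] == b"II":      # little-endian ("Intel") byte order
        endian = "<"
    elif header[:2] == b"MM":    # big-endian ("Motorola") byte order
        endian = ">"
    else:
        return "unknown"
    # The 16-bit version number follows the byte-order mark.
    (version,) = struct.unpack(endian + "H", header[2:4])
    if version == 42:
        return "tiff"
    if version == 43:
        return "bigtiff"
    return "unknown"
```

A subset definition would then go on to restrict the allowed tags, compression schemes and tile layouts. This check covers only the container flavor.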

The solution may very well be http://ome-xml.org/ and http://ome-xml.org/wiki/OmeTiff.

It looks like DICOM now has support: ftp://medical.nema.org/MEDICAL/Dicom/Final/sup145_ft.pdf

Adam Goode answered Nov 13 '22 08:11



You probably want FITS.

  • Arbitrary size
  • 1--3 dimensional data
  • Extensive header
  • Widely used in astronomy and endorsed by NASA and the IAU

dmckee --- ex-moderator kitten answered Nov 13 '22 09:11



I'm a pathologist (and hobbyist programmer), so virtual slides and digital pathology are a huge interest of mine. You may be interested in the OpenSlide project. They have characterized a number of the proprietary formats from the large vendors (Aperio, BioImagene, etc). Most seem to consist of large pyramidal TIFF files (zoom levels scanned at different microscope objectives, of course) containing multiple tiled TIFFs or compressed (JPEG or JPEG 2000) images.

DrBloodmoney answered Nov 13 '22 09:11



The industry standard is DICOM Supplement 145. Vendor adoption has been sluggish, but inventing yet another format would probably not help.

David Clunie answered Nov 13 '22 07:11
