How to list an image sequence in an efficient way? Numerical sequence comparison in Python

Tags: python, regex, glob

I have a directory of 9 images:

image_0001, image_0002, image_0003
image_0010, image_0011
image_0011-1, image_0011-2, image_0011-3
image_9999

I would like to be able to list them in an efficient way, like this (4 entries for 9 images):

(image_000[1-3], image_00[10-11], image_0011-[1-3], image_9999)

Is there a way in Python to summarize a directory of images in a short, clear way (without listing every file)?

So, possibly something like this:

List all images, sort them numerically, then build a list by counting each image in sequence from the start. When an image is missing, start a new list, and continue until the original file list is exhausted. That should leave me with a set of lists, each containing an unbroken sequence.
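
The steps described above could be sketched roughly like this, in Python 3 and on plain integers rather than file names. The function name split_runs is made up for illustration, and the sketch assumes there are no duplicate numbers:

```python
def split_runs(numbers):
    """Split integers into lists of consecutive values (sorted first)."""
    runs = []
    for n in sorted(numbers):
        if runs and n == runs[-1][-1] + 1:
            runs[-1].append(n)   # n continues the current run
        else:
            runs.append([n])     # gap found: start a new run
    return runs

print(split_runs([1, 2, 3, 10, 11, 100, 9999]))
# [[1, 2, 3], [10, 11], [100], [9999]]
```

Each inner list is one unbroken sequence, so only its first and last element are needed to describe it.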

I'm trying to make a list of numbers easy to read and describe. If I had a sequence of 1000 consecutive files, it could be clearly listed as file[0001-1000] rather than file['0001','0002','0003', etc.]

Edit 1 (based on a suggestion): Given a flattened list, how would you derive the glob patterns?

Edit 2: I'm trying to break the problem down into smaller pieces. Here is an example of part of the solution: data1 works, data2 returns 0010 as 8 and 0100 as 64, and data3 (the real-world data) doesn't work:

# Find runs of consecutive numbers using groupby.  The key to the solution
# is differencing with a range so that consecutive numbers all appear in
# same group.
from operator import itemgetter
from itertools import *

data1=[01,02,03,10,11,100,9999]
data2=[0001,0002,0003,0010,0011,0100,9999]
data3=['image_0001','image_0002','image_0003','image_0010','image_0011','image_0011-2','image_0011-3','image_0100','image_9999']

list1 = []
for k, g in groupby(enumerate(data1), lambda (i,x):i-x):
    list1.append(map(itemgetter(1), g))
print 'data1'
print list1

list2 = []
for k, g in groupby(enumerate(data2), lambda (i,x):i-x):
    list2.append(map(itemgetter(1), g))
print '\ndata2'
print list2

returns:

data1
[[1, 2, 3], [10, 11], [100], [9999]]

data2
[[1, 2, 3], [8, 9], [64], [9999]]
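
The data2 result comes from Python 2 reading integer literals with a leading zero as octal, so 0010 is 8 and 0100 is 64 (Python 3 rejects such literals with a SyntaxError). One possible workaround, sketched here in Python 3, is to keep the zero-padded numbers as strings and convert them with int() only when comparing. The names padded and consecutive_runs are illustrative, and the input is assumed to be sorted already:

```python
from itertools import groupby

padded = ["0001", "0002", "0003", "0010", "0011", "0100", "9999"]

def consecutive_runs(strings):
    """Group sorted, zero-padded number strings into consecutive runs."""
    runs = []
    # index - value stays constant within a run of consecutive numbers
    for _, group in groupby(enumerate(strings),
                            key=lambda pair: pair[0] - int(pair[1])):
        runs.append([s for _, s in group])
    return runs

print(consecutive_runs(padded))
# [['0001', '0002', '0003'], ['0010', '0011'], ['0100'], ['9999']]
```

Because the strings are never turned into literals, the padding survives and int("0010") is parsed as decimal 10.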
user178686 asked Oct 13 '10 18:10

1 Answer

Here is a working implementation of what you want to achieve, using the code you added as a starting point:

#!/usr/bin/env python

import itertools
import re

# This algorithm only works if DATA is sorted.
DATA = ["image_0001", "image_0002", "image_0003",
        "image_0010", "image_0011",
        "image_0011-1", "image_0011-2", "image_0011-3",
        "image_0100", "image_9999"]

def extract_number(name):
    # Match the last number in the name and return it as a string,
    # including leading zeroes (that's important for formatting below).
    return re.findall(r"\d+$", name)[0]

def collapse_group(group):
    if len(group) == 1:
        return group[0][1]  # Unique names collapse to themselves.
    first = extract_number(group[0][1])  # Fetch range
    last = extract_number(group[-1][1])  # of this group.
    # Cheap way to compute the string length of the upper bound,
    # discarding leading zeroes.
    length = len(str(int(last)))
    # Now we have the length of the variable part of the names,
    # the rest is only formatting.
    return "%s[%s-%s]" % (group[0][1][:-length],
        first[-length:], last[-length:])

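# The key uses index access instead of tuple-parameter unpacking so that
# the code runs on both Python 2 and Python 3.
groups = [collapse_group(tuple(group))
    for key, group in itertools.groupby(enumerate(DATA),
        lambda pair: pair[0] - int(extract_number(pair[1])))]

print(groups)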

This prints ['image_000[1-3]', 'image_00[10-11]', 'image_0011-[1-3]', 'image_0100', 'image_9999'], which is what you want.
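
The grouping key above depends only on the position and the trailing number, so it needs sorted input and could in principle merge unrelated names whose numbers happen to line up (for example "a_1" followed by "b_2"). A possible variant, sketched in Python 3, sorts the names itself and also groups by the text before the number and by the number's width. The name collapse_names is hypothetical and this is not part of the original answer:

```python
import re
from itertools import groupby

def collapse_names(names):
    """Collapse names ending in digits into 'stem[first-last]' entries."""
    parsed = []
    for name in set(names):
        match = re.search(r"\d+$", name)
        digits = match.group() if match else ""
        prefix = name[:len(name) - len(digits)]
        parsed.append((prefix, len(digits), int(digits or 0), name))
    parsed.sort()  # by prefix, then digit width, then numeric value

    def run_key(pair):
        position, (prefix, width, value, _) = pair
        # value - position is constant within one consecutive run
        return prefix, width, value - position

    collapsed = []
    for _, run in groupby(enumerate(parsed), key=run_key):
        run = [item for _, item in run]
        first, last = run[0][3], run[-1][3]
        if len(run) == 1:
            collapsed.append(first)
            continue
        # Number of trailing characters that vary within the run
        varying = len(str(run[-1][2]))
        collapsed.append("%s[%s-%s]" % (first[:-varying],
                                        first[-varying:], last[-varying:]))
    return collapsed

print(collapse_names(["image_0011-2", "image_0002", "image_9999",
                      "image_0001", "image_0011-1", "image_0003",
                      "image_0010", "image_0011-3", "image_0011",
                      "image_0100"]))
```

With this ordering the "image_0011-" entries come after all plain "image_" entries, so the output order differs from the list printed above even though the patterns are the same.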

HISTORY: I initially answered the question backwards, as @Mark Ransom pointed out below. For the sake of history, my original answer was:

You're looking for glob. Try:

import glob
images = glob.glob("image_[0-9]*")

Or, using your example:

images = [glob.glob(pattern) for pattern in ("image_000[1-3]*",
    "image_00[10-11]*", "image_0011-[1-3]*", "image_9999*")]
images = [image for seq in images for image in seq]  # flatten the list
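
One caveat about the patterns above: in glob syntax, square brackets denote a set of single characters, not a numeric range, so "[10-11]" means "one character that is 0 or 1". The short check below uses fnmatch.fnmatchcase, which applies the same shell-style matching rules to plain strings, to illustrate this. The sample names are chosen only for the demonstration:

```python
from fnmatch import fnmatchcase

pattern = "image_00[10-11]*"
for name in ("image_0010", "image_0011", "image_0001", "image_0020"):
    print(name, fnmatchcase(name, pattern))
# image_0010 True
# image_0011 True
# image_0001 True
# image_0020 False
```

So the bracket notation in the question works as a human-readable summary, but it cannot be passed to glob unchanged and be expected to select exactly that numeric range.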
Frédéric Hamidi answered Oct 03 '22 00:10
