How to list an image sequence in an efficient way? Numercial sequence comparison in Python

Tags:

I have a directory of 9 images:

image_0001, image_0002, image_0003
image_0010, image_0011
image_0011-1, image_0011-2, image_0011-3
image_9999

I would like to be able to list them in an efficient way, like this (4 entries for 9 images):

(image_000[1-3], image_00[10-11], image_0011-[1-3], image_9999)

Is there a way in python, to return a directory of images, in a short/clear way (without listing every file)?

So, possibly something like this:

list all images, sort numerically, create a list (counting each image in sequence from start). When an image is missing (create a new list), continue until original file list is finished. Now I should just have some lists that contain non broken sequences.

I'm trying to make it easy to read/describe a list of numbers. If I had a sequence of 1000 consecutive files It could be clearly listed as file[0001-1000] rather than file['0001','0002','0003' etc...]

Edit1(based on suggestion): Given a flattened list, how would you derive the glob patterns?

Edit2 I'm trying to break the problem down into smaller pieces. Here is an example of part of the solution: data1 works, data2 returns 0010 as 64, data3 (the realworld data) doesn't work:

# Find runs of consecutive numbers using groupby.  The key to the solution
# is differencing with a range so that consecutive numbers all appear in
# same group.
from operator import itemgetter
from itertools import *

data1=[01,02,03,10,11,100,9999]
data2=[0001,0002,0003,0010,0011,0100,9999]
data3=['image_0001','image_0002','image_0003','image_0010','image_0011','image_0011-2','image_0011-3','image_0100','image_9999']

list1 = []
for k, g in groupby(enumerate(data1), lambda (i,x):i-x):
    list1.append(map(itemgetter(1), g))
print 'data1'
print list1

list2 = []
for k, g in groupby(enumerate(data2), lambda (i,x):i-x):
    list2.append(map(itemgetter(1), g))
print '\ndata2'
print list2

returns:

data1
[[1, 2, 3], [10, 11], [100], [9999]]

data2
[[1, 2, 3], [8, 9], [64], [9999]]

296

asked Oct 13 '10 18:10

user178686

1 Answers

Here is a working implementation of what you want to achieve, using the code you added as a starting point:

#!/usr/bin/env python

import itertools
import re

# This algorithm only works if DATA is sorted.
DATA = ["image_0001", "image_0002", "image_0003",
        "image_0010", "image_0011",
        "image_0011-1", "image_0011-2", "image_0011-3",
        "image_0100", "image_9999"]

def extract_number(name):
    # Match the last number in the name and return it as a string,
    # including leading zeroes (that's important for formatting below).
    return re.findall(r"\d+$", name)[0]

def collapse_group(group):
    if len(group) == 1:
        return group[0][1]  # Unique names collapse to themselves.
    first = extract_number(group[0][1])  # Fetch range
    last = extract_number(group[-1][1])  # of this group.
    # Cheap way to compute the string length of the upper bound,
    # discarding leading zeroes.
    length = len(str(int(last)))
    # Now we have the length of the variable part of the names,
    # the rest is only formatting.
    return "%s[%s-%s]" % (group[0][1][:-length],
        first[-length:], last[-length:])

groups = [collapse_group(tuple(group)) \
    for key, group in itertools.groupby(enumerate(DATA),
        lambda(index, name): index - int(extract_number(name)))]

print groups

This prints ['image_000[1-3]', 'image_00[10-11]', 'image_0011-[1-3]', 'image_0100', 'image_9999'], which is what you want.

HISTORY: I initially answered the question backwards, as @Mark Ransom pointed out below. For the sake of history, my original answer was:

You're looking for glob. Try:

import glob
images = glob.glob("image_[0-9]*")

Or, using your example:

images = [glob.glob(pattern) for pattern in ("image_000[1-3]*",
    "image_00[10-11]*", "image_0011-[1-3]*", "image_9999*")]
images = [image for seq in images for image in seq]  # flatten the list

197

answered Oct 03 '22 00:10

Frédéric Hamidi

Related questions
                            
                                XML library similar to simplejson/json? - Python
                            
                                how to create URL extractor like facebook share
                            
                                How do you create a mdb database file in Python?
                            
                                Drawback to catch-all exception (at highest program level, followed by re-raising, just to log before exiting?)
                            
                                Which of `if x:` or `if x != 0:` is preferred in Python?
                            
                                GAE: Best way to determine how many of a Kind is stored?
                            
                                How can one find Class of calling function in python?
                            
                                Writing binary data to middle of a sparse file
                            
                                How to search through a gtk.ListStore in pyGTK and remove elements?
                            
                                Idiomatic way of taking action on attempt to loop over an empty iterable
                            
                                Running multiple instances of a python program efficiently & economically?
                            
                                How to break from a Python generator with open file handles
                            
                                Python - Make Script to Manipulate Windows File Paths but running on Linux
                            
                                Working around Python bug in different versions
                            
                                python 2.6.x theading / signals /atexit fail on some versions?
                            
                                Is it safe to replace MacOS X default Python interpreter?
                            
                                Python lxml and stdin
                            
                                Best way to change the value of "settings" from within a Python test case?
                            
                                MT940 format parser
                            
                                Installing the igraph package for python

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

How to list an image sequence in an efficient way? Numercial sequence comparison in Python

Tags:

python

regex

glob

user178686

People also ask

1 Answers

Frédéric Hamidi

Recent Activity

Donate For Us