
compare two lists of files, ignoring file extension in one list

Tags:

python

I have two lists

list1 = ['image1.png', 'image2.png', 'image3.png', 'image3.png']
list2 = ['image1.pdf', 'image2.eps', 'image3.ps']

I want to create a list containing the names from list1 whose base name (ignoring the extension) also appears in list2. For the example above, the correct answer would be:

['image1.png', 'image2.png', 'image3.png']

Any idea how to do this? Thanks, carl

asked Nov 30 '25 at 03:11 by carl


2 Answers

from os.path import splitext

list1 = ['image1.png', 'image2.png', 'image3.png', 'image3.png', 'image4.png', 'image3.jpg']
list2 = ['image1.pdf', 'image2.eps', 'image3.ps', 'image5.doc']

# Build a lookup set of the document names without their extensions.
documents = {splitext(filename)[0] for filename in list2}

# Keep each distinct filename from list1 whose extension-less name is in the
# lookup set. dict.fromkeys drops duplicates while preserving the original order
# (Python 3.7+), so the output below is reproducible.
matches = [filename for filename in dict.fromkeys(list1) if splitext(filename)[0] in documents]

print(matches)

Output:

['image1.png', 'image2.png', 'image3.png', 'image3.jpg']

Note that this needs adapting for files with multiple extensions such as .tar.gz, because splitext strips only the last extension. Using filename.partition(".")[0] instead would do the trick, but then the first dot always delimits the extension, so the base name itself can no longer contain dots.
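
To make the trade-off concrete, here is a small sketch comparing the two ways of stripping an extension. The sample filenames are made up for illustration and are not from the question.

```python
from os.path import splitext

# Hypothetical filenames chosen to show where the two approaches differ.
samples = ['image1.png', 'backup.tar.gz', 'my.holiday.photo.png']

# splitext removes only the last extension.
by_splitext = [splitext(name)[0] for name in samples]

# partition cuts at the first dot, so everything after it is dropped.
by_partition = [name.partition('.')[0] for name in samples]

print(by_splitext)   # ['image1', 'backup.tar', 'my.holiday.photo']
print(by_partition)  # ['image1', 'backup', 'my']
```

Which one is right depends on whether your filenames are more likely to carry compound extensions or dots inside the base name.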

answered Dec 02 '25 at 17:12 by Scott A


Use a list comprehension with set:

list1 = ["image1.png", "image2.png", "image3.png", "image3.png"]
list2 = ["image1.pdf", "image2.eps", "image3.ps"]

# sorted() makes the output order deterministic; a bare set has no guaranteed order.
print([x for x in sorted(set(list1)) for y in set(list2) if x.split('.')[0] == y.split('.')[0]])

Output:

['image1.png', 'image2.png', 'image3.png']
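
One thing to watch with the nested loop: if list2 holds two files with the same base name, the matching name from list1 appears once per match. The sketch below uses a made-up list2 (image1.pdf and image1.eps) to show this, along with one possible way around it that keeps the same split('.') rule. The names nested, stems, and unique are only illustrative.

```python
list1 = ["image1.png", "image2.png", "image3.png", "image3.png"]
# Hypothetical list2 with two documents sharing the base name 'image1'.
list2 = ["image1.pdf", "image1.eps", "image3.ps"]

# Nested loops emit 'image1.png' once for every matching entry in list2.
nested = [x for x in sorted(set(list1)) for y in set(list2)
          if x.split('.')[0] == y.split('.')[0]]

# Collecting the base names of list2 in a set yields each match only once.
stems = {y.split('.')[0] for y in list2}
unique = [x for x in sorted(set(list1)) if x.split('.')[0] in stems]

print(nested)  # ['image1.png', 'image1.png', 'image3.png']
print(unique)  # ['image1.png', 'image3.png']
```

The set lookup also avoids comparing every pair of names, which matters once the lists get long.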

answered Dec 02 '25 at 16:12 by Andrés Pérez-Albela H.