I am trying to create a sorted list of files in the ./pages
directory. This is what I have so far:
import numpy as np
from PIL import Image
import glob
from pathlib import Path
# sorted( l, key=lambda a: int(a.split("-")[1]) )
image_list = []
for filename in Path('./pages').glob('*.jpg'):
    # sorted( i, key=lambda a: int(a.split("_")[1]) )
    # im=Image.open(filename)
    image_list.append(filename)
print(*image_list, sep = "\n")
Current output:
pages/page_1.jpg
pages/page_10.jpg
pages/page_11.jpg
pages/page_12.jpg
pages/page_2.jpg
pages/page_3.jpg
pages/page_4.jpg
pages/page_5.jpg
pages/page_6.jpg
pages/page_7.jpg
pages/page_8.jpg
pages/page_9.jpg
Expected Output:
pages/page_1.jpg
pages/page_2.jpg
pages/page_3.jpg
pages/page_4.jpg
pages/page_5.jpg
pages/page_6.jpg
pages/page_7.jpg
pages/page_8.jpg
pages/page_9.jpg
pages/page_10.jpg
pages/page_11.jpg
pages/page_12.jpg
I've tried the solutions found in the duplicate, but they don't work because the paths returned by pathlib are objects, not strings. They only look like filenames when I print them.
For example:
print(filename) # pages/page_1.jpg
print(type(filename)) # <class 'pathlib.PosixPath'>
Finally, this is working code. Thanks to all.
from pathlib import Path
import numpy as np
from PIL import Image
import natsort
def merge_to_single_image():
    image_list1 = []
    image_list2 = []
    image_list3 = []
    image_list4 = []
    for filename in Path('./pages').glob('*.jpg'):
        image_list1.append(filename)
    for i in image_list1:
        image_list2.append(i.stem)
    # print(type(i.stem))
    image_list3 = natsort.natsorted(image_list2, reverse=False)
    for i in image_list3:
        i = str(i)+ ".jpg"
        image_list4.append(Path('./pages', i))
    images = [Image.open(i) for i in image_list4]
    # for a vertical stacking it is simple: use vstack
    images_combined = np.vstack(images)
    images_combined = Image.fromarray(images_combined)
    images_combined.save('Single_image.jpg')
You can use sorted with a lambda. For the sorting criterion, you can use os.path to first pull out just the file name (using basename), then split off the extension (using splitext). Lastly, convert to int so you sort numerically instead of lexicographically. For names like page_1.jpg, you also need to isolate the digits (for example, by splitting on the underscore) before calling int.
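A minimal sketch of that approach, assuming names of the form page_<number>.jpg as in the question. The helper name numeric_key and the sample list are illustrative, not part of the original answer:

```python
import os

def numeric_key(path):
    """Return the trailing integer of a name such as 'pages/page_12.jpg'."""
    name = os.path.basename(path)         # 'page_12.jpg'
    stem, _ext = os.path.splitext(name)   # ('page_12', '.jpg')
    return int(stem.rsplit("_", 1)[-1])   # 12

# Hypothetical sample data standing in for a real directory listing.
files = ["pages/page_10.jpg", "pages/page_2.jpg", "pages/page_1.jpg"]
ordered = sorted(files, key=lambda p: numeric_key(p))
print(*ordered, sep="\n")
```

os.path.basename also accepts Path objects, so the same key should work on the output of Path.glob without converting to strings first.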
pathlib is a standard-library Python module that provides an object-oriented API for working with files and directories. Path is its core class.
With pathlib, file paths are represented by Path objects instead of plain strings. These objects make path-handling code easier to read, especially because / is used to join paths together, and more powerful, because most of the methods and properties you need are available directly on the object.
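As a small illustration of those points, the sketch below joins components with / and reads a few properties. It uses PurePosixPath (an assumption made here so that it runs without touching the filesystem); Path exposes the same properties:

```python
from pathlib import PurePosixPath

# PurePosixPath does no filesystem access, so no ./pages directory is needed.
page = PurePosixPath("pages") / "page_12.jpg"   # '/' joins path components

print(page)          # pages/page_12.jpg
print(page.name)     # page_12.jpg
print(page.stem)     # page_12
print(page.suffix)   # .jpg
print(str(page))     # explicit conversion to a plain string
```

This is also why string methods such as split fail on these objects: the path has to be converted with str(), or accessed through properties like stem, first.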
pathlib is part of the standard library and is often considered more convenient than the string-based functions in os.path.
Just for posterity, maybe this is more succinct?
natsorted(list_of_pathlib_objects, key=str)
Note that sorted doesn't sort your data in place but returns a new list, so you have to iterate over its output.
To get your sorting key, which is the integer value at the end of your filename:
You can first take the stem of your path, which is its final component without the extension (for example, 'page_13').
Then it is better to split it once from the right, to be safe in case the first part of your filename contains other underscores, as in 'some_page_33.jpg'.
Once converted to int, you have the key you want for sorting.
So, your code could look like:
from pathlib import Path

for filename in sorted(Path('./pages').glob('*.jpg'),
                       key=lambda path: int(path.stem.rsplit("_", 1)[1])):
    print(filename)
Sample output:
pages/ma_page_2.jpg
pages/ma_page_11.jpg
pages/ma_page_13.jpg
pages/ma_page_20.jpg
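To try the key without a pages directory on disk, here is a sketch using made-up names. PurePosixPath and the helper name page_number are assumptions for illustration only; real code would keep Path('./pages').glob('*.jpg'):

```python
from pathlib import PurePosixPath

def page_number(path):
    """Integer after the last underscore of the stem, e.g. 'ma_page_11' -> 11."""
    return int(path.stem.rsplit("_", 1)[1])

# Hypothetical names; PurePosixPath avoids needing real files.
paths = [PurePosixPath("pages", name) for name in
         ("ma_page_20.jpg", "ma_page_2.jpg", "ma_page_13.jpg", "ma_page_11.jpg")]

original = list(paths)
ordered = sorted(paths, key=page_number)

print(*ordered, sep="\n")
print(paths == original)   # sorted() returned a new list; the input is unchanged
```

A stem without an underscore raises IndexError here, and a non-numeric suffix raises ValueError, so this key only suits directories where every name follows the pattern.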
You can also use the natsort library (pip install natsort), which keeps the code simple.
[Note: this works; tested with natsort versions 5.5 and 7.1 (current).]
from pathlib import Path
from natsort import natsorted

image_list = Path('./pages').glob('*.jpg')
# convert the paths to strings, sort them naturally, then convert back to Path objects
image_list = [Path(p) for p in natsorted([str(p) for p in image_list])]
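If adding a dependency is not an option, a rough stand-in for natural sorting can be written with the standard library. This is a simplified sketch and not natsort's algorithm; natural_key is a hypothetical helper, and PurePosixPath is used only so the example runs without real files:

```python
import re
from pathlib import PurePosixPath

def natural_key(path):
    """Split str(path) into text and digit chunks so digit runs compare as integers."""
    parts = re.split(r"(\d+)", str(path))
    # With a capturing group, odd positions hold the digit runs and even
    # positions hold the text between them, so types line up when compared.
    return [int(part) if index % 2 else part.lower()
            for index, part in enumerate(parts)]

# Hypothetical names standing in for Path('./pages').glob('*.jpg').
paths = [PurePosixPath("pages/page_10.jpg"),
         PurePosixPath("pages/page_2.jpg"),
         PurePosixPath("pages/page_1.jpg")]

ordered = sorted(paths, key=natural_key)
print(*ordered, sep="\n")
```

Because the key is built from str(path), the result is still a list of path objects and no conversion back from strings is needed.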