
Extract file name from path, no matter what the os/path format

Tags:

python

Which Python library can I use to extract filenames from paths, no matter what the operating system or path format could be?

For example, I'd like all of these paths to return me c:

    a/b/c/
    a/b/c
    \a\b\c
    \a\b\c\
    a\b\c
    a/b/../../a/b/c/
    a/b/../../a/b/c

BuZz asked Dec 05 '11 11:12



2 Answers

Actually, there's a function that returns exactly what you want:

    import os
    print(os.path.basename(your_path))

WARNING: When os.path.basename() is used on a POSIX system to get the base name from a Windows-style path (e.g. "C:\\my\\file.txt"), the entire path is returned.

Example below from an interactive Python shell running on a Linux host:

    Python 3.8.2 (default, Mar 13 2020, 10:14:16)
    [GCC 9.3.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import os
    >>> filepath = "C:\\my\\path\\to\\file.txt"  # A Windows style file path.
    >>> os.path.basename(filepath)
    'C:\\my\\path\\to\\file.txt'

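The behaviour in the warning can be reproduced on any host by calling the two platform-specific modules directly, rather than going through os.path. This is only a minimal sketch; the variable names are illustrative and not part of the original answer.

```python
import ntpath
import posixpath

# A Windows-style path, written with escaped backslashes.
windows_path = "C:\\my\\path\\to\\file.txt"

# posixpath only treats "/" as a separator, so the backslashes are
# ordinary filename characters and the whole string comes back.
posix_result = posixpath.basename(windows_path)

# ntpath treats both "\\" and "/" as separators.
nt_result = ntpath.basename(windows_path)

print(posix_result)
print(nt_result)
```

On Linux and macOS, os.path is posixpath; on Windows, it is ntpath. That is why the result of os.path.basename depends on where the script runs.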
stranac answered Oct 13 '22 22:10


Using os.path.split or os.path.basename as others suggest won't work in all cases: if you're running the script on Linux and attempt to process a classic Windows-style path, the backslashes won't be recognized as separators and the whole path comes back as the basename.

Windows paths can use either backslash or forward slash as path separator. Therefore, the ntpath module (which is equivalent to os.path when running on Windows) will work for all(1) paths on all platforms.

    import ntpath
    ntpath.basename("a/b/c")

Of course, if the path ends with a slash, the basename will be empty, so make your own function to deal with it:

    import ntpath

    def path_leaf(path):
        head, tail = ntpath.split(path)
        # tail is empty when the path ends with a separator; fall back to head
        return tail or ntpath.basename(head)

Verification:

    >>> paths = ['a/b/c/', 'a/b/c', '\\a\\b\\c', '\\a\\b\\c\\', 'a\\b\\c',
    ...     'a/b/../../a/b/c/', 'a/b/../../a/b/c']
    >>> [path_leaf(path) for path in paths]
    ['c', 'c', 'c', 'c', 'c', 'c', 'c']
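As an alternative that is not part of the original answer, the standard library's pathlib offers PureWindowsPath, which applies the same Windows separator rules on any host and ignores trailing separators when computing .name. A minimal sketch, where path_leaf_pathlib is a hypothetical name chosen for illustration:

```python
from pathlib import PureWindowsPath


def path_leaf_pathlib(path):
    """Return the last path component using Windows separator rules.

    Hypothetical helper: PureWindowsPath never touches the filesystem,
    so it can be used on Linux or macOS as well.
    """
    return PureWindowsPath(path).name


paths = ['a/b/c/', 'a/b/c', '\\a\\b\\c', '\\a\\b\\c\\', 'a\\b\\c',
         'a/b/../../a/b/c/', 'a/b/../../a/b/c']
leaves = [path_leaf_pathlib(p) for p in paths]
print(leaves)
```

For the seven sample paths from the question, every entry in leaves is 'c'.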


(1) There's one caveat: Linux filenames may contain backslashes. So on Linux, r'a/b\c' always refers to the file b\c in the a folder, while on Windows, it always refers to the c file in the b subfolder of the a folder. So when both forward and backward slashes are used in a path, you need to know the associated platform to be able to interpret it correctly. In practice it's usually safe to assume it's a Windows path, since backslashes are seldom used in Linux filenames, but keep this in mind when you code so you don't create accidental security holes.
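The ambiguity in this caveat can be made concrete by interpreting the same string under each platform's rules. The sketch below assumes the caller knows which platform a path came from; leaf_for_platform and its windows flag are hypothetical names, not an existing API.

```python
from pathlib import PurePosixPath, PureWindowsPath


def leaf_for_platform(path, windows):
    """Return the last component, interpreting path for the given platform.

    Hypothetical helper: `windows` says whether the path originated on
    Windows (both separators) or on a POSIX system ("/" only).
    """
    path_class = PureWindowsPath if windows else PurePosixPath
    return path_class(path).name


mixed = r"a/b\c"

# POSIX rules: the backslash is part of the filename.
posix_leaf = leaf_for_platform(mixed, windows=False)

# Windows rules: the backslash is a separator.
windows_leaf = leaf_for_platform(mixed, windows=True)

print(repr(posix_leaf))
print(repr(windows_leaf))
```

Under POSIX rules the leaf is the file name b\c, while under Windows rules it is c, matching the two readings described in the footnote.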

Lauritz V. Thaulow answered Oct 13 '22 22:10