
cross-platform splitting of path in python

Tags:

python

I'd like something that has the same effect as this:

>>> path = "/foo/bar/baz/file"
>>> path_split = path.rsplit('/')[1:]
>>> path_split
['foo', 'bar', 'baz', 'file']

But that will work with Windows paths too. I know that there is an os.path.split() but that doesn't do what I want, and I didn't see anything that does.

Aaron Yodaiken asked Jan 02 '11 18:01




2 Answers

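Python 3.4 introduced a new module, pathlib. pathlib.Path provides file-system-related methods, while pathlib.PurePath operates completely independently of the file system. The output below was produced on Windows, where PurePath creates a PureWindowsPath; on a POSIX system the first component would be '/' instead of '\\':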

>>> from pathlib import PurePath
>>> path = "/foo/bar/baz/file"
>>> path_split = PurePath(path).parts
>>> path_split
('\\', 'foo', 'bar', 'baz', 'file')

You can use PurePosixPath and PureWindowsPath explicitly when desired:

>>> from pathlib import PureWindowsPath, PurePosixPath
>>> PureWindowsPath(path).parts
('\\', 'foo', 'bar', 'baz', 'file')
>>> PurePosixPath(path).parts
('/', 'foo', 'bar', 'baz', 'file')

And of course, it works with Windows paths as well:

>>> wpath = r"C:\foo\bar\baz\file"
>>> PurePath(wpath).parts
('C:\\', 'foo', 'bar', 'baz', 'file')
>>> PureWindowsPath(wpath).parts
('C:\\', 'foo', 'bar', 'baz', 'file')
>>> PurePosixPath(wpath).parts
('C:\\foo\\bar\\baz\\file',)
>>>
>>> wpath = r"C:\foo/bar/baz/file"
>>> PurePath(wpath).parts
('C:\\', 'foo', 'bar', 'baz', 'file')
>>> PureWindowsPath(wpath).parts
('C:\\', 'foo', 'bar', 'baz', 'file')
>>> PurePosixPath(wpath).parts
('C:\\foo', 'bar', 'baz', 'file')
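To get exactly the list asked for in the question (components only, without the leading root or drive), one possible sketch is below. The helper name split_components is hypothetical, and it assumes you are happy to parse every input with Windows rules, so that both '/' and '\' count as separators. On POSIX a backslash is a legal filename character, so this is a convenience and not a faithful parse of POSIX paths.

```python
from pathlib import PureWindowsPath


def split_components(path):
    """Return the components of path as a list, without drive or root.

    Hypothetical helper: parses with Windows rules on every platform,
    so both '/' and '\\' are treated as separators.
    """
    p = PureWindowsPath(path)
    parts = p.parts
    # A drive and/or root, when present, is collapsed into parts[0]
    # (for example 'C:\\' or '\\'); p.anchor is empty otherwise.
    if p.anchor:
        parts = parts[1:]
    return list(parts)


print(split_components("/foo/bar/baz/file"))     # ['foo', 'bar', 'baz', 'file']
print(split_components(r"C:\foo/bar\baz\file"))  # ['foo', 'bar', 'baz', 'file']
```

Checking p.anchor instead of always dropping the first element keeps relative paths such as 'foo/bar' intact.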

Huzzah for Python devs constantly improving the language!

John Crawford answered Sep 21 '22 21:09


The OP specified "will work with Windows paths too". There are a few wrinkles with Windows paths.

Firstly, Windows has the concept of multiple drives, each with its own current working directory, so 'c:foo' (relative to the current directory on drive C) and 'c:\\foo' (relative to the root of drive C) are often not the same. Consequently, it is a very good idea to separate out any drive designator first, using os.path.splitdrive(). The path can then be reassembled correctly (if required) with drive + os.path.join(*other_pieces).
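As a rough illustration of the drive handling described above, the sketch below uses ntpath (the Windows implementation behind os.path, importable on any platform) so that it behaves the same everywhere. The helper name reassemble is hypothetical and only mirrors the drive + os.path.join(*other_pieces) expression.

```python
import ntpath  # Windows flavour of os.path, importable on any platform


def reassemble(drive, pieces):
    """Hypothetical helper: rebuild a path from a drive and its pieces."""
    if not pieces:
        return drive
    return drive + ntpath.join(*pieces)


# The drive designator is split off; the remainder keeps (or lacks) its root.
relative = ntpath.splitdrive("c:foo")
rooted = ntpath.splitdrive("c:\\foo")
print(relative)  # ('c:', 'foo')
print(rooted)    # ('c:', '\\foo')

# Whether the root piece is present decides which path comes back.
print(reassemble("c:", ["foo", "bar"]))        # c:foo\bar
print(reassemble("c:", ["\\", "foo", "bar"]))  # c:\foo\bar
```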

Secondly, Windows paths can contain slashes or backslashes or a mixture. Consequently, using os.sep when parsing an unnormalised path is not useful.
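A small sketch of why splitting on the separator alone falls short, again using ntpath so it runs on any platform. Note that normpath() does more than unify separators: it also collapses '.' and '..' and drops trailing separators, so treat it as an illustration and not as a drop-in fix.

```python
import ntpath  # Windows flavour of os.path, importable on any platform

mixed = "foo/bar\\baz"

# Splitting on the primary separator misses the forward slash.
naive = mixed.split(ntpath.sep)
print(naive)  # ['foo/bar', 'baz']

# normpath() rewrites '/' as '\\' under Windows rules.
normalised = ntpath.normpath(mixed)
print(normalised)  # foo\bar\baz

print(normalised.split(ntpath.sep))  # ['foo', 'bar', 'baz']
```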

More generally:

The results produced for 'foo' and 'foo/' should not be identical.

The loop termination condition seems to be best expressed as "os.path.split() treated its input as unsplittable".
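Both points can be seen directly in what split() returns. The sketch below uses posixpath so the results do not depend on the platform; the helper name is_unsplittable is hypothetical.

```python
import posixpath  # POSIX flavour of os.path, importable on any platform


def is_unsplittable(path):
    """Hypothetical helper: True when split() hands its input back as the head."""
    head, _tail = posixpath.split(path)
    return head == path


# 'foo' and 'foo/' split differently: the trailing slash yields an empty tail.
print(posixpath.split("foo"))   # ('', 'foo')
print(posixpath.split("foo/"))  # ('foo', '')

# The root and the empty string are fixed points, so a loop can stop there.
print(is_unsplittable("/"))     # True
print(is_unsplittable(""))      # True
print(is_unsplittable("foo"))   # False
```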

Here's a suggested solution, with tests, including a comparison with @Spacedman's solution:

import os.path

# Each print() call takes a single pre-formatted string, so the script
# produces the same output on Python 2 and Python 3.

def os_path_split_asunder(path, debug=False):
    # Peel components off the right-hand end until os.path.split()
    # returns its input unchanged, i.e. treats it as unsplittable.
    parts = []
    while True:
        newpath, tail = os.path.split(path)
        if debug:
            print("%r %r" % (path, (newpath, tail)))
        if newpath == path:
            assert not tail
            if path:
                parts.append(path)  # keep a leading root such as '/'
            break
        parts.append(tail)
        path = newpath
    parts.reverse()
    return parts

def spacedman_parts(path):
    # @Spacedman's approach, for comparison: stops at the first empty tail.
    components = []
    while True:
        (path, tail) = os.path.split(path)
        if not tail:
            return components
        components.insert(0, tail)

if __name__ == "__main__":
    tests = [
        '',
        'foo',
        'foo/',
        'foo\\',
        '/foo',
        '\\foo',
        'foo/bar',
        '/',
        'c:',
        'c:/',
        'c:foo',
        'c:/foo',
        'c:/users/john/foo.txt',
        '/users/john/foo.txt',
        'foo/bar/baz/loop',
        'foo/bar/baz/',
        '//hostname/foo/bar.txt',
        ]
    for i, test in enumerate(tests):
        print("\nTest %d: %r" % (i, test))
        # Separate the drive designator before splitting the rest.
        drive, path = os.path.splitdrive(test)
        print("drive, path %r %r" % (drive, path))
        a = os_path_split_asunder(path)
        b = spacedman_parts(path)
        print("a ... %r" % a)
        print("b ... %r" % b)
        print(a == b)

and here's the output (Python 2.7.1, Windows 7 Pro):

Test 0: ''
drive, path '' ''
a ... []
b ... []
True

Test 1: 'foo'
drive, path '' 'foo'
a ... ['foo']
b ... ['foo']
True

Test 2: 'foo/'
drive, path '' 'foo/'
a ... ['foo', '']
b ... []
False

Test 3: 'foo\\'
drive, path '' 'foo\\'
a ... ['foo', '']
b ... []
False

Test 4: '/foo'
drive, path '' '/foo'
a ... ['/', 'foo']
b ... ['foo']
False

Test 5: '\\foo'
drive, path '' '\\foo'
a ... ['\\', 'foo']
b ... ['foo']
False

Test 6: 'foo/bar'
drive, path '' 'foo/bar'
a ... ['foo', 'bar']
b ... ['foo', 'bar']
True

Test 7: '/'
drive, path '' '/'
a ... ['/']
b ... []
False

Test 8: 'c:'
drive, path 'c:' ''
a ... []
b ... []
True

Test 9: 'c:/'
drive, path 'c:' '/'
a ... ['/']
b ... []
False

Test 10: 'c:foo'
drive, path 'c:' 'foo'
a ... ['foo']
b ... ['foo']
True

Test 11: 'c:/foo'
drive, path 'c:' '/foo'
a ... ['/', 'foo']
b ... ['foo']
False

Test 12: 'c:/users/john/foo.txt'
drive, path 'c:' '/users/john/foo.txt'
a ... ['/', 'users', 'john', 'foo.txt']
b ... ['users', 'john', 'foo.txt']
False

Test 13: '/users/john/foo.txt'
drive, path '' '/users/john/foo.txt'
a ... ['/', 'users', 'john', 'foo.txt']
b ... ['users', 'john', 'foo.txt']
False

Test 14: 'foo/bar/baz/loop'
drive, path '' 'foo/bar/baz/loop'
a ... ['foo', 'bar', 'baz', 'loop']
b ... ['foo', 'bar', 'baz', 'loop']
True

Test 15: 'foo/bar/baz/'
drive, path '' 'foo/bar/baz/'
a ... ['foo', 'bar', 'baz', '']
b ... []
False

Test 16: '//hostname/foo/bar.txt'
drive, path '' '//hostname/foo/bar.txt'
a ... ['//', 'hostname', 'foo', 'bar.txt']
b ... ['hostname', 'foo', 'bar.txt']
False

John Machin answered Sep 22 '22 21:09