
os.path.basename() is inconsistent and I'm not sure why

While creating a program that backs up my files, I found that os.path.basename() was not working consistently. For example:

import os

folder = '\\\\server\\studies\\backup\\backup_files'
os.path.basename(folder)

returns 'backup_files'

folder = '\\\\server\\studies'
os.path.basename(folder)

returns ''

I want the second basename call to return 'studies', but it returns an empty string. I ran os.path.split(folder) to see how it splits the string, and it turns out it treats the entire path as the directory, i.e. ('\\\\server\\studies', '').

I can't figure out how to get around it. The weirdest thing is that I ran the same line earlier and it worked, but it won't anymore! Does it have something to do with the very first part being a shared folder on a network drive?

Manar asked Mar 06 '19 06:03


1 Answer

That looks like a peculiarity of Windows UNC paths.

UNC paths can be seen as the equivalent of Unix paths, only with a double backslash at the start.

A workaround would be to use a plain rsplit (os.sep is the backslash on Windows):

>>> r"\\server\studies".rsplit(os.sep,1)[-1]
'studies'
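If you want something a bit more defensive than the one-liner, here is a minimal sketch. The name last_component is made up for illustration, and I'm using ntpath directly (rather than os.path) so that it behaves the same way on any OS. It only falls back to the manual split when basename comes back empty:

```python
import ntpath  # the Windows flavour of os.path, importable on any platform


def last_component(path):
    """Return the last component of a Windows path, even for a bare UNC share.

    Hypothetical helper, not part of the standard library.
    """
    # Drop trailing separators so "\\server\studies\" acts like "\\server\studies".
    stripped = path.rstrip("\\/")
    name = ntpath.basename(stripped)
    if name:
        return name
    # basename() is empty when the whole path is a drive or UNC share,
    # so fall back to splitting on the last backslash by hand.
    return stripped.replace("/", "\\").rsplit("\\", 1)[-1]
```

For the two paths in the question this should give 'backup_files' and 'studies' respectively. Keep in mind that for a bare share the result is the share name, not a directory name in the usual sense.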

Fun fact: with 3 path components it works properly:

>>> os.path.basename(r"\\a\b\c")
'c'

Now why is that? Let's check the source code of ntpath on Windows:

def basename(p):
    """Returns the final component of a pathname"""
    return split(p)[1]

OK, now split (excerpt):

def split(p):
    seps = _get_bothseps(p)
    d, p = splitdrive(p)

Now splitdrive (excerpt):

def splitdrive(p):
    """Split a pathname into drive/UNC sharepoint and relative path specifiers.
    Returns a 2-tuple (drive_or_unc, path); either part may be empty.

Reading the docstring alone is enough to understand what's going on.

A Windows sharepoint has to contain 2 path parts:

\\server\shareroot

So \\server\studies is seen as the drive, and the path is empty. This doesn't happen when there are 3 parts in the path, because only the first two form the drive.

Note that it's not a bug, since it's not possible to use \\server like a normal directory (create directories below it, etc.).
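You can check this directly by calling splitdrive yourself. A small sketch (the variable names are mine; ntpath is importable on any platform, so this doesn't need a Windows machine):

```python
import ntpath  # what os.path resolves to on Windows

# Two components after the leading backslashes: everything is the "drive".
share_only = ntpath.splitdrive(r"\\server\studies")
print(share_only)    # ('\\\\server\\studies', '')

# Three components: the first two are the drive, the rest is the path.
with_subdir = ntpath.splitdrive(r"\\a\b\c")
print(with_subdir)   # ('\\\\a\\b', '\\c')

# basename() only looks at the path part, hence the empty string.
print(repr(ntpath.basename(r"\\server\studies")))   # ''
print(repr(ntpath.basename(r"\\a\b\c")))            # 'c'
```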

Note that the official documentation for os.path.basename doesn't mention this (on Windows, os.path is ntpath behind the scenes), but it does state:

Return the base name of pathname path. This is the second element of the pair returned by passing path to the function split(). Note that the result of this function is different from the Unix basename program

That last sentence at least is true! (And the documentation for os.path.split() doesn't mention this issue, or even talk about Windows.)

Jean-François Fabre answered Sep 28 '22 12:09