
How to convert path to Mac OS X path, the almost-NFD normal form?

Macs normally use the HFS+ file system, which normalizes paths. For example, if you save a file with an accented é (u'\xe9') in its name and then call os.listdir, you will see that the filename was converted to u'e\u0301'. This is standard Unicode NFD normalization, which the Python unicodedata module can handle. Unfortunately, HFS+ is not fully consistent with NFD, so some characters are left undecomposed. For example, 福 (u'\ufa1b') is not changed, although its NFD form is u'\u798f'.

So how can I perform this HFS+ normalization in Python? I would be fine with using native APIs as long as I can call them from Python.

Heikki Toivonen asked Aug 08 '13 22:08



1 Answer

Well, I decided to write out the Python solution, since the related question I pointed to was more about Objective-C.

First you need to install https://pypi.python.org/pypi/pyobjc-core and https://pypi.python.org/pypi/pyobjc-framework-Cocoa. Then the following should work:

import sys

from Foundation import NSString, NSAutoreleasePool

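def fs_normalize(path):
    # Pool for any autoreleased Objective-C objects created by the call below.
    _pool = NSAutoreleasePool.alloc().init()
    # Ask Foundation for the path in the form the file system itself uses (a str).
    normalized_path = NSString.fileSystemRepresentation(path)
    # Decode back to unicode; fall back to UTF-8 if no encoding is reported.
    upath = unicode(normalized_path, sys.getfilesystemencoding() or 'utf8')
    return upath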

if __name__ == '__main__':
    e = u'\xe9'              # precomposed é
    j = u'\ufa1b'            # 福, which HFS+ leaves undecomposed
    e_expected = u'e\u0301'  # e followed by a combining acute accent

    assert fs_normalize(e) == e_expected
    assert fs_normalize(j) == j

Note that NSString.fileSystemRepresentation() also seems to accept str input. I had some cases where it returned garbage for str input, so I think it is safer to always pass it unicode. It always returns a str, so you need to convert the result back to unicode.
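If PyObjC is not available, the behavior can be roughly approximated in pure Python. The sketch below is not what the snippet above does, and it is only an approximation. It assumes the exclusion ranges that Apple's HFS+ documentation describes (U+2000-U+2FFF, U+F900-U+FAFF and U+2F800-U+2FAFF are left undecomposed), and it uses the Unicode tables that ship with your Python, which are newer than the ones HFS+ is based on. The name hfs_normalize_approx is made up for this example, and the code is written for Python 3, where str is already unicode.

```python
import unicodedata

# Assumption: code point ranges that HFS+ does not decompose,
# as described in Apple's HFS+ documentation.
_EXCLUDED_RANGES = ((0x2000, 0x2FFF), (0xF900, 0xFAFF), (0x2F800, 0x2FAFF))


def _is_excluded(ch):
    """Return True if ch falls in a range that HFS+ leaves untouched."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in _EXCLUDED_RANGES)


def hfs_normalize_approx(path):
    """Apply NFD to path, except for characters in the excluded ranges.

    Hypothetical helper: an approximation of HFS+ normalization, not
    the file system's actual algorithm.
    """
    out = []
    run = []  # consecutive characters that should be NFD-normalized together
    for ch in path:
        if _is_excluded(ch):
            if run:
                out.append(unicodedata.normalize('NFD', ''.join(run)))
                run = []
            out.append(ch)  # copy the excluded character unchanged
        else:
            run.append(ch)
    if run:
        out.append(unicodedata.normalize('NFD', ''.join(run)))
    return ''.join(out)


if __name__ == '__main__':
    # Same checks as the PyObjC version above.
    assert hfs_normalize_approx('\xe9') == 'e\u0301'
    assert hfs_normalize_approx('\ufa1b') == '\ufa1b'
```

Characters outside the excluded ranges are normalized in runs so that combining marks are still reordered correctly within each run. For anything that has to match the file system exactly, calling the native API is the safer choice.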

Heikki Toivonen answered Sep 28 '22 12:09