
Convert a filename to a file:// URL

In WeasyPrint’s public API I accept filenames (among other types) for the HTML inputs. Any filename that works with the built-in open() should work, but I need to convert it to a URL in the file:// scheme that will later be passed to urllib.urlopen().

(Everything is in URL form internally. I need to have a "base URL" for documents in order to resolve relative URL references with urlparse.urljoin().)
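To illustrate the base-URL resolution described above, here is a minimal sketch in Python 3 syntax, where urlparse.urljoin() lives in urllib.parse. The paths and the example.com address are made up for the example.

```python
from urllib.parse import urljoin

# Hypothetical base URL of a document loaded from disk.
base_url = 'file:///home/user/doc/index.html'

# A relative reference resolves against the directory of the base document.
image_url = urljoin(base_url, 'images/logo.png')
print(image_url)  # file:///home/user/doc/images/logo.png

# A reference that is already absolute ignores the base entirely.
remote_url = urljoin(base_url, 'http://example.com/style.css')
print(remote_url)  # http://example.com/style.css
```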

urllib.pathname2url is a start. Its documentation says:

"Convert the pathname path from the local syntax for a path to the form used in the path component of a URL. This does not produce a complete URL. The return value will already be quoted using the quote() function."

The emphasis (on "This does not produce a complete URL") is mine, but I do need a complete URL. So far this seems to work:

    import os
    import urllib

    def path2url(path):
        """Return file:// URL from a filename."""
        path = os.path.abspath(path)
        if isinstance(path, unicode):
            path = path.encode('utf8')
        return 'file:' + urllib.pathname2url(path)

UTF-8 seems to be recommended by RFC 3987 (IRI). But in this case (the URL is meant for urllib, eventually) maybe I should use sys.getfilesystemencoding()?
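To make the encoding question concrete, here is a small sketch (Python 3 syntax, with a made-up filename) that percent-encodes the same name once with UTF-8 and once with the filesystem encoding. Where sys.getfilesystemencoding() reports UTF-8 the two results agree; elsewhere they may differ.

```python
import os
import sys
from urllib.parse import quote

filename = 'r\u00e9sum\u00e9.html'  # made-up example name: "résumé.html"

# Option 1: always UTF-8, as RFC 3987 suggests for IRIs.
utf8_quoted = quote(filename.encode('utf-8'))
print(utf8_quoted)  # r%C3%A9sum%C3%A9.html

# Option 2: whatever encoding the OS uses for filenames.
fs_quoted = quote(os.fsencode(filename))
print(sys.getfilesystemencoding(), fs_quoted)
```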

However, based on the literature I should prepend not just file: but file:// ... except when I should not: On Windows the results from nturl2path.pathname2url() already start with three slashes.

So the question is: is there a better way to do this and make it cross-platform?

Simon Sapin asked Jul 27 '12 12:07



2 Answers

For completeness, in Python 3.4+, you should do:

    import pathlib

    pathlib.Path(absolute_path_string).as_uri()
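Note that as_uri() raises ValueError for a relative path, so a wrapper that accepts any filename usable with open() has to make the path absolute first. A minimal sketch, assuming Python 3.4+; the helper name path2url is borrowed from the question and the sample filename is made up.

```python
import os
import pathlib

def path2url(path):
    """Return a file:// URL for a filename, relative or absolute."""
    # as_uri() only accepts absolute paths, so anchor the path at the
    # current working directory first, as open() would.
    return pathlib.Path(os.path.abspath(path)).as_uri()

url = path2url('my report.html')
print(url)  # a file:// URL whose last segment is my%20report.html
```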
ToBeReplaced answered Sep 21 '22 15:09


I'm not sure the docs are rigorous enough to guarantee it, but I think this works in practice:

    import urlparse, urllib

    def path2url(path):
        return urlparse.urljoin(
            'file:', urllib.pathname2url(path))
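For readers on Python 3, the same idea can be sketched with the relocated functions: pathname2url() moved to urllib.request and urljoin() to urllib.parse. This is an adaptation, not the answer's original code, and the sample paths are made up. The appeal of urljoin() here is that it yields three slashes whether the path component starts with one slash or with three.

```python
import os
from urllib.parse import urljoin
from urllib.request import pathname2url

def path2url(path):
    """Return a file:// URL for a filename (Python 3 adaptation)."""
    return urljoin('file:', pathname2url(os.path.abspath(path)))

# urljoin() normalizes both shapes of path component to file:///...
posix_style = urljoin('file:', '/tmp/a%20b.html')
windows_style = urljoin('file:', '///C:/tmp/a%20b.html')
print(posix_style)    # file:///tmp/a%20b.html
print(windows_style)  # file:///C:/tmp/a%20b.html

url = path2url('a b.html')
print(url)
```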
Dave Abrahams answered Sep 18 '22 15:09