How to extract a filename from a URL & append a word to it?

Tags:

python

django

I have the following url:

url = http://photographs.500px.com/kyle/09-09-201315-47-571378756077.jpg

I would like to extract the file name in this url: 09-09-201315-47-571378756077.jpg

Once I get this file name, I'm going to save it with this name to the Desktop.

    filename = **extracted file name from the url**
    download_photo = urllib.urlretrieve(url, "/home/ubuntu/Desktop/%s.jpg" % (filename))

After this, I'm going to resize the photo. Once that is done, I'm going to save the resized version and append the word "_small" to the end of the filename.

    downloadedphoto = Image.open("/home/ubuntu/Desktop/%s.jpg" % (filename))
    resize_downloadedphoto = downloadedphoto.resize((300, 300), Image.ANTIALIAS)
    resize_downloadedphoto.save("/home/ubuntu/Desktop/%s.jpg" % (filename + "_small"))

From this, what I am trying to achieve is to get two files, the original photo with the original name, then the resized photo with the modified name. Like so:

09-09-201315-47-571378756077.jpg

09-09-201315-47-571378756077_small.jpg

How can I go about doing this?

asked Sep 10 '13 19:09 by deadlock



2 Answers

You can use urllib.parse.urlparse with os.path.basename:

    import os
    from urllib.parse import urlparse

    url = "http://photographs.500px.com/kyle/09-09-201315-47-571378756077.jpg"
    a = urlparse(url)
    print(a.path)                    # Output: /kyle/09-09-201315-47-571378756077.jpg
    print(os.path.basename(a.path))  # Output: 09-09-201315-47-571378756077.jpg
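The second half of the question (the "_small" name) can be handled by splitting the extension off with os.path.splitext. The following is only a minimal sketch and not part of the original answer; `filename_from_url` and `add_suffix` are hypothetical helper names, and the default suffix is taken from the question.

```python
import os
from urllib.parse import urlparse


def filename_from_url(url):
    """Return the last path component of a URL (hypothetical helper)."""
    return os.path.basename(urlparse(url).path)


def add_suffix(filename, suffix="_small"):
    """Insert suffix between the base name and the extension (hypothetical helper)."""
    root, ext = os.path.splitext(filename)
    return root + suffix + ext


url = "http://photographs.500px.com/kyle/09-09-201315-47-571378756077.jpg"
filename = filename_from_url(url)
print(filename)              # 09-09-201315-47-571378756077.jpg
print(add_suffix(filename))  # 09-09-201315-47-571378756077_small.jpg
```

Because the extracted name already ends in ".jpg", the save paths are probably better built with os.path.join("/home/ubuntu/Desktop", filename) than with "%s.jpg" % filename, which would produce a doubled extension.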
answered Nov 01 '22 01:11 by Ofir Israel


os.path.basename(url)

Why try harder?

    In [1]: os.path.basename("https://example.com/file.html")
    Out[1]: 'file.html'

    In [2]: os.path.basename("https://example.com/file")
    Out[2]: 'file'

    In [3]: os.path.basename("https://example.com/")
    Out[3]: ''

    In [4]: os.path.basename("https://example.com")
    Out[4]: 'example.com'
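One case the session above does not cover, shown here as a small illustrative check with a made-up URL: os.path.basename treats the URL as a plain path string, so a query string or fragment stays attached to the result.

```python
import os

# Hypothetical URL with a query string and a fragment.
url = "https://example.com/file.html?size=large#top"
name = os.path.basename(url)
print(name)  # file.html?size=large#top
```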

Note 2020-12-20

Nobody has thus far provided a complete solution.

A URL can contain a ?[query-string] and/or a #[fragment identifier] (but only in that order).

    In [1]: from os import path

    In [2]: def get_filename(url):
       ...:     fragment_removed = url.split("#")[0]  # keep to left of first #
       ...:     query_string_removed = fragment_removed.split("?")[0]
       ...:     scheme_removed = query_string_removed.split("://")[-1].split(":")[-1]
       ...:     if scheme_removed.find("/") == -1:
       ...:         return ""
       ...:     return path.basename(scheme_removed)
       ...:

    In [3]: get_filename("a.com/b")
    Out[3]: 'b'

    In [4]: get_filename("a.com/")
    Out[4]: ''

    In [5]: get_filename("https://a.com/")
    Out[5]: ''

    In [6]: get_filename("https://a.com/b")
    Out[6]: 'b'

    In [7]: get_filename("https://a.com/b?c=d#e")
    Out[7]: 'b'
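For comparison, here is a sketch of the same idea using the standard library's URL parser instead of manual splitting. It is an alternative, not the answer's own method; `get_filename_urlsplit` is a hypothetical name, and decoding percent-escapes with unquote is an added assumption about what the caller wants.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def get_filename_urlsplit(url):
    """Hypothetical alternative to get_filename built on urlsplit."""
    # urlsplit separates the query string and fragment from the path.
    url_path = urlsplit(url).path
    # URL paths always use "/", so posixpath is used regardless of platform.
    return unquote(posixpath.basename(url_path))


print(get_filename_urlsplit("https://a.com/b?c=d#e"))         # b
print(get_filename_urlsplit("https://a.com/"))                # (empty string)
print(get_filename_urlsplit("https://a.com/my%20photo.jpg"))  # my photo.jpg
```

One difference to be aware of: a bare host without a scheme or slash, such as "a.com", is parsed as a path here and returned as-is, whereas get_filename above returns an empty string for it.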
answered Nov 01 '22 02:11 by P i