wget: How do I specify both --directory-prefix AND --output-document

Tags: python, wget

When I use either the -P or -O alone with wget, everything works as advertised.

$: wget -P "test" http://www.google.com/intl/en_com/images/srpr/logo3w.png
Saving to: `test/logo3w.png'

$: wget -O "google.png" http://www.google.com/intl/en_com/images/srpr/logo3w.png
2012-01-23 21:47:33 (1.20 MB/s) - `google.png' saved [7007/7007]

However, combining the two causes wget to ignore -P.

$: wget -P "test" -O "google.png" http://www.google.com/intl/en_com/images/srpr/logo3w.png
2012-01-23 21:47:51 (5.87 MB/s) - `google.png' saved [7007/7007]

I've set a variable for both the directory (taken from the last segment of the URL) and the filename (generated by a counting loop), so that http://www.google.com/aaa/bbb/ccc yields file = /directory/filename, or, for item 1, /ccc/000.jpg.

When I substitute this into the code:
Popen(['wget', '-O', file, theImg], stdout=PIPE, stderr=STDOUT)
wget silently fails (on each iteration of the loop).

When I turn on debugging -d and logging -a log.log, each iteration prints
DEBUG output created by Wget 1.13.4 on darwin10.8.0.

When I remove the -O and file, the operation proceeds normally.

My question is: Is there a way to
A) Specify both -P AND -O in wget (preferred) or
B) Pass a string containing / characters to -O without causing it to fail?

Any help would be appreciated.

asked Jan 24 '12 06:01

Josh Whittington

2 Answers

Documentation of wget.download(...) from the Python wget package:

def download(url, out=None, bar=bar_adaptive):
    """High level function, which downloads URL into tmp file in current
    directory and then renames it to filename autodetected from either URL
    or HTTP headers.

    :param bar: function to track download progress (visualize etc.)
    :param out: output filename or directory
    :return:    filename where URL is downloaded to
    """
    ...

Use the following call to download a file to a specific (already existing) directory with a custom filename:

wget.download(url, path_to_output_file)

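urllib.urlretrieve (urllib.request.urlretrieve in Python 3) is a standard-library alternative that takes the same two arguments. It does not create a missing directory either, so create the directory first, for example with os.makedirs:

urllib.urlretrieve(url, path_to_output_file)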
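
If the target directory may not exist yet, one option is to create it explicitly before downloading. The following is a minimal sketch assuming Python 3; the helper name download_to is hypothetical and not part of any library.

```python
import os
from urllib.request import urlretrieve


def download_to(url, output_path):
    """Download url to output_path, creating the parent directory if needed.

    download_to is a hypothetical helper written for illustration.
    """
    parent = os.path.dirname(output_path)
    if parent:
        # exist_ok=True makes this a no-op when the directory already exists.
        os.makedirs(parent, exist_ok=True)
    # urlretrieve returns a (filename, headers) tuple.
    saved_path, _headers = urlretrieve(url, output_path)
    return saved_path
```

A call such as download_to(theImg, os.path.join("ccc", "000.jpg")) would then write into ccc/ whether or not that directory existed beforehand.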

answered Oct 12 '22 11:10

Jaydev

Just pass the whole path, e.g. dir/000.jpg, to wget's -O option; -P is then unnecessary:

import subprocess
import os.path

subprocess.Popen(['wget', '-O', os.path.join(directory, filename), theImg])

It's not completely clear from your question whether you were already doing something similar to this, but if you were and it still failed, I can think of two reasons:

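- The argument to -O contains a leading /, which makes the path absolute. wget then tries to write to /ccc/000.jpg directly under the filesystem root, where the directory most likely doesn't exist and you typically lack permission to create it.
- The directory you're telling wget to write to doesn't exist. You can make sure it exists by creating it first using os.mkdir from the Python standard library (or os.makedirs for nested directories).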

You can also try removing the stdout= and stderr= arguments from the Popen call so that wget's error messages appear directly in the terminal, or read the captured output in Python and print it.
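
The points above can be put together in a rough sketch, assuming Python 3.7+ and wget on the PATH. The helper names output_path_for and fetch_with_wget, the ".jpg" default, and the "download" fallback directory are illustrative assumptions, not something taken from the question's code.

```python
import os
import subprocess
from urllib.parse import urlparse


def output_path_for(url, index, extension=".jpg"):
    """Return a relative path such as 'ccc/000.jpg' for a page URL.

    Hypothetical helper: the directory is the last non-empty path segment
    of the URL, and the filename is a zero-padded counter.
    """
    segments = [s for s in urlparse(url).path.split("/") if s]
    directory = segments[-1] if segments else "download"  # assumed fallback
    # No leading slash, so the path stays relative to the working directory.
    return os.path.join(directory, "{:03d}{}".format(index, extension))


def fetch_with_wget(image_url, output_path):
    """Run `wget -O output_path image_url`; return (returncode, log text)."""
    parent = os.path.dirname(output_path)
    if parent:
        # Create the target directory up front, since it may not exist.
        os.makedirs(parent, exist_ok=True)
    result = subprocess.run(
        ["wget", "-O", output_path, image_url],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # wget writes its log messages to stderr
        text=True,
    )
    return result.returncode, result.stdout
```

In the download loop, printing the returned log whenever the return code is non-zero should show why wget failed instead of hiding the error.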

answered Oct 12 '22 11:10

Rob Wouters
