I'm downloading files with wget and I want to check whether a file already exists before I download it. I know the CLI version has an option for this (see example).
# check if file exists
# if not, download
wget.download(url, path)
wget downloads the file without my having to name it. This matters because I don't want to rename files that already have a name.
If there is an alternative download method that can check for existing files, please tell me! Thanks!
wget.download() doesn't have any such option. The following workaround, which calls the wget command-line tool instead, should do the trick for you:
import subprocess
url = "https://url/to/index.html"
path = "/path/to/save/your/files"
subprocess.run(["wget", "-r", "-nc", "-P", path, url])
If the file is already there, you will get the following message:
File ‘index.html’ already there; not retrieving.
EDIT: If you are running this on Windows, you'd also have to include shell=True:
subprocess.run(["wget", "-r", "-nc", "-P", path, url], shell=True)
As far as I can see, the Python module doesn't have that option.
You could try to guess the filename that will be used (typically it will be the part of the url after the last slash character).
Or you could download the file to a new temporary directory and then check if that filename exists in your main directory.
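The filename-guessing idea could be sketched roughly like this. It assumes the saved name really is the part of the URL path after the last slash, which won't hold for every server (redirects or a Content-Disposition header can change it). The names guess_filename and download_if_missing, and the downloader parameter, are made up for illustration and are not part of the wget module.

```python
import os
import posixpath
from urllib.parse import urlsplit, unquote


def guess_filename(url):
    """Guess the local filename: the part of the URL path after the last slash."""
    return unquote(posixpath.basename(urlsplit(url).path))


def download_if_missing(url, directory, downloader=None):
    """Download url into directory unless the guessed filename is already there.

    Returns (target_path, downloaded), where downloaded is False if the
    file already existed and the download was skipped.
    """
    name = guess_filename(url)
    if not name:
        # e.g. "https://example.com/" has nothing after the last slash
        raise ValueError("cannot guess a filename from %r" % url)

    target = os.path.join(directory, name)
    if os.path.exists(target):
        return target, False

    if downloader is None:
        # Imported lazily so the helper can be tried without the module installed.
        import wget
        downloader = wget.download

    # Called as downloader(url, out); passing the full path keeps the guessed name.
    downloader(url, target)
    return target, True
```

Calling download_if_missing(url, path) with the url and path from the question should then skip any file that is already in path. The downloader argument is only there so a different download function can be swapped in.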
From the source code, the wget.download() function doesn't seem to accept additional parameters such as -nc or -N for skipping downloads if the file already exists. Only the CLI version seems to support this.
The function:
def download(url, out=None, bar=bar_adaptive):
...
Its only parameters are the URL, the output location (out), and the progress bar (bar).