How to resume wget mirroring a website?

I use wget to download an entire website.
I used the following command (on Windows 7):

wget ^
 --recursive ^
 -A "*thread*, *label*" ^
 --no-clobber ^
 --page-requisites ^
 --html-extension ^
 --domains example.com ^
 --random-wait ^
 --no-parent ^
 --background ^
 --header="Accept: text/html" --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0) Gecko/20100101 Firefox/21.0" ^
     http://example.com/

After 2 days, my little brother restarted the PC,
so I tried to resume the stopped process.
I added the following option to the command:

--continue ^

so the command looks like this:

wget ^
     --recursive ^
     -A "*thread*, *label*" ^
     --no-clobber ^
     --page-requisites ^
     --html-extension ^
     --domains example.com ^
     --random-wait ^
     --no-parent ^
     --background ^
     --continue ^
     --header="Accept: text/html" --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0) Gecko/20100101 Firefox/21.0" ^
         http://example.com/

Unfortunately, it started a new job: it downloads the same files again and writes a new log file named

wget-log.1

Is there any way to resume mirroring a site with wget, or do I have to start the whole thing over again?

Asked May 04 '15 by Abdalla Mohamed Aly Ibrahim

People also ask

How do I continue wget?

Resuming a wget download is straightforward: open a terminal in the directory you were downloading the file to and run wget with the -c flag to resume the download.
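
For example, to resume a partially downloaded file (the URL here is only illustrative):

wget -c http://example.com/big-file.iso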

How do I mirror a website using wget?

Use the following command:

wget --mirror --convert-links --adjust-extension --page-requisites --no-parent https://site-to-download.com

Replace the https://site-to-download.com portion with the URL of the site you want to mirror. You are done!

What is recursive downloading?

This means that Wget first downloads the requested document, then the documents linked from that document, then the documents linked by them, and so on. In other words, Wget first downloads the documents at depth 1, then those at depth 2, and so on until the specified maximum depth.
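
As a quick illustration (the URL and depth are placeholders), the maximum recursion depth can be capped with the -l option; wget's default maximum depth is 5:

wget -r -l 2 http://example.com/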


1 Answer

Try the -nc option. It checks everything once again, but doesn't download files that are already there.

I'm using this command to download one website: wget -r -t1 domain.com -o log

I stopped the process and wanted to resume it, so I changed the command to: wget -nc -r -t1 domain.com -o log

In the log there are entries like this: File .... already there; not retrieving. and so on.

I checked the logs before this, and it seems that after maybe 5 minutes of this kind of checking it began downloading new files.

I'm using this manual for wget: http://www.linux.net.pl/~wkotwica/doc/wget/wget_8.html
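
Applied to the command from the question, a re-run could look like the sketch below. Note that --no-clobber (the long form of -nc) was already in the original command, so the idea is simply to drop --continue and let wget skip the files that are already on disk:

wget ^
     --recursive ^
     -A "*thread*,*label*" ^
     --no-clobber ^
     --page-requisites ^
     --html-extension ^
     --domains example.com ^
     --random-wait ^
     --no-parent ^
     --background ^
     --header="Accept: text/html" --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0) Gecko/20100101 Firefox/21.0" ^
         http://example.com/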

Answered Oct 12 '22 by jack daniels