====== wget - Download a Full Website ======

To download a full website and make it available for local viewing:

<code bash>
wget --mirror -p --convert-links -P ./LOCAL-DIR WEBSITE-URL
</code>

  * <nowiki>--mirror</nowiki> : turn on options suitable for mirroring.
  * -p : download all files that are necessary to properly display a given HTML page.
  * <nowiki>--convert-links</nowiki> : after the download, convert the links in the documents for local viewing.
  * -P ./LOCAL-DIR : save all the files and directories to the specified directory.
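
For example, a complete invocation might look like the following; the URL and target directory are placeholders, so substitute your own:

<code bash>
# Mirror the site into ./example-mirror for offline browsing
wget --mirror -p --convert-links -P ./example-mirror http://example.com
</code>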

===== Download files recursively with wget =====

With wget you can download files directly from the shell. If you want to download a whole site, known as recursive downloading, set the -r option.

<code bash>
wget -r http://somesite.com
</code>
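
Recursive retrieval can fetch far more than intended, so it is often worth capping how deep wget follows links with the -l option; the site URL below is again a placeholder:

<code bash>
# Follow links at most two levels deep from the start page
wget -r -l 2 http://somesite.com
</code>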

By default wget respects the **robots.txt** file and thus only downloads files that are not excluded by it. The robots exclusion standard is purely advisory: a robots.txt file states which files search engines and other robots are not allowed to access, but a robot may choose to ignore those rules.
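
A robots.txt file might look like the following hypothetical example, which asks all robots to stay out of a /private/ directory:

<code>
User-agent: *
Disallow: /private/
</code>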

Wget can be instructed to ignore these rules and download the excluded files anyway. Set the -e option as shown next.

<code bash>
wget -e robots=off -r http://somesite.com
</code>
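
The same setting can be combined with the mirroring options from the first section; the directory and URL are placeholders as before:

<code bash>
# Mirror a site for offline viewing, ignoring robots.txt
wget -e robots=off --mirror -p --convert-links -P ./LOCAL-DIR WEBSITE-URL
</code>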
  