Ubuntu - wget - Download a Full Website

To download a full website and make it available for local viewing, run:

wget --mirror -p --convert-links -P ./LOCAL-DIR WEBSITE-URL
  • --mirror : turn on options suitable for mirroring.
  • -p : download all files that are necessary to properly display a given HTML page.
  • --convert-links : after the download, convert the links in the documents so they work for local viewing.
  • -P ./LOCAL-DIR : save all the files and directories to the specified directory.
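As a concrete sketch, the command above can be assembled with placeholder values (example.com and ./example-mirror are assumptions, not from the original). --page-requisites is the long form of -p; --adjust-extension (-E) is an optional extra that renames downloaded pages to end in .html for local browsing:

```shell
# Sketch: assemble the mirror command with a placeholder URL and output dir,
# then print it for review before actually running it.
url="https://example.com/"
outdir="./example-mirror"
cmd="wget --mirror --page-requisites --convert-links --adjust-extension -P $outdir $url"
echo "$cmd"
```

Printing the command first makes it easy to audit the flags; when it looks right, run it directly (or `eval "$cmd"`).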

Download files recursively with wget

With wget you can download files directly from a shell. If you want to download a whole site, known as downloading recursively, use the -r option.

wget -r http://somesite.com

By default, wget respects the robots.txt file and therefore only downloads the files it is allowed to fetch. The robots exclusion standard is purely advisory: robots.txt lists files that search engines and other robots are asked not to access, but a robot may ignore those rules.
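For reference, a minimal robots.txt expresses such rules with User-agent and Disallow lines (the paths here are illustrative, not from the original):

```
User-agent: *
Disallow: /private/
Disallow: /admin/
```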

wget can be told to ignore those rules, so that it downloads the excluded files anyway. Set the -e option as shown next.

wget -e robots=off -r http://somesite.com
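The -e flag executes a .wgetrc-style command before the download starts. If you always want this behavior (a judgment call; it disregards the site's wishes), the same setting can be made persistent by adding the line to ~/.wgetrc:

```
robots = off
```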
ubuntu/wget/download_a_full_website.txt · Last modified: 2020/07/15 09:30 by 127.0.0.1

Donate Powered by PHP Valid HTML5 Valid CSS Driven by DokuWiki