Every so often I remember some interesting or unique website I saw a while ago and try to visit it, only to find that it has gone offline. If that has ever happened to you, then you need to know how to mirror sites for offline viewing.
Mirroring a site just means downloading all of its files so you can view them locally. This is easy to do for small sites that are light on media (which takes up a lot of space) and don't rely on lots of JavaScript.
To mirror a site, first install wget, a universal download utility from GNU. If it isn't in your repos, you should get another OS. Now use the command:
wget --mirror --convert-links --page-requisites --no-parent [URL]
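The --mirror option turns on recursion with unlimited depth plus time-stamping, so running the command again later only fetches files that have changed. The whole site should end up in a directory named after the host in the URL.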
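As a small sketch of where the files end up, the helper below guesses the directory name from the URL using plain shell parameter expansion. The name `mirror_dir` is hypothetical (it is not part of wget), `example.com` is a placeholder, and the sketch assumes a simple URL with no port or login in it.

```shell
#!/bin/sh
# mirror_dir: hypothetical helper, prints the directory wget is expected
# to create for a simple URL (no port, no user:password part).
mirror_dir() {
    url=$1
    rest=${url#*://}              # drop the scheme, e.g. "https://"
    printf '%s\n' "${rest%%/*}"   # drop everything from the first slash on
}

mirror_dir "https://example.com/docs/"   # prints example.com
```

Once a mirror has finished, a directory with that name in the folder where you ran wget is the place to look for the downloaded pages.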
--convert-links
This rewrites the hyperlinks in the downloaded pages so they point to the local copies instead of the web. Links to anything that was not downloaded are left pointing at the original site.
--page-requisites
This tells wget to also pull everything a page needs to display properly, such as images, stylesheets and JavaScript files.
--no-parent
This prevents wget from climbing above the directory in the URL you gave it, so it only downloads that directory and everything below it. That makes it handy if you only want to download part of a site.
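Putting the flags together, here is a rough sketch of a wrapper function. The name `mirror_site` and the `DRY_RUN` variable are my own inventions for illustration, and the URL is a placeholder. With `DRY_RUN=1` it only prints the command, so you can check it before downloading anything.

```shell
#!/bin/sh
# mirror_site: hypothetical wrapper around the wget command from this post.
# Set DRY_RUN=1 to print the command instead of running it.
mirror_site() {
    if [ -z "$1" ]; then
        echo "usage: mirror_site URL" >&2
        return 1
    fi
    # Rebuild the argument list as the full wget command line.
    set -- wget --mirror --convert-links --page-requisites --no-parent "$1"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Mirror only the /docs/ section; --no-parent keeps wget from climbing
# up to the rest of the site.
DRY_RUN=1
mirror_site "https://example.com/docs/"
```

Unset `DRY_RUN` (or set it to 0) to actually run the download.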
Now you should be able to mirror and archive all the small sites you find online.