wiki:ArchivingSites
Last modified on 03/18/14 13:00:33

Some things to think about when archiving an old site:

A lot of time and energy goes into creating content, so please think carefully before deciding to throw any of it away. What appears to have no value to you might well have value to someone else: archives do have value. This is not a new idea; please read Web Pages Must Live Forever and Cool URIs don't change.

Archiving sites well takes some time; if it is rushed, things that should be considered might be missed.

Static HTML is the best form for archives as it doesn't require maintenance. However, if mass edits need to be made, they need to be done on the dynamic site before it is archived.

HTTrack is a great tool for archiving sites. It is packaged in Debian and can be run on the command line, in screen, on the server where the archive is to live.
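
As a rough illustration of a command-line run, the sketch below assembles a basic HTTrack invocation in Python. The site URL, output directory and filter pattern are placeholders rather than values from any real archive, and the helper name build_httrack_command is made up for this example. Only the -O (output path) option and a "+pattern" filter are used; check the HTTrack manual before adding anything else.

```python
import shlex


def build_httrack_command(site_url, output_dir, host_filter):
    """Return the argument list for a basic HTTrack mirror run.

    All three values are supplied by the caller; nothing here is
    specific to a particular site.
    """
    return [
        "httrack",
        site_url,
        "-O", output_dir,  # -O sets the directory the mirror is written to
        host_filter,       # a "+pattern" filter keeps matching URLs in the crawl
    ]


if __name__ == "__main__":
    # Placeholder values, to be replaced with the real site and archive path.
    cmd = build_httrack_command(
        "https://old.example.org/",
        "/var/www/old.example.org",
        "+old.example.org/*",
    )
    # Print a shell-quoted command line that can be pasted into a screen session.
    print(shlex.join(cmd))
```

Printing the command rather than running it directly makes it easy to review and then start by hand inside screen, so the crawl keeps running if the SSH connection drops.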

Forms, e.g. contact forms and search forms, won't work on the static archive. They are best removed before the archiving is done, or they can be hidden with CSS afterwards; see ticket:698#comment:7.
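
One possible way to hide forms after the fact is to append a single CSS rule to the archived stylesheet. The sketch below is an assumption about how that could be done, not necessarily the approach described in the ticket. The rule hides every form element, which may be too broad for some sites, and the function name add_hide_forms_rule is hypothetical.

```python
# Assumed rule: hides every <form> element in the archived pages.
HIDE_FORMS_RULE = "form { display: none; }"


def add_hide_forms_rule(css_text):
    """Return css_text with the hide-forms rule appended once."""
    if HIDE_FORMS_RULE in css_text:
        # Already present, so leave the stylesheet unchanged.
        return css_text
    if css_text and not css_text.endswith("\n"):
        css_text += "\n"
    comment = "/* Forms do not work in the static archive */\n"
    return css_text + comment + HIDE_FORMS_RULE + "\n"
```

The function works on the stylesheet text rather than on a file, so the result can be inspected before it is written back to the archive.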

Error pages: some URLs will change when the site is archived, so it is best to set up custom error pages to catch these.

Web bugs: it is best to remove any GA (Google Analytics) or other such bugs before archiving. In addition, if the archive is to have Piwik stats, add the Piwik code before creating the archive.
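
To check that no tracking code slipped into the finished archive, something like the following sketch could be run over the static files. The host names in the pattern are an assumption about what the tracking snippets reference and should be adjusted to whatever the site actually used. The function names and the archive path in the usage comment are placeholders.

```python
import re
from pathlib import Path

# Assumed tracker hosts; extend this for any other web bugs the site carried.
TRACKER_PATTERN = re.compile(
    r"google-analytics\.com|googletagmanager\.com", re.IGNORECASE
)


def find_tracker_lines(html_text):
    """Return (line_number, stripped_line) pairs that mention a tracker host."""
    return [
        (number, line.strip())
        for number, line in enumerate(html_text.splitlines(), start=1)
        if TRACKER_PATTERN.search(line)
    ]


def scan_archive(root):
    """Map each .html file under root to its tracker hits, skipping clean files."""
    hits = {}
    for path in sorted(Path(root).rglob("*.html")):
        text = path.read_text(encoding="utf-8", errors="replace")
        found = find_tracker_lines(text)
        if found:
            hits[str(path)] = found
    return hits


# Example use (placeholder path):
# for name, lines in scan_archive("/var/www/old.example.org").items():
#     print(name, lines)
```

An empty result only means that none of the listed hosts were found; it does not prove the archive is free of every kind of web bug.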