The Best Tools for Saving Web Pages, Forever

Web pages change and may even disappear with time. If you would like to preserve a web page forever, you can either download the page to your computer (and put it on Dropbox) or use a web archiving service that will safely store a copy of the page on its own servers, permanently.

There are quite a few ways to save web pages permanently, and your choice of tool will depend on the kind of web content that you are trying to archive.


Archive Web Pages, Permanently

If you are mainly interested in saving text-only content, like news articles, Pocket and Instapaper are the recommended choices. You can save pages via email, browser extensions, bookmarklets or through apps. These services extract the text content from a public web page and make it available on all your devices. However, there’s no option to download the saved articles; you can only read them on the Pocket website or in its mobile apps.
Evernote and OneNote are impressive tools for archiving web content in your own private notebooks. They provide web clippers (or extensions) that make it easy to save complete web pages, from tutorials to recipes to your online transaction receipts, with a click. The clipped web pages can be accessed from any device, the original layout is (mostly) retained and everything is searchable; these services can even perform OCR to find text inside photographs. Evernote also lets you export the saved pages as HTML files that you can upload elsewhere.
If you prefer something quick and simple that works everywhere and doesn’t require extensions, consider saving web pages as PDF files. Google Chrome has a built-in PDF writer, or you can use Google Cloud Print. It adds a new “Save to Google Drive” virtual printer, and the next time you print a page from your desktop or mobile through Cloud Print, it will save a PDF copy of the page directly in your Drive. This is, however, not the best choice for saving pages with complex formatting.
When the layout is important, your best bet is a screen capture tool. You’re spoilt for choice here, but I’d recommend the official Chrome add-on from Google: it will not only capture full-length screenshots of web pages but also upload the image to your Google Drive in the same step. The add-on can also save web pages in the web archive (MHT) format, which Internet Explorer opens natively; other browsers, Firefox included, may need an add-on to read it.
The Wayback Machine of the Internet Archive is a perfect place for finding previous versions of web pages, but the same tool can also be used to save any web page on demand. Go to archive.org/web and enter the URL of any public web page in the input box. The archiver will download a full copy of the page, including all the images and assets, to its servers. It will make a permanent archive of the page that looks exactly like the original and will stay available even if the original page goes offline.
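If you would rather trigger a snapshot from a script than from the input box, here is a minimal Python sketch. It assumes the Wayback Machine's save address is https://web.archive.org/save/ followed by the page URL; the function names are my own, and the service may rate-limit or refuse automated requests.

```python
import urllib.request

# Assumption: the "Save Page Now" address is this prefix plus the page URL.
SAVE_PREFIX = "https://web.archive.org/save/"


def wayback_save_url(page_url: str) -> str:
    """Build the address that asks the Wayback Machine to archive page_url."""
    page_url = page_url.strip()
    # Add a scheme when the caller passed a bare host/path.
    if not page_url.startswith(("http://", "https://")):
        page_url = "http://" + page_url
    return SAVE_PREFIX + page_url


def request_snapshot(page_url: str, timeout: float = 60.0) -> str:
    """Request a snapshot and return the final URL after any redirects.

    Needs network access; HTTP and connection errors are left to the caller.
    """
    req = urllib.request.Request(
        wayback_save_url(page_url),
        headers={"User-Agent": "page-archiver-sketch/0.1"},  # hypothetical name
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.geturl()
```

Calling wayback_save_url("example.com/article.html") only builds the address, so you can inspect it or paste it into a browser; request_snapshot() is the part that actually contacts the archive.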
The Internet Archive doesn’t offer an option to download saved pages, but Archive.Is can be a good alternative. It is very similar to archive.org in that you enter the page URL and it makes an exact snapshot of the web page on its servers. The page will be stored online forever, but here you also have the option to download the saved page as a ZIP file. It also provides date-based archives, so you can keep multiple snapshots of the same page taken on different days.
All popular web browsers provide an option to download a complete web page to your computer. This saves the HTML page as well as the associated images, CSS and JavaScript, so you can read it offline. You’ll, however, have to put some effort into organizing these archives, as the saved content may not be searchable through your desktop search programs.
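To see what a "complete" save involves, the sketch below lists the images, style sheets and scripts that a page references, using only Python's standard library. It is an illustration and not how any browser implements the feature; the class and function names are hypothetical, and it ignores assets referenced from inside CSS files or loaded later by JavaScript.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin


class AssetCollector(HTMLParser):
    """Collect absolute URLs of images, style sheets and scripts."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.assets: List[str] = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        ref = None
        if tag in ("img", "script"):
            ref = attr.get("src")
        elif tag == "link":
            # Only <link rel="stylesheet"> counts; skip icons, feeds, etc.
            rel = (attr.get("rel") or "").lower().split()
            if "stylesheet" in rel:
                ref = attr.get("href")
        if ref:
            url = urljoin(self.base_url, ref)  # resolve relative paths
            if url not in self.assets:  # keep first occurrence only
                self.assets.append(url)


def list_assets(html: str, base_url: str) -> List[str]:
    """Return the asset URLs referenced by html, in document order."""
    parser = AssetCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.assets
```

Each URL in the returned list is a file the browser has to fetch and store next to the HTML, which is why a saved page usually arrives as one HTML file plus a folder of assets.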
eReader owners can use dotEPUB to download any web page as an ePUB or MOBI ebook, formats that are compatible with most readers. Amazon offers a Kindle add-on to help you save any web page to your Kindle device but, as with Pocket, these tools are primarily for archiving text-based web content.
Most of the tools discussed above let you download a single page at a time, but if you wish to save a set of URLs in bulk, wget may be your savior. We also have a Google Script for downloading web pages to Drive automatically (like a cron job), but it will fetch the HTML content and nothing else.
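If wget is not available, the same bulk job can be roughed out in Python. The sketch below is not the Google Script mentioned above; like that script, though, it saves only the HTML of each URL and none of the images or style sheets. The helper names and the fetch parameter are my own, introduced so the download step can be swapped out or tested without a network connection.

```python
import hashlib
import re
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def filename_for(url: str) -> str:
    """Turn a URL into a file-system-safe name ending in .html."""
    parts = urlparse(url)
    slug = re.sub(r"[^A-Za-z0-9]+", "-", parts.netloc + parts.path).strip("-")
    # A short hash keeps names distinct when two URLs share the same slug.
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()[:8]
    return f"{slug[:80] or 'page'}-{digest}.html"


def fetch_html(url: str) -> bytes:
    """Download the raw HTML of one page (network access required)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def save_pages(urls, out_dir, fetch=fetch_html):
    """Save each URL as an HTML file in out_dir; return (saved, failed)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved, failed = [], []
    for url in urls:
        try:
            path = out / filename_for(url)
            path.write_bytes(fetch(url))
            saved.append(path)
        except Exception as exc:  # keep going if one page fails
            failed.append((url, str(exc)))
    return saved, failed
```

A typical call would be save_pages(list_of_urls, "archive"), which writes one .html file per URL into the archive folder and reports any pages that could not be fetched. With wget itself, the usual approach is to put the URLs in a text file, one per line, and pass that file with the -i option.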
