I tried different ways to download a site and finally found the Wayback Machine Downloader, which was built by Hartator (so all credit goes to him), but I simply did not notice his comment on the question. To save you time, I decided to add the wayback_machine_downloader gem as a separate answer here.
The site at http://www.archiveteam.org/index.php?title=Restoring lists these ways to download from archive.org:
- Wayback Machine Downloader, a small tool in Ruby to download any website from the Wayback Machine. Free and open-source. My choice! As of 2025, the original project hadn't received updates for three years and had stopped working for at least some sites; StrawberryMaster/wayback-machine-downloader is a fork that worked better.
- Warrick - Main site seems down.
- Wayback downloaders - a service that will download your site from the Wayback Machine and even add a plugin for WordPress. Not free.
This can be done using a bash shell script combined with wget.
The idea is to use some of the URL features of the Wayback Machine:
- http://web.archive.org/web/*/http://domain/* will list all saved pages from http://domain/ recursively. It can be used to construct an index of pages to download and avoids heuristics to detect links in web pages. For each link, there is also the date of the first version and the last version.
- http://web.archive.org/web/YYYYMMDDhhmmss*/http://domain/page will list all versions of http://domain/page for the year YYYY. Within that page, specific links to versions (with exact timestamps) can be found.
- http://web.archive.org/web/YYYYMMDDhhmmssid_/http://domain/page will return the unmodified page http://domain/page at the given timestamp. Notice the id_ token.
These are the basics to build a script to download everything from a given domain.
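As a starting point, the three URL patterns can be wrapped in small shell helpers. This is a minimal sketch: the function names and the example domain are my own, and only the URL shapes come from the answer itself; the actual download loop is shown commented out because it requires network access and a scraped index.

```shell
# Helpers for the three Wayback Machine URL patterns described above.

# Index of all saved pages for a domain (the "*" listing).
index_url() {            # $1 = domain, e.g. example.com
  printf 'http://web.archive.org/web/*/http://%s/*\n' "$1"
}

# All versions of one page for a given year.
versions_url() {         # $1 = YYYY, $2 = full page URL
  printf 'http://web.archive.org/web/%s*/%s\n' "$1" "$2"
}

# The unmodified snapshot at an exact timestamp (note the id_ token).
snapshot_url() {         # $1 = YYYYMMDDhhmmss, $2 = full page URL
  printf 'http://web.archive.org/web/%sid_/%s\n' "$1" "$2"
}

# A download loop built on these would look roughly like this
# (commented out: needs network, plus an index.txt of "timestamp url"
# pairs scraped from the index_url listing):
# while read -r ts url; do
#   wget -q -x "$(snapshot_url "$ts" "$url")"
# done < index.txt
```

The `-x` flag makes wget recreate the directory structure locally, which keeps relative links between downloaded pages usable.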
Hi there,
I'm trying to download this book here: https://archive.org/details/Wasserscheiden
How do I do that (beyond taking a screenshot for every page..)? Is there a download option at all??
Thanks in advance for your help!
Here's a bookmarklet method to download borrowed "scanned" books from archive.org if anyone's interested. Naturally make sure to delete the files once your book borrow time has expired!
https://gist.github.com/cemerson/043d3b455317d762bb1378aeac3679f3
At the bottom of the page you have the option to download it as a PDF.
Archive.org is a fantastic source for all kinds of data. It even makes it convenient to download by supplying a .torrent with every submission!
The only problem is, the torrents almost never function correctly on archives with lots of files.
It’s a good thing Archive made a Python tool to download and upload directly from their servers!
However, on many of the archives I’ve tried, it fails to download files surprisingly often. Each file is represented by a letter, and ‘e’ means that there was an error of some kind. It’s not a problem with my internet, because some of the archives download fine, and some just don’t.
Here’s how you can download files from Archive consistently, without any problems.
-
Firstly, install Internet Download Manager, if you don’t have it already.
-
Then, using Chrome, install Linkclump. In its Options menu, create a new action (or edit the existing action, if there is one) and set Right mouse button to Copy selected links to clipboard.
-
Next, open the Archive page you want to download from. In Download Options, select Show All. Right click and drag to select everything you want to download in the list, and paste it into Notepad. Save it as a text document.
-
Finally, open Internet Download Manager. Under Tasks, select Import > From text file and select the .txt document with all the Archive links. It will give you a list of the links found, and prompt you to check which ones you want to download. Check All, and you can uncheck the files you don’t want to download manually.
-
Once you click OK, it will add all the links to your download queue and begin downloading them. It may freeze after about a hundred links or so; it has not crashed, so just be patient.
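If you don't have IDM, the link-collection step (Linkclump plus Notepad) can be approximated with a couple of lines of shell run against a saved copy of the Show All listing page. This is only a sketch: the crude href regex and the file names are assumptions, not part of the guide above, and the regex may need adjusting for the real archive.org markup.

```shell
# Pull href values out of HTML on stdin; a rough stand-in for the
# Linkclump "copy selected links to clipboard" step.
extract_hrefs() {
  grep -oE 'href="[^"]+"' | sed -e 's/^href="//' -e 's/"$//'
}

# Usage (file names are hypothetical):
# extract_hrefs < show_all_listing.html > links.txt
# links.txt can then be fed to "wget -i links.txt" instead of IDM's
# text-file import.
```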
Thanks for reading! If you have any issues, reply to this post or PM me and I’ll try my best to help. Now get hoarding!
That's a nice trick.
I've been using Internet Download Manager for a while now (fantastic, and very versatile tool for Windows), and in the past, I'd just copy and paste links into a txt document manually. I never knew about anything like Linkclump; this would definitely help.
AAAAAAAAA.
I desperately needed this just a few weeks ago, for a small side-project. What's done is done, though. Thanks. :)
So usually at archive.org I used to borrow a book for 14 days; a download link is therefore available to download the book in PDF or EPUB using Adobe Digital Editions...
But I recently came across a second type of book that can only be borrowed for 1 hour... these don't have a download link... therefore the only way to download these books is by saving each page as an image while previewing the book... so I tried the Bulk Media Downloader extension, which catches the images (pages) and downloads them; unfortunately there is a kind of lock on the download, so the download fails...
The only way to download them is by loading 10 pages at a time and pressing Ctrl+S, which saves the entire page to the desktop and thereby gets me the images (pages) I need... it is very time-consuming, because I cannot load more than 10 pages, and I have to save the whole page for every 2 pages... so you get my point.
Is there any other way I can easily download the book or pages without going through this whole process?
Thanks in advance
Update: Hello guys, you can check out this guide I made on how to download books from the Archive; it is a complete guide:
https://www.reddit.com/r/Piracy/comments/l9exis/how_to_download_books_from_archive_org_and_how_to/?utm_medium=android_app&utm_source=share