Elad Hen @eladhen

A friend of mine is looking to archive a (vile) website that is about to be taken down. She needs it for advocacy for the people the site is against. What program can download as much of the site as possible?

@hhardy01
What parameters do I need for it? I don't need just one page, but the entire directory tree or whatever it's called...

@eladhen Archive.org will archive it if you submit it to them, but... if it's a shitty website, maybe you don't want that.

dheinemann.com/2011/archiving-

That's how to do it with wget. That'll get pretty much all of it.
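
In case that link ever dies: the recipe is roughly the wget line below. These are standard wget options, but the article's exact combination may differ, and https://example.com/ is just a stand-in for the real site.

wget --mirror --convert-links --adjust-extension --page-requisites --no-parent --wait=1 https://example.com/

--mirror turns on recursion with timestamping, --convert-links rewrites links so the copy browses offline, --adjust-extension adds .html so pages open locally, --page-requisites also grabs the images, CSS, and scripts each page needs, --no-parent keeps the crawl from climbing above the starting URL, and --wait=1 goes easy on the server.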

@ajroach42
It is vile beyond words. I'd rather not.

@ajroach42
Thanks for the link. I'll try that.

@ajroach42
The parameters there for wget seem to do the trick!

@eladhen Use wget. Here's a page with Windows binaries: eternallybored.org/misc/wget/

And here's an old Linux Journal article on how to download entire websites with wget.

linuxjournal.com/content/downl

@starbreaker
I'm on Linux. No Windows machine for Windows binaries.

@eladhen Then, depending on your distro, you might already have wget installed. If not, you probably already know how to get it. :)

@starbreaker
I've got wget and have used it in the past for simpler things.

@eladhen my advice would be to point the Wayback Machine at it

@eladhen archive.org/web/

Basically, Archive.org keeps a permanent record of many sites on the Internet, and you can send them a URL and they'll add the site to their list.
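
If it still works the way I remember, you can even trigger a capture from the command line through the Wayback Machine's save endpoint (example.com standing in for whatever URL you want captured):

curl "https://web.archive.org/save/https://example.com/"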

However, if what you need is a total scrape of the site (all pages), you will be better off using httrack.com/ or a cURL script.
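
For httrack, the basic invocation is something like the line below; -O sets the output directory and the "+*.example.com/*" filter keeps the crawl on that one site (example.com again being a placeholder):

httrack "https://example.com/" -O ./site-mirror "+*.example.com/*" -v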

I suggest Archive.org mainly because 1) it's a well-known and highly trusted organization, reducing possible doubt about the accuracy of the backup and 2) it's easy.

@tindall
The fact that this site is being taken down is a blessing. I don't want it to be publicly available.

@eladhen @starbreaker The Archive Team might be able to help. They have scripts and stuff. As well as wget, there's wpull, which is designed for downloading whole sites. grab-site is wpull with a nicer interface. There's lots of stuff on archiving websites if you start there.
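
If you go the grab-site route, I believe the basic run is just the command plus the URL, and it saves the whole crawl as WARC files you can replay later (example.com being a placeholder):

grab-site https://example.com/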

@ebel @starbreaker
Yeah, the problem is the takedown is tomorrow, so I'm trying to find something that's ready to go. Not much time to lose on research...