
As of version 1.14, wget natively supports WARC output (including built-in gzip compression and CDX index file generation).

http://www.archiveteam.org/index.php?title=Wget_with_WARC_ou...

This makes creating a browsable mirror of a site in WARC format fairly straightforward, as wget can rewrite links to be relative, as well as fetch the requisite files (css, js, images) for each page.
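
A rough sketch of what that invocation might look like, written as a Python helper that only builds the argument list. The flags are standard wget options (`--mirror`, `--page-requisites`, `--convert-links`, `--warc-file`, `--warc-cdx`); the function name, the example URL, and the output name are made up for illustration.

```python
import shlex


def build_wget_warc_command(url, warc_name):
    """Return an argv list that mirrors `url` and records it as a WARC.

    `warc_name` is a base name; wget appends the .warc.gz extension itself
    (gzip compression is on by default) and, with --warc-cdx, writes a CDX
    index next to it.
    """
    return [
        "wget",
        "--mirror",            # recursive download with timestamping
        "--page-requisites",   # also fetch css, js, images for each page
        "--convert-links",     # rewrite links so the local copy is browsable
        f"--warc-file={warc_name}",
        "--warc-cdx",
        url,
    ]


if __name__ == "__main__":
    # Hypothetical URL and output name, for illustration only.
    print(shlex.join(build_wget_warc_command("https://example.com/", "example")))
```

Printing the command rather than running it makes it easy to eyeball the flags before pointing it at a real site.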



Yeah, but as far as I can guess, derwiki's service doesn't use wget, so running a proxy to store the WARCs is the next-simplest thing.


If his service runs on any sort of Linux distro, it's stupid simple to call wget with a system call. Wget comes standard with all of the most popular distros.
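
For what it's worth, here is a minimal sketch of shelling out to wget from a service, assuming the service is written in Python (nothing in this thread says what derwiki's service actually runs on). The function name and the `run` parameter are hypothetical; `run` is only there so the call can be swapped out in tests.

```python
import subprocess


def archive_with_wget(url, warc_name, run=subprocess.run):
    """Invoke wget to fetch `url` and record the traffic as a WARC.

    Returns wget's exit code (0 means success).
    """
    # Reject anything that is not a plain http(s) URL, so user input
    # can never be interpreted by wget as a command-line option.
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"refusing to archive non-http URL: {url!r}")

    # Passing a list (no shell=True) avoids shell quoting problems.
    argv = ["wget", f"--warc-file={warc_name}", "--warc-cdx", url]
    result = run(argv, check=False)
    return result.returncode


if __name__ == "__main__":
    # Hypothetical URL and output name; requires wget >= 1.14 on PATH.
    print(archive_with_wget("https://example.com/", "example"))
```

Passing the arguments as a list instead of a single shell string is a deliberate choice here: the URL typically comes from users, and it should never be handed to a shell for parsing.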



