how to copy a website with httrack on linux

This is more for my own reference than anything. Say you see a flog on the intertubes and want to rip it and stick it back up with your own affiliate links. How do you do that quickly on Linux? I used to use wget, but it never did the job well. httrack is much better.
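For the record, the closest wget equivalent I know of looks something like the following. The flags are all standard wget options, but the combination is my own guess at the job, not a recipe I've tuned:

# -H --domains restricts host-spanning to the two blogs, roughly like httrack's +filters
wget --mirror --convert-links --adjust-extension --page-requisites \
    -H --domains=techcrunch.com,crunchgear.com \
    "http://www.techcrunch.com/"

Even then you end up with wget's per-host directory trees rather than one tidy folder, which is half the reason httrack wins here.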

httrack "http://www.techcrunh.com/" -N1 -O "/home/techcrunch_rip/public_html" +techcrunch.com/* +crunchgear.com/* -v

This rips the TechCrunch homepage and sticks it in the folder given by -O. The URL filters that follow (+techcrunch.com/* +crunchgear.com/*) ensure it only downloads files from those domains. The -N1 argument is the most important one: it makes httrack put all the images and CSS in a single directory instead of creating loads of subdirectories. Very handy.
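If the site throttles you or blocks the default agent, a few extra flags help. This is just a sketch; the depth, connection count, rate cap, and user-agent string below are example values I'd start from, not part of the original one-liner:

# -r3     limit the mirror to 3 links deep
# -c8     use 8 simultaneous connections
# -s0     ignore robots.txt (be polite about this)
# -A100000  cap transfer rate at ~100 KB/s so you don't hammer the server
# -F ...  send a browser-like user-agent instead of httrack's default
httrack "http://www.techcrunch.com/" -N1 \
    -O "/home/techcrunch_rip/public_html" \
    +techcrunch.com/* +crunchgear.com/* \
    -r3 -c8 -s0 -A100000 \
    -F "Mozilla/5.0 (X11; Linux x86_64)" -v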
