Spider a website with wget

This command is useful if you want to auto-generate the Boost module cache files on a Drupal site:

wget -r -l4 --spider -D thesite.com http://www.thesite.com

Let's analyse the options...

-r turns on recursive retrieval: wget follows the links it finds, so it visits more than one page

-l sets the maximum number of levels to recurse (4 here; wget's default is 5). If you are on the first page and you follow a link, you are at level 1. If you follow a link on that page, you are at level 2, etc.
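
To make the depth setting concrete, here is a small sketch that prints the same spider command with a few different -l values. The function name depth_variants is made up for this example, and it only echoes the commands rather than running them, so nothing is crawled.

```shell
#!/bin/sh
# Hypothetical helper: print the spider command for several depth limits.
# Nothing is executed; copy the line you want and run it yourself.
depth_variants() {
    site=$1
    # 1   = only pages linked from the start page
    # 2   = those pages plus the pages they link to
    # inf = no depth limit (wget also accepts 0 for this)
    for depth in 1 2 inf; do
        echo "wget -r -l $depth --spider $site"
    done
}

depth_variants http://www.thesite.com
```

A lower depth finishes sooner but may miss pages buried deep in the site; "inf" reaches everything that is linked, at the cost of a longer crawl.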

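--spider tells wget not to save anything: it requests the pages only to find their links (we just want to go through the pages, that's all)

-D restricts the crawl to the listed domain(s), given as a comma-separated list, so links pointing to other sites are not followed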
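
If you run this often, the options above can be wrapped in a small shell function. This is only a sketch: the name spider_site and the DRY_RUN variable are invented for this example, and the domain passed to -D is guessed by stripping the scheme, the path and a leading "www." from the URL, which will not suit every site (a URL with a port number, for instance).

```shell
#!/bin/sh
# Hypothetical helper: spider_site URL [DEPTH]
# Set DRY_RUN=1 to print the wget command instead of running it.
spider_site() {
    url=$1
    depth=${2:-4}          # default to the same depth as the command above

    # Guess the domain for -D from the URL (assumption, adjust as needed):
    host=${url#*://}       # drop "http://" or "https://"
    host=${host%%/*}       # drop any path after the host name
    domain=${host#www.}    # drop a leading "www."

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "wget -r -l$depth --spider -D $domain $url"
    else
        wget -r "-l$depth" --spider -D "$domain" "$url"
    fi
}
```

For example, "DRY_RUN=1 spider_site http://www.thesite.com" shows the command that would be run, and "spider_site http://www.thesite.com 2" crawls for real but stops at level 2.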