Nginx + CDN + GoogleBot or how to avoid many useless Googlebot hits

If you're like me and you've developed a CDN distribution for your website's content (while waiting for SPDY to be widely adopted and available in mainstream distributions), you might have noted that the Googlebot is frequently scanning your CDNs, and this might have made your website a bit overloaded. After all, the goal of the CDNs are (several but in my case only) to elegantly distribute contents across subdomains so your browser will load the page resources faster (otherwise it gets blocked by the HTTP limit or any higher limit set by your browser of simultaneous content download). Hell,