Tuesday, November 6, 2012

Search Engines Causing Load Problems on Web Server


Search engines need to crawl websites regularly to index new content and check for changes to existing content.  The frequency with which search engine bots hit a website can cause the server to experience load problems.  Most major search engines provide a way to slow the crawl rate.
Small websites that do not update their content regularly will likely never need to slow the crawl rate.  Bots will adjust their crawl frequency based on how often they find new content.
For large websites with many frequently updated pages, search engines may visit the site often and crawl deeper.
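Before throttling anything, it helps to confirm which crawler is actually generating the load.  Below is a minimal Python sketch that tallies requests per user agent from a web server access log; it assumes the common combined log format and a hypothetical log path, so adjust both for your setup.  Heavy crawlers tend to float right to the top of the output.

# Tally requests per user agent from an access log to spot heavy crawlers.
# Assumes the combined log format; the log path below is hypothetical.
from collections import Counter

hits = Counter()
with open("/var/log/apache2/access.log") as log:
    for line in log:
        # In the combined format the user agent is the last quoted field.
        parts = line.rsplit('"', 2)
        if len(parts) == 3:
            hits[parts[1]] += 1

# Print the ten busiest user agents; bot names like "bingbot" stand out here.
for agent, count in hits.most_common(10):
    print(count, agent)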
The crawl delay can be set in the robots.txt file as shown (value in seconds):

User-agent: *
Crawl-delay: 10
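If only one bot is hammering the server, the delay can be scoped to that crawler instead of applying it to every bot.  Here is a sketch using Bing's user-agent token; support for and interpretation of Crawl-delay vary by search engine, so treat the value as a hint rather than a guarantee:

User-agent: bingbot
Crawl-delay: 20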

But, wait, Google doesn't support this.  You have to adjust the crawl rate in Google Webmaster Tools instead.  And they have the nerve to say they can ignore that setting whenever they want and crawl at their own rate anyway...