Twiceler is the crawler for our new search engine. It is important
to us that it obey robots.txt, and that it not crawl sites that do not
wish to be crawled.
Recently we have seen a number of crawlers masquerading as Twiceler, so
please check that the IP address of the crawler in question is one of ours.
You can see our IP addresses at http://www.cuil.com/info/webmaster_info
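If you want to automate that check against your access logs, a minimal Python sketch might look like the following. The ranges shown are placeholder documentation addresses (RFC 5737), not our real ones, and the function name is purely illustrative; substitute the addresses listed on the page above.

```python
import ipaddress

# Placeholder ranges (RFC 5737 documentation addresses), NOT real crawler IPs.
# Replace them with the addresses published on the webmaster info page.
PUBLISHED_RANGES = ["192.0.2.0/24", "198.51.100.17/32"]


def is_listed_crawler_ip(ip, ranges=PUBLISHED_RANGES):
    """Return True if ip falls inside any of the given networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(net) for net in ranges)


# Check two client addresses taken from a (hypothetical) access log.
print(is_listed_crawler_ip("192.0.2.44"))   # inside the first placeholder range
print(is_listed_crawler_ip("203.0.113.9"))  # outside both placeholder ranges
```

A request that claims to be Twiceler in its User-agent header but fails this check is most likely one of the impostors.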
You may wish to add a robots.txt file to your site (I notice you don’t
have one). That is the standard mechanism for controlling robot access and
behavior. You can read about it at
and there is a simple generator of the file here
The Crawl-delay directive is what you are looking for. It tells robots
that support it (we do) how long to wait between requests. Add the
directive just below the ‘User-agent: *’ line. The value is in seconds, so

    User-agent: *
    Crawl-delay: 120

would tell us to wait two minutes between requests.
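If you would like to confirm that the directive is read the way you intend, here is a minimal Python sketch using the standard library's urllib.robotparser. The robots.txt content is an assumed example (120 seconds standing in for a two-minute delay), and it shows how Python's parser reads the file, not necessarily how Twiceler parses it.

```python
from urllib.robotparser import RobotFileParser

# Assumed example file: a two-minute delay for every robot.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 120
"""

parser = RobotFileParser()
# parse() takes the file as a list of lines, so no network access is needed.
parser.parse(ROBOTS_TXT.splitlines())

# No Twiceler-specific section exists, so the '*' entry applies.
delay = parser.crawl_delay("Twiceler")
print(delay)
```

To test your live file instead, call set_url() with the address of your robots.txt and then read() before asking for the delay.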
Also be aware that changes to robots.txt can take several days to take
effect. The industry standard is to cache robots.txt for seven days,
but we make every effort to re-read it more frequently.
Please feel free to contact me if you have any further questions.