How to Block AI Crawlers from Your Website | CrawlReady AI
Learn when to block GPTBot, ClaudeBot, PerplexityBot, and Google-Extended in robots.txt — without breaking Google search crawling.
Some guides may be AI-assisted and are always human-reviewed for accuracy before publishing. See our Google generative AI search guide and Google's AI content guidance.
Blocking AI crawlers is a business decision, not a technical trick. robots.txt publishes your crawl policy to the public internet, but it is advisory rather than enforced: well-behaved bots honor it, while malicious scrapers may ignore it.
Which AI crawlers can you block?
- GPTBot — OpenAI training crawler
- ClaudeBot / Claude-Web — Anthropic
- PerplexityBot — Perplexity AI answers
- Google-Extended — Google generative AI training opt-out token
- CCBot — Common Crawl open corpus
Example: block all AI crawlers
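# Block each AI crawler and opt-out token listed above
User-agent: GPTBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: PerplexityBot
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: CCBot
Disallow: /
# Leave every other crawler, including Googlebot, unrestricted
User-agent: *
Allow: /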
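Before publishing rules like these, it can help to check how a parser reads them. The following is a minimal sketch using Python's standard-library urllib.robotparser. The helper name crawl_access, the shortened sample rules, and the example.com URL are illustrative assumptions. Python's parser matches user-agent names by substring, so in edge cases its verdict may differ from how a particular crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Sample rules in the style of the example above (shortened for illustration).
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: PerplexityBot
Disallow: /
User-agent: *
Allow: /
"""


def crawl_access(robots_txt: str, agents: list[str], url: str) -> dict[str, bool]:
    """Map each agent to True if the rules allow it to fetch url (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parsed in memory; no network request
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    report = crawl_access(
        ROBOTS_TXT,
        ["GPTBot", "ClaudeBot", "Googlebot"],
        "https://example.com/article",  # placeholder URL
    )
    for agent, allowed in report.items():
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

A check like this only confirms that the file says what you intended. It does not confirm that any given crawler actually obeys it.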
Generate rules for your site
Use the AI Crawler Robots.txt Rules Generator to pick "Block all AI crawlers", "Block only GPTBot", or a custom starter file. Upload the result as robots.txt at your domain root so that it is served at /robots.txt.
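To illustrate what such a starter file contains, here is a minimal sketch of building one programmatically. The function build_robots_txt is a hypothetical helper written for this guide. It is not the generator tool's actual logic, and it assumes you want each named crawler blocked site-wide while everything else stays allowed.

```python
def build_robots_txt(blocked_agents: list[str]) -> str:
    """Build a robots.txt body that blocks each named agent site-wide.

    Hypothetical helper for illustration only.
    """
    # One group per blocked crawler: a User-agent line followed by its rule.
    groups = [f"User-agent: {agent}\nDisallow: /" for agent in blocked_agents]
    # Final catch-all group leaves every other crawler unrestricted.
    groups.append("User-agent: *\nAllow: /")
    # Blank lines between groups keep the file easy to read.
    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    # Roughly what a "Block only GPTBot" starter file might look like.
    print(build_robots_txt(["GPTBot"]))
```

Writing the returned string to a file named robots.txt and uploading it to the domain root completes the step described above.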
After publishing, verify access with the GPTBot checker, ClaudeBot checker, or a full AI crawler scan.
Read the full reference in robots.txt for AI crawlers.
Frequently Asked Questions
Will blocking GPTBot hurt Google rankings?
No. GPTBot is separate from Googlebot. Blocking GPTBot only affects OpenAI training crawls, not Google Search indexing.
Should I block all AI crawlers or only some?
Many publishers block training crawlers (GPTBot, CCBot, Google-Extended) but allow search crawlers (OAI-SearchBot, Googlebot) so AI answers can still cite their pages.
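The selective policy described here can be sketched as a single robots.txt group that names the training crawlers, with a catch-all group for everything else. The snippet below is an assumption-labeled illustration: the rule text, the is_allowed helper, and the example.com URL are made up for this guide, and the check uses Python's urllib.robotparser rather than any crawler's own matching logic.

```python
from urllib.robotparser import RobotFileParser

# One group may list several user agents that share the same rules.
# Training crawlers are blocked; search crawlers fall through to "*".
SELECTIVE_ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: CCBot
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(SELECTIVE_ROBOTS_TXT.splitlines())  # in-memory parse, no network


def is_allowed(agent: str, url: str = "https://example.com/") -> bool:
    """Return True if the sample policy lets agent fetch url (hypothetical helper)."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ["GPTBot", "CCBot", "Google-Extended", "OAI-SearchBot", "Googlebot"]:
        print(f"{agent}: {'allowed' if is_allowed(agent) else 'blocked'}")
```

Because OAI-SearchBot and Googlebot are not named in the blocking group, they match only the catch-all rule and remain allowed.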
Important disclaimer
This guide is for educational purposes only. No tool or technique guarantees search rankings, AI inclusion, or specific traffic results. Refer to official documentation from search engines and AI providers for current policies.