Your robots.txt file is the single most impactful GEO fix. If AI crawlers are blocked, your site is invisible to ChatGPT, Perplexity, and every other AI engine. This five-minute change can unlock AI citations.
Why robots.txt Matters for GEO
AI engines use crawler bots to read your website. These bots check robots.txt before accessing any page, so a single Disallow rule can make your entire site invisible to AI search.
Unlike Google, which may still index blocked pages through links pointing at them, AI crawlers strictly obey robots.txt. No access means zero citations.
AI Crawler Bots You Must Allow
| Bot | Engine | User-Agent |
|---|---|---|
| GPTBot | ChatGPT (training) | GPTBot |
| ChatGPT-User | ChatGPT (browsing) | ChatGPT-User |
| PerplexityBot | Perplexity | PerplexityBot |
| Google-Extended | Google AI / Gemini | Google-Extended |
| ClaudeBot | Claude / Anthropic | ClaudeBot |
| Amazonbot | Alexa / Amazon | Amazonbot |
| Applebot-Extended | Apple Intelligence | Applebot-Extended |
| Meta-ExternalAgent | Meta AI | Meta-ExternalAgent |
| cohere-ai | Cohere | cohere-ai |
| Bytespider | TikTok AI | Bytespider |
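You can check programmatically which of the user-agents in the table above a given robots.txt permits. A minimal sketch using Python's standard `urllib.robotparser`, parsing a sample robots.txt body in memory (the sample rules here are illustrative, not a recommendation):

```python
from urllib import robotparser

# AI crawler user-agents from the table above.
AI_BOTS = ["GPTBot", "ChatGPT-User", "PerplexityBot", "Google-Extended",
           "ClaudeBot", "Amazonbot", "Applebot-Extended",
           "Meta-ExternalAgent", "cohere-ai", "Bytespider"]

# Example robots.txt: blocks GPTBot specifically, allows everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = robotparser.RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

for bot in AI_BOTS:
    status = "allowed" if parser.can_fetch(bot, "/") else "BLOCKED"
    print(f"{bot}: {status}")
```

To audit a live site, swap the inline string for `parser.set_url("https://yourdomain.com/robots.txt")` followed by `parser.read()`.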
Recommended robots.txt Template
Copy this directly into your robots.txt file:
```
# Allow all AI crawlers for GEO
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Amazonbot
Allow: /

User-agent: Applebot-Extended
Allow: /

User-agent: Meta-ExternalAgent
Allow: /

User-agent: cohere-ai
Allow: /

User-agent: Bytespider
Allow: /

# Standard search engines
User-agent: Googlebot
Allow: /

User-agent: Bingbot
Allow: /

# Sitemap
Sitemap: https://yourdomain.com/sitemap.xml
```
How to Check Your Current Setup
Visit yourdomain.com/robots.txt in your browser and look for any Disallow rules targeting AI bot user-agents. Common problematic blocks:
```
# REMOVE THESE: they block AI visibility
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /
```
The wildcard User-agent: * combined with Disallow: / blocks everything, including all AI crawlers.
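This wildcard behavior can be confirmed with Python's standard `urllib.robotparser`: a blanket `User-agent: *` disallow blocks every bot, and only a dedicated per-bot group overrides it. A minimal sketch:

```python
from urllib import robotparser

def allowed(robots_body, agent, path="/"):
    """Return True if `agent` may fetch `path` under `robots_body`."""
    rp = robotparser.RobotFileParser()
    rp.parse(robots_body.splitlines())
    return rp.can_fetch(agent, path)

# A blanket wildcard disallow blocks every crawler, AI bots included.
wildcard_block = "User-agent: *\nDisallow: /\n"
print(allowed(wildcard_block, "GPTBot"))    # False

# A dedicated GPTBot group overrides the wildcard for that bot only.
with_override = ("User-agent: GPTBot\nAllow: /\n\n"
                 "User-agent: *\nDisallow: /\n")
print(allowed(with_override, "GPTBot"))     # True
print(allowed(with_override, "ClaudeBot"))  # False
```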
Selective Access
If you need to protect certain areas while allowing AI access to your content:
```
User-agent: GPTBot
Allow: /blog/
Allow: /products/
Allow: /about/
Disallow: /admin/
Disallow: /api/
Disallow: /private/
```
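Before deploying selective rules like these, it is worth verifying they behave as intended. A minimal sketch testing the group above with Python's standard `urllib.robotparser` (the sample paths are hypothetical):

```python
from urllib import robotparser

# The selective-access group from above.
SELECTIVE = """\
User-agent: GPTBot
Allow: /blog/
Allow: /products/
Allow: /about/
Disallow: /admin/
Disallow: /api/
Disallow: /private/
"""

rp = robotparser.RobotFileParser()
rp.parse(SELECTIVE.splitlines())

print(rp.can_fetch("GPTBot", "/blog/geo-guide"))  # True
print(rp.can_fetch("GPTBot", "/admin/settings"))  # False
```

Paths that match no rule (e.g. a hypothetical /contact/ page) default to allowed under the Robots Exclusion Protocol.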
Common Mistakes
- CMS defaults — WordPress and Next.js sometimes generate restrictive robots.txt files
- CDN blocking — Cloudflare and other CDNs may block bot traffic by default in security settings
- Forgetting ChatGPT-User — GPTBot handles training data; ChatGPT-User handles live browsing. You need both
- Rate limiting — Some server configs throttle bot requests too aggressively
How Quickly Do Changes Take Effect?
AI crawlers typically discover robots.txt changes within 24-48 hours. Citation improvements usually appear within 1-2 weeks after crawlers re-index your content.
FAQ
Will allowing AI crawlers hurt my SEO?
No. AI crawler access has no impact on Google rankings. These are separate systems.
Should I block any AI crawlers?
Only if you have specific licensing or legal concerns. For maximum AI visibility, allow all crawlers.
Do I need to restart my server after changing robots.txt?
No. robots.txt is a static file served by your web server; the new version is live as soon as you save it. Crawlers may cache the old copy for up to about 24 hours before re-fetching it.