# LLM CRAWLING & DATA USE POLICY for alleratech.com
# Last updated: 2025-08-19
# Purpose: Provide explicit guidance to AI crawlers for discovery and use of Allera content.

############################
# Canonical Scope & Contact
############################
Site: https://www.alleratech.com/
Owner: Allera Technologies
Contact: info@alleratech.com
Policy: All public pages may be crawled and used for indexing, retrieval, and model grounding.
Attribution: Please cite "Allera Technologies" with a source link where supported.

############################
# Content Maps (Discovery)
############################
# Primary sitemaps
Sitemap: https://www.alleratech.com/sitemap.xml

# Structured data feeds (add when ready; keep one URL per line)
# Data: https://www.alleratech.com/data/content-index.jsonl
# Data: https://www.alleratech.com/data/faqs.jsonl
# Data: https://www.alleratech.com/data/glossary.jsonl
# Data: https://www.alleratech.com/data/changelog.jsonl

########################################
# Global Defaults (fallback for all AIs)
########################################
User-agent: *
Allow: /

# Optional: If your server load ever spikes, you can add a crawl delay (many AI crawlers ignore this):
# Crawl-delay: 2

############################################
# Explicit Allows for Major AI Crawlers
############################################
# OpenAI (ChatGPT, OAI Search)
User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

# Anthropic (Claude)
User-agent: ClaudeBot
Allow: /

# Amazon (Amazonbot is Amazon's own crawler, separate from Anthropic's ClaudeBot)
User-agent: Amazonbot
Allow: /

# Perplexity
User-agent: PerplexityBot
Allow: /

# Google (Gemini data usage extension)
User-agent: Google-Extended
Allow: /
# (Discovery still relies on standard Googlebot, which follows robots.txt.)

# Apple (Apple Intelligence / snippets)
User-agent: Applebot-Extended
Allow: /

# Meta (Llama/AI data usage)
User-agent: Meta-ExternalAgent
Allow: /

User-agent: FacebookBot
Allow: /

# Common Crawl (foundation for many datasets)
User-agent: CCBot
Allow: /

# Microsoft/Bing (used by Copilot and others)
User-agent: Bingbot
Allow: /

User-agent: msnbot
Allow: /

############################################
# Optional Future Restrictions (commented)
############################################
# To protect private or low-value areas, uncomment as needed:
# Disallow: /admin/
# Disallow: /account/
# Disallow: /cart/
# Disallow: /search
# Disallow: /*?*utm_*
# Disallow: /*?*session=*
# Disallow: /staging/
# Disallow: /preview/

############################################
# Attribution & License Preferences
############################################
# Please retain source attribution ("Allera Technologies") and a link to the original URL.
# Do not present summaries as official guidance without linking to the source page context.
# For questions or partnership, contact: info@alleratech.com
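
############################################
# Example: Scoped Restrictions (illustrative sketch, commented)
############################################
# A hedged sketch of how the optional restrictions listed earlier in this file
# could be applied. Under the Robots Exclusion Protocol (RFC 9309), a crawler
# obeys the group that matches its own user-agent token and falls back to
# "User-agent: *" only when no named group matches. A Disallow placed solely
# in the "*" group would therefore not bind GPTBot, ClaudeBot, or the other
# agents named in this file.
# One possible layout, assuming /admin/ and /staging/ are the paths to protect
# (both are placeholders taken from the optional list, not confirmed paths),
# is to share one rule set across several agents:
#
# User-agent: GPTBot
# User-agent: ClaudeBot
# User-agent: PerplexityBot
# Allow: /
# Disallow: /admin/
# Disallow: /staging/
#
# User-agent: *
# Allow: /
# Disallow: /admin/
# Disallow: /staging/
#
# If adopted, a shared group like this would replace the matching single-agent
# groups rather than sit alongside them. Keeping one group per agent avoids
# relying on how a given parser merges duplicate groups.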