FR: Support for llms.txt #504

@Golembivskyi

Hello Ether Creative team!

I'm a big fan of your SEO plugin — it's been a game-changer for managing meta tags, sitemaps, and redirects on my Craft CMS sites. The dynamic robots.txt rendering from a Twig template is especially handy, and I've been using it to keep things flexible without hardcoding paths.

Feature Request: Support for llms.txt

With the rise of AI crawlers (like GPTBot, ClaudeBot, Google-Extended, and PerplexityBot) in 2025, there's a new convention emerging: llms.txt (see llmstxt.org). As I understand it, it's essentially a robots.txt-like file aimed specifically at AI/LLM bots, letting site owners state what may be used for training data or AI responses. It uses the same syntax (User-agent, Allow, Disallow, etc.) but focuses on AI-specific policies, like attribution requirements or quoting limits.

Right now, users like me have to manually create and maintain a static llms.txt file in the root (or /.well-known/llms.txt), which feels outdated in a CMS like Craft. Since your plugin already handles dynamic robots.txt via Twig templates, it would be amazing to extend this to llms.txt!

Proposed Implementation

  • Dynamic Rendering: Add a new setting in SEO → Settings (e.g., "LLMs" section) where users can configure a Twig template path for llms.txt, similar to the existing Robots setting.
    • Default template: Something like seo/_seo/llms (with access to seo object for variables like site URLs, disallowed paths, etc.).
    • Serve it at /llms.txt (and optionally /.well-known/llms.txt for better compatibility).
  • Built-in Defaults: Include a sample template with common directives, e.g.:
    User-agent: *
    Allow: /
    Disallow: /admin/

    The defaults could pull from Craft globals or SEO settings (e.g., reuse the sitemap exclusions as Disallow paths).

  • Integration Ideas:
    • Link it to existing features: Auto-disallow paths from Redirects or Sitemap exclusions.
    • Add a simple policy editor in the field type for quick attribution rules (e.g., "Require link to golem.agency").
    • Support multiple environments: Like robots.txt, render differently for dev/staging (e.g., a full Disallow in non-production).
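
To make the idea above more concrete, here is a rough sketch of the rendering logic. It is written in Python only to keep it short and self-contained; the real thing would be PHP/Twig inside the plugin. Every name in it (LlmsTxtConfig, render_llms_txt, excluded_paths, always_disallow) is hypothetical and not part of the plugin's API, and it assumes the robots.txt-style directives described earlier in this request.

```python
from dataclasses import dataclass, field


@dataclass
class LlmsTxtConfig:
    """Hypothetical settings object; none of these names exist in the plugin."""
    environment: str = "production"
    # Paths that would come from, e.g., sitemap exclusions (assumption).
    excluded_paths: list[str] = field(default_factory=list)
    # Paths that are disallowed on every production render.
    always_disallow: tuple[str, ...] = ("/admin/",)


def render_llms_txt(config: LlmsTxtConfig) -> str:
    """Build the llms.txt body as robots.txt-style directives."""
    lines = ["User-agent: *"]

    # Non-production environments block everything, mirroring the
    # "full Disallow in non-production" idea.
    if config.environment != "production":
        lines.append("Disallow: /")
        return "\n".join(lines) + "\n"

    lines.append("Allow: /")

    # Merge the fixed and configured paths, dropping duplicates
    # while keeping the first-seen order.
    seen: set[str] = set()
    for path in (*config.always_disallow, *config.excluded_paths):
        if path not in seen:
            seen.add(path)
            lines.append(f"Disallow: {path}")

    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_llms_txt(LlmsTxtConfig(excluded_paths=["/drafts/"])))
```

In the plugin, the same branching would presumably live in the Twig template itself, so site owners could override it the way they can for robots.txt today.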

Why This Matters

  • SEO for the AI Era: A growing share of traffic comes through AI answers (ChatGPT, Gemini search). A maintained llms.txt lets us state how our content may be used (e.g., asking for attribution back to our sites, which boosts brand mentions), although honoring it is ultimately up to each crawler.
  • Craft Community Fit: Plugins like SEOmatic already touch on robots.txt; this would position Ether SEO as forward-thinking for 2025+.
  • Low Effort, High Value: It's a small extension of your existing robots.txt logic — mostly routing and templating.

I'd be happy to test a beta or contribute a PR if you point me in the right direction! What do you think?

Thanks for the great work — keep crafting! 🚀