Avoid Robots Indexing

Created Jun 18, 2026 · Claude

Keep search engine crawlers off your documentation site using the noindex setting.

When you are running a staging site, an internal documentation portal, or a work-in-progress docs build, you likely do not want search engines to crawl or index those pages. The noindex setting in zudo-doc gives you a single switch that tells all crawlers to stay out. It is a request that well-behaved crawlers honor, not an access control.

What the setting does

Setting noindex: true in src/config/settings.ts triggers two complementary behaviors at build time:

Robots meta tag — every rendered page receives:
```
<meta name="robots" content="noindex, nofollow">
```
This instructs crawlers that reach a page (through a link, for example) not to index it and not to follow the links on it.
robots.txt disallow — the generated robots.txt at the site root contains:
```
User-agent: *
Disallow: /
```
This tells crawlers not to crawl any path under the site at all. When noindex is enabled, the Sitemap: line is intentionally omitted from robots.txt — advertising a sitemap while simultaneously disallowing all crawling is contradictory.
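The two behaviors above can be sketched as a pair of small functions. This is only an illustration of the described output, not zudo-doc's actual build code: the names robotsMetaTag and buildRobotsTxt, the sitemapUrl parameter, and the output shown for noindex: false are all assumptions.

```typescript
// Hypothetical helpers that mirror the documented output.
// zudo-doc's real implementation may be structured differently.

/** Returns the robots meta tag, or an empty string when indexing is allowed. */
function robotsMetaTag(noindex: boolean): string {
  return noindex ? '<meta name="robots" content="noindex, nofollow">' : "";
}

/** Builds robots.txt content. `sitemapUrl` is an assumed parameter. */
function buildRobotsTxt(noindex: boolean, sitemapUrl: string): string {
  if (noindex) {
    // Block every path and leave out the Sitemap line.
    return ["User-agent: *", "Disallow: /"].join("\n") + "\n";
  }
  // Assumed shape of the "indexing allowed" output.
  return ["User-agent: *", "Allow: /", `Sitemap: ${sitemapUrl}`].join("\n") + "\n";
}

console.log(robotsMetaTag(true));
console.log(buildRobotsTxt(true, "https://example.com/sitemap.xml"));
```

Keeping both outputs behind the same boolean is what makes the setting a single switch: the meta tag and robots.txt cannot drift out of sync.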

Enabling the setting

Via the preset generator (recommended for new projects)

When scaffolding a new project with create-zudo-doc, check the "Avoid robots indexing" checkbox in the interactive Setup Preset Generator. This emits noindex: true in the generated settings.ts.

Via the CLI flag

Pass --noindex to the create-zudo-doc CLI:

```
npx create-zudo-doc@latest --noindex
```

Directly in settings.ts

Open src/config/settings.ts and set:

```
export const settings = {
  // ...
  noindex: true,
};
```

The default is false, meaning crawlers are allowed.
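If the same codebase is deployed to both staging and production, one option is to derive the value from the deployment environment instead of hard-coding it. The sketch below is an assumption, not a zudo-doc feature: noindexFor is a hypothetical helper, and DEPLOY_ENV is an illustrative variable name that your CI would need to set.

```typescript
// Hypothetical helper: decides the noindex value from an environment name.
function noindexFor(deployEnv: string | undefined): boolean {
  // Only an explicit "production" value allows indexing.
  // Anything else (staging, preview, unset) keeps crawlers out.
  return deployEnv !== "production";
}

// Possible use in src/config/settings.ts (DEPLOY_ENV is an assumed name):
// export const settings = { noindex: noindexFor(process.env.DEPLOY_ENV) };

console.log(noindexFor("staging"));
console.log(noindexFor("production"));
```

Defaulting to true when the variable is unset errs on the side of keeping unfinished builds out of search results.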

Caveats

Existing indexed pages are not removed automatically

Enabling noindex discourages future indexing but does not remove pages that search engines have already indexed. Because Disallow: / stops compliant crawlers from re-fetching those pages, they will not see the new noindex tag either. To remove such pages, submit removal requests through each search engine's tooling (for example, the Removals tool in Google Search Console).

Crawl directives vs. index directives

robots.txt Disallow: / is a crawl directive — it tells crawlers not to fetch pages. The noindex meta tag is an index directive — it tells a crawler that has fetched a page not to add it to the index.

These two directives are complementary: Disallow keeps compliant crawlers from fetching pages at all, while noindex covers pages that a crawler fetches anyway, for example by following a direct link without consulting robots.txt. Using them together is a common way to build a complete "keep everything out" toggle.

One subtle consequence: a crawler that respects robots.txt never fetches a page, so it never sees the noindex meta tag inside that page. For most purposes this is fine, because the crawler has no page content to index. Some search engines can still list a blocked URL, without a description, if other sites link to it. For crawlers that ignore robots.txt, the meta tag remains the safety net.
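The interaction between the two directives can be modeled as a small decision function. This is a simplified mental model for a site built with noindex: true, not a description of any particular search engine; the Crawler type and outcomeFor function are hypothetical names.

```typescript
// Simplified model of how a crawler reacts to a site built with noindex: true.
type Crawler = { obeysRobotsTxt: boolean; obeysRobotsMeta: boolean };
type Outcome = "not fetched" | "fetched, not indexed" | "fetched, may be indexed";

function outcomeFor(crawler: Crawler): Outcome {
  // robots.txt is checked first: a compliant crawler never requests the page,
  // so the meta tag is never read.
  if (crawler.obeysRobotsTxt) return "not fetched";
  // A crawler that skips robots.txt still reads the meta tag in the HTML.
  if (crawler.obeysRobotsMeta) return "fetched, not indexed";
  // Neither directive is enforced; both are requests, not access control.
  return "fetched, may be indexed";
}

console.log(outcomeFor({ obeysRobotsTxt: true, obeysRobotsMeta: true }));
console.log(outcomeFor({ obeysRobotsTxt: false, obeysRobotsMeta: true }));
console.log(outcomeFor({ obeysRobotsTxt: false, obeysRobotsMeta: false }));
```

The last case is why content that must stay private needs authentication rather than crawler directives.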

robots.txt omits the Sitemap line when noindex is enabled

When noindex: true, zudo-doc intentionally omits the Sitemap: entry from robots.txt. Pointing crawlers at a sitemap while also blocking all crawling sends contradictory signals. If you later re-enable indexing, the Sitemap: line is restored automatically.
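To confirm that a build produced the expected robots.txt, a check along these lines could run in CI against the generated file's contents. It is a minimal sketch with a deliberately simplified parser (no comment handling, no multi-agent groups), and checkNoindexRobotsTxt is a hypothetical name, not a zudo-doc utility.

```typescript
/** Reports whether robots.txt text blocks all paths and whether it lists a sitemap. */
function checkNoindexRobotsTxt(robotsTxt: string): { disallowsAll: boolean; hasSitemap: boolean } {
  const lines = robotsTxt.split(/\r?\n/).map((line) => line.trim());
  let inWildcardGroup = false;
  let disallowsAll = false;
  let hasSitemap = false;

  for (const line of lines) {
    const lower = line.toLowerCase();
    if (lower.startsWith("user-agent:")) {
      // Track whether the following rules apply to every crawler.
      inWildcardGroup = line.slice("user-agent:".length).trim() === "*";
    } else if (inWildcardGroup && lower.startsWith("disallow:")) {
      // "Disallow: /" blocks every path under the site root.
      if (line.slice("disallow:".length).trim() === "/") disallowsAll = true;
    } else if (lower.startsWith("sitemap:")) {
      hasSitemap = true;
    }
  }
  return { disallowsAll, hasSitemap };
}

const result = checkNoindexRobotsTxt("User-agent: *\nDisallow: /\n");
console.log(result.disallowsAll && !result.hasSitemap ? "noindex output OK" : "unexpected robots.txt");
```

For a noindex build, the expected result is disallowsAll set to true and hasSitemap set to false.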
