Hugo SEO: noindex on taxonomies to fix Bing's "too many thin pages" warning

2026-06-29 540 words 3 minutes



Context

Bing Webmaster Tools was raising a “There are too many pages with insufficient content” recommendation (severity: moderate) — 15 pages flagged, all of the same type:

https://www.arleo.eu/en/tags/svg/
https://www.arleo.eu/en/tags/sonarr/
https://www.arleo.eu/en/categories/incidents/
etc.

These are Hugo taxonomy pages — tag and category listing pages. They contain no article content of their own, just a list of links. From Bing’s perspective, that’s thin content.

Chosen strategy

Three options were available:

1. Block via robots.txt → discouraged: prevents crawling, hides internal links
2. Enrich taxonomy page content → too costly for a technical blog
3. noindex,follow → the right answer: tell robots not to index the page, but still follow links to articles

Option 3 is the correct call. The follow directive preserves internal linking: robots keep discovering articles through these pages.

flowchart TD
    A[Hugo page request] --> B{.Params.robots set?}
    B -- yes --> C[Use front matter value]
    B -- no --> D{.Kind = taxonomy or term?}
    D -- yes --> E[noindex,follow]
    D -- no --> F[index, follow]
    C --> G[meta name=robots]
    E --> G
    F --> G

Implementation

Hugo concept: `.Kind`

Hugo classifies every page by its kind:

Kind	Example URL
`home`	`/`
`page`	`/posts/my-article/`
`section`	`/posts/`
`taxonomy`	`/tags/`, `/categories/`
`term`	`/tags/hugo/`, `/categories/homelab/`

The pages to handle are those whose .Kind is "taxonomy" or "term".

Change: `layouts/baseof.html`

The site uses a custom baseof.html that already overrides the LoveIt theme. It’s the natural place to centralise robots logic.

Before:

<meta name="robots" content="index, follow" />

After:

{{- if .Params.robots -}}
<meta name="robots" content="{{ .Params.robots }}" />
{{- else if or (eq .Kind "taxonomy") (eq .Kind "term") -}}
<meta name="robots" content="noindex,follow" />
{{- else -}}
<meta name="robots" content="index, follow" />
{{- end -}}

A three-level cascade:

1. Front matter robots key: absolute priority, for future one-off exceptions
2. Taxonomies: automatic noindex,follow
3. Everything else: index, follow

One file changed, no duplication, no taxonomy layout overrides needed.
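
Since the front matter key wins over everything else, it can help to know which pages actually set it. A minimal sketch, assuming content lives under content/ and that the key is written at the start of a line as robots: (YAML) or robots = (TOML); the function name list_robots_overrides is hypothetical, and a body line that happens to start with "robots:" would also match:

```shell
#!/bin/sh
# list_robots_overrides: print the content files that appear to set a
# `robots` key, i.e. the pages that take the first branch of the cascade.
# $1 is the content directory (assumed default: content).
list_robots_overrides() {
    # -r recurse, -l print file names only, -E extended regex.
    # Matches "robots:" (YAML) and "robots =" (TOML) at line start.
    grep -rlE '^robots[[:space:]]*[:=]' "${1:-content}" | sort
}
```

Run from the site root as list_robots_overrides content. An empty result means every page falls through to the taxonomy check or the default.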

Removing taxonomies from the sitemap

Having noindex pages in a sitemap is inconsistent — you’re telling robots not to index a page while explicitly pointing them to it. Best practice is to remove them.

The LoveIt sitemap includes all pages. A minimal override in layouts/sitemap.xml:

{{- $excludedKinds := slice "taxonomy" "term" -}}
{{- range (where .Data.Pages "Section" "!=" "gallery") -}}
    {{- if not (in $excludedKinds .Kind) -}}
    <url>
        <loc>{{- .Permalink -}}</loc>
        ...
    </url>
    {{- end -}}
{{- end -}}

Result: the EN sitemap drops from 117 to 40 URLs (77 taxonomy entries removed).
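
To reproduce that kind of count before and after the change, a small helper can read a sitemap on stdin and report the total number of URLs next to the number that look like taxonomy pages. This is a sketch, not the method used to obtain the figures above: the name sitemap_counts is hypothetical, and it assumes taxonomy URLs contain /tags/ or /categories/ as in the examples from this site.

```shell
#!/bin/sh
# sitemap_counts: read sitemap XML on stdin and print "<total> <taxonomy>",
# where <taxonomy> counts <loc> entries containing /tags/ or /categories/.
# The body runs in a subshell so its variables do not leak to the caller.
sitemap_counts() (
    # One <loc>...</loc> per output line; tolerate an empty sitemap.
    locs=$(grep -oE '<loc>[^<]*</loc>' || true)
    # grep -c prints 0 but exits non-zero on no match, hence "|| true".
    total=$(printf '%s\n' "$locs" | grep -c '<loc>' || true)
    taxo=$(printf '%s\n' "$locs" | grep -cE '/(tags|categories)/' || true)
    echo "$total $taxo"
)
```

For example: curl -s https://www.arleo.eu/en/sitemap.xml | sitemap_counts. After the override, the second number should be 0.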

Verification

After rebuild:

# Taxonomy pages → noindex,follow
curl -s https://www.arleo.eu/tags/ | grep -o 'name=robots[^>]*>'
# → name=robots content="noindex,follow">

curl -s https://www.arleo.eu/en/tags/sonarr/ | grep -o 'name=robots[^>]*>'
# → name=robots content="noindex,follow">

# Articles → index, follow
curl -s https://www.arleo.eu/en/posts/debug-seo-404-broken-links/ | grep -o 'name=robots[^>]*>'
# → name=robots content="index, follow">

# Clean sitemap
curl -s https://www.arleo.eu/en/sitemap.xml | grep -o '<loc>[^<]*</loc>' | grep 'tags\|categories' | wc -l
# → 0
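
The grep patterns above match name=robots without quotes, which is what the minified HTML of this site produces; on an unminified build the attribute is written name="robots" and they would find nothing. A slightly more tolerant sketch follows, with a hypothetical helper name robots_of. It assumes the name attribute comes directly before content and that the content value is double-quoted, as in the template above:

```shell
#!/bin/sh
# robots_of: read HTML on stdin and print the content value of the first
# robots meta tag. Accepts both name=robots and name="robots".
robots_of() {
    grep -oE '<meta name="?robots"? content="[^"]*"' \
        | sed -E 's/.*content="([^"]*)"/\1/' \
        | head -n 1
}
```

Usage: curl -s https://www.arleo.eu/en/tags/sonarr/ | robots_of prints only the directive string, which is easier to compare in a loop over several URLs.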

Modified files

File	Role
`layouts/baseof.html`	Conditional robots logic
`layouts/sitemap.xml`	Sitemap override — excludes taxonomy/term

robots.txt was not touched. Taxonomy URLs remain accessible and crawlable — only their indexing is disabled.

On the Bing side

The fix is immediate on the technical side, but Bing will take a few days to re-crawl the pages and update the recommendation. To speed things up, in Bing Webmaster Tools → Recommendations, manually validate the fix or submit a sitemap recrawl request.