CrowdSec Log Pipeline with Vector: Filtering Noise and Capturing Real Bans

⚡ In short
The initial Vector pipeline was flooding BetterStack with ~500 events/24h, of which 434 were CAPI pulls with no local monitoring value. This work reconfigures the Vector filter to keep only high-value bans (cscli) and fixes a major blind spot: actual nginx-lua bouncer bans were not appearing anywhere in BetterStack.
🧠 Why
This homelab’s security stack relies on three components working together:
- nginx with the CrowdSec lua bouncer (lua-resty-crowdsec) for real-time request blocking
- CrowdSec for threat detection and ban decision management
- Vector centralizing logs to BetterStack for monitoring
After setting up the initial pipeline, two problems quickly became apparent. First, the signal was drowned in noise: out of 500 events/24h, 434 came from the hourly community CAPI pull and 66 from third-party lists; neither represents a threat detected on this infrastructure. Second, actual lua bouncer bans (real-time blocks in nginx) were not appearing anywhere in BetterStack, creating a blind spot on real security activity.
🔧 What was done
Problem 1: CAPI and third-party list noise
Over a 24-hour period, the distribution of CrowdSec events in BetterStack was:
| Origin | Count | Nature |
|---|---|---|
| CAPI | 434 | Community pull every hour |
| lists | 66 | Third-party lists (firehol_greensnow, otx-webscanners…) |
| cscli | 0 | Local manual bans (never seen) |
CAPI and lists events arrive in a single burst once an hour (at :09 past each hour), which matches the community list sync cycle. Solution: modify the Vector filter to keep only origin == "cscli":
# In vector.yaml
crowdsec_decisions_filter:
  type: "filter"
  inputs:
    - "crowdsec_decisions_flatten"
  condition: |
    exists(.cs) && .cs.origin == "cscli"

Problem 2: Effective lua bouncer bans invisible in BetterStack
The nginx lua bouncer blocks IPs in real time, but these actual blocks were not appearing anywhere in BetterStack. Yet they are logged by nginx in /var/log/nginx/error.log:
2026/04/15 03:36:37 [alert] 67913#67913: *3949 [lua] crowdsec.lua:783: Allow(): \
    [Crowdsec] denied '43.130.106.18' with 'ban' (by bouncer), \
    client: 43.130.106.18, server: www.arleo.eu, \
    request: "GET / HTTP/2.0", host: "www.arleo.eu"

The issue came from the existing nginx filter in Vector, which silently dropped any message containing the word crowdsec. Since error.log is already included in the nginx source, there is no need to create a new source; the solution is to insert a transform before the filter to tag and reroute these events.
New pipeline architecture
┌──────────────────────────┐
│     nginx error.log      │
└───────────┬──────────────┘
            │
            ▼
┌──────────────────────────────────┐
│ better_stack_nginx_parser        │
│ (parses all nginx logs)          │
└───────────┬──────────────────────┘
            │
            ▼
┌──────────────────────────────────┐
│ crowdsec_nginx_ban_extractor     │
│ (detects [Crowdsec] denied)      │
│ tags cs_nginx_ban = true/false   │
└───────┬──────────────────┬───────┘
        │                  │
cs_nginx_ban==true  cs_nginx_ban==false
        │                  │
        ▼                  ▼
 crowdsec_nginx_     better_stack_
   ban_filter        nginx_filter
        │                  │
        ▼                  ▼
    CrowdSec             nginx
BetterStack sink   BetterStack sink

The extractor transform
crowdsec_nginx_ban_extractor:
  type: "remap"
  inputs:
    - "better_stack_nginx_parser_XXXXX"
  source: |
    msg = string(.message) ?? ""
    if contains(msg, "[Crowdsec] denied") && contains(msg, "with 'ban'") {
      m = parse_regex(msg, r'\[Crowdsec\] denied \'(?P<banned_ip>[^\']+)\' with \'ban\'') ?? {}
      ip = string(m.banned_ip) ?? "?"
      req = string(.nginx.request) ?? "-"
      host = string(.nginx.host) ?? string(.nginx.server) ?? "-"
      .cs_nginx_ban = true
      .cs_banned_ip = ip
      .cs_origin = "nginx-bouncer"
      .platform = "CrowdSec"
      .message = "Ban " + ip + " | " + req + " | " + host
      del(.file)
      del(.level)
      del(.nginx.cid)
      del(.nginx.pid)
      del(.nginx.tid)
    } else {
      .cs_nginx_ban = false
    }

The .message field is built to be immediately readable in the BetterStack tail:
Ban 43.130.106.18 | GET / HTTP/2.0 | www.arleo.eu

Modifying the existing nginx filter
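Before rewiring the filter, the extractor can be checked in isolation with Vector's built-in unit tests (run with the vector test subcommand against the config file). The sketch below assumes the tests block sits in the same vector.yaml. The test names and the non-ban sample message are illustrative, and the ban message is shortened from the error.log sample shown earlier. No .nginx fields are injected, so the request and host parts fall back to "-".

```yaml
# Sketch only: test names and sample messages are illustrative, not from the real config.
tests:
  - name: "lua bouncer ban is tagged and summarized"
    inputs:
      - insert_at: "crowdsec_nginx_ban_extractor"
        type: "log"
        log_fields:
          message: "[lua] crowdsec.lua:783: Allow(): [Crowdsec] denied '43.130.106.18' with 'ban' (by bouncer)"
    outputs:
      - extract_from: "crowdsec_nginx_ban_extractor"
        conditions:
          - type: "vrl"
            source: |
              assert_eq!(.cs_nginx_ban, true)
              assert_eq!(.cs_banned_ip, "43.130.106.18")
              assert_eq!(.cs_origin, "nginx-bouncer")
              # .nginx.request and .nginx.host are absent here, hence the "-" fallbacks
              assert_eq!(.message, "Ban 43.130.106.18 | - | -")

  - name: "ordinary nginx line is tagged false"
    inputs:
      - insert_at: "crowdsec_nginx_ban_extractor"
        type: "log"
        log_fields:
          message: "upstream timed out while reading response header"
    outputs:
      - extract_from: "crowdsec_nginx_ban_extractor"
        conditions:
          - type: "vrl"
            source: |
              assert_eq!(.cs_nginx_ban, false)
```

A failing assertion here points at the regex or the tagging logic rather than at the downstream filters, which narrows the search when bans still do not show up in BetterStack.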
Add the exclusion condition for already-rerouted bans:
better_stack_nginx_filter_XXXXX:
  type: "filter"
  inputs:
    - "crowdsec_nginx_ban_extractor"  # ← now points to the new transform
  condition: |
    !contains(string(.message) ?? "", "crowdsec") &&
    !contains(string(.message) ?? "", "Initialisation done") &&
    !contains(string(.message) ?? "", "APPSEC is enabled") &&
    !((.nginx.status == 499) && contains(string(.nginx.path) ?? "", "empty.php")) &&
    !contains(string(.message) ?? "", "lua tcp socket read timed out") &&
    !(.cs_nginx_ban == true)  # ← exclude rerouted bans

Unified CrowdSec sink
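The architecture diagram routes tagged events through crowdsec_nginx_ban_filter, but its definition is not listed in this post. A minimal sketch, assuming it is a plain filter transform that reads from the extractor and keeps only events tagged cs_nginx_ban == true (the real transform may carry additional conditions):

```yaml
# Sketch only: this transform is not shown in the post, so its body is an assumption.
crowdsec_nginx_ban_filter:
  type: "filter"
  inputs:
    - "crowdsec_nginx_ban_extractor"
  condition: |
    .cs_nginx_ban == true
```

Because the nginx filter excludes exactly these events, each nginx log line reaches at most one of the two sinks.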
Both flows (cscli bans and lua bans) converge into the same sink:
crowdsec_betterstack_sink:
  type: "http"
  inputs:
    - "crowdsec_decisions_filter"  # cscli bans
    - "crowdsec_nginx_ban_filter"  # nginx lua bans

Bonus: A Cloudflare Token Blocked by Its Own Server
Alongside the Vector work, the crowdsec-cf-sync.py script had been silently failing for several days with HTTP 401 Authentication error. The cause: the Cloudflare token had an IP restriction (not_in) that explicitly included the server’s own WAN IP. Every API request sent from the NUC was rejected by Cloudflare.
Fix: remove the server’s IP from the token’s not_in list via the Cloudflare API. The script immediately resumed normal synchronization (13 active bans re-synchronized).
Conclusion
The event volume went from ~500 events/24h, mostly irrelevant, down to only the events that deserve attention: cscli bans (manual local decisions) and nginx-bouncer bans (real-time effective blocks with IP, request and vhost). The pipeline now provides an accurate view of the server’s real security activity.
To go further:
- 💡 Add an nginx-bouncer ban counter in a BetterStack dashboard to visualize blocking spikes in real time
- 💡 Extend the cscli filter to also include bans from custom local scenarios (origin == "crowdsec" with IP scope)
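The second idea could take the form of the condition below. This is a sketch, assuming the flattened event exposes the decision scope as .cs.scope and that IP-scoped decisions carry the value "Ip". Neither the field name nor its presence after crowdsec_decisions_flatten is shown in this post, so it is worth inspecting a real event before relying on it.

```yaml
# Sketch only: .cs.scope and the "Ip" value are assumptions about the flattened event.
crowdsec_decisions_filter:
  type: "filter"
  inputs:
    - "crowdsec_decisions_flatten"
  condition: |
    exists(.cs) &&
    (.cs.origin == "cscli" || (.cs.origin == "crowdsec" && .cs.scope == "Ip"))
```

CAPI and lists events are still dropped, since their origin matches neither branch.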