X-Forwarded-For sanitization in Caddy
When reverse proxies like Cloudflare proxy a request, they communicate the original request’s source using the X-Forwarded-For header. This can be dangerous: they will augment a value from the original request producing a value like Spoofed, Cloudflare-Given.
When adding another reverse-proxy like Caddy into the mix you send upstream a value like Spoofed, Cloudflare-Given, Caddy-Given. This is messy; I don’t want to teach upstream what the history of the request is and which proxies are valid.
The Caddy docs on reverse_proxy indicate a way to fix this is by creating transformation rules within Cloudflare to strip the Spoofed value in certain situations. This requires too much work and remembering to it when configuring a new site.
A tactical way to accomplish this is to use the semantics of the header value itself. Our downstream, trusted proxies are either setting the value to IP or appending , IP so we can disregard the earlier parts of the header value entirely. In a Caddyfile, this is done using the request_header directive:
# Remove older X-Forwarded-For values since we only want to
# trust the last one iif we're going to trust at all.
# Regex information: https://regex101.com/r/KZknzS
request_header X-Forwarded-For "^([^,]*,)*\s*([^,]+)$" "$2"
# Later on in the server configuration somewhere
reverse_proxy upstream:1234 {
  # Which proxies allow X-Forwarded-For to go upstream
  trusted_proxies 198.51.100.0/24
}
When a direct request to Caddy comes in with a fake X-Forwarded-For header, it’s not passed upstream because it’s not a trusted proxy. This is the normal behavior.
When a request comes in via a trusted proxy with an X-Forwarded-For value like Spoofed, Cloudflare-Given we pass to upstream Cloudflare-Given, Caddy-Given without the Spoofed part.
