Commit graph

2 commits

Author SHA1 Message Date
ac7553f940 fix(client): plug confused-deputy bind in client mode
A focused security review of phases 1-4 surfaced one MEDIUM finding
(confidence 9/10): in client mode (--upstream set) the cache layer
forwards the configured bearer to upstream on every incoming request
without authenticating the local caller, AND --addr defaulted to
:8443 (all interfaces). Together those mean a CLI user running
`zddc-server --upstream https://master --bearer-file ~/token` on a
laptop on hotel/cafe Wi-Fi exposes an open-proxy confused-deputy:
any attacker on the same L2 connects to https://<laptop-ip>:8443,
accepts the self-signed cert, issues GETs (or PUTs/DELETEs that
queue in the outbox), and the cache laundries each request through
upstream with the engineer's bearer. The full cached subtree leaks.

Two layers of defense in config.Load:

1. Loopback default in client mode. When cfg.Upstream is set and
   neither --addr nor ZDDC_ADDR was passed explicitly, --addr
   downgrades to "127.0.0.1:8443" (vs ":8443" in master mode). CLI
   users on a laptop get safe-by-default. Operators who want a
   non-loopback bind opt in explicitly.

2. Refuse non-loopback bind + bearer-file without acknowledgement.
   When cfg.Upstream is set, BearerFile is non-empty, the chosen
   addr is non-loopback, AND --insecure-direct is not set, the load
   fails with an error that names the bind, the threat (open-proxy
   confused-deputy laundering bearer credentials), and the
   acknowledgement flag. The helm zddc-server-cache/ chart already
   sets ZDDC_INSECURE_DIRECT=1 and relies on Kubernetes-namespaced
   pod networking for the gating, so the chart path is unaffected.
   The guard is bearer-file-conditional because proxy mode without a
   bearer doesn't have a credential to launder, and refusing it
   would needlessly block proxy-without-auth deployments.

Tests in internal/config/config_test.go lock down all four cases:
- --upstream with no explicit --addr → 127.0.0.1:8443
- --upstream + non-loopback --addr + --bearer-file (no IDirect) → refuse
- --upstream + non-loopback --addr + --bearer-file + --insecure-direct → ok
- --upstream + non-loopback --addr + NO bearer → ok (no credential to leak)

Doc updates: zddc/README.md client-mode "Flags" section gets a
WARNING block describing the loopback default + insecure-direct
escape hatch. AGENTS.md ZDDC_UPSTREAM row mentions the addr
downgrade. ARCHITECTURE.md gains a "Confused-deputy guard at
startup" subsection under "Master + proxy/cache/mirror" with the
two-layer defense rationale. helm/zddc-server-cache/values.yaml.example
adds an inline note next to addr: ":8080" explaining why the chart
sets ZDDC_INSECURE_DIRECT=1 and what the consequence is of removing
either side of the gating.

Master mode is unaffected — the client-mode validation block is
gated by `if cfg.Upstream != ""`. All existing tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 10:03:51 -05:00
55852a9efb helm: add zddc-server-cache example chart + ZDDC_NO_AUTH on prod/dev
New chart helm/zddc-server-cache/ deploys zddc-server in client mode
against an upstream master. Mirrors the prod chart's source-build-via-
init-container pattern but with:

- ZDDC_UPSTREAM, ZDDC_MODE, ZDDC_BEARER_FILE, ZDDC_NO_AUTH,
  ZDDC_SKIP_TLS_VERIFY, ZDDC_MIRROR_SUBTREE, ZDDC_MIRROR_MIN_INTERVAL
  wired from values.yaml. Mirror-only env vars conditionally rendered
  (only when mode=mirror) to keep the rendered manifest minimal.
- Bearer token mounted from a separately-created Kubernetes Secret
  (defaultMode 0400) at /etc/zddc/bearer/token. values.yaml.example
  documents the secret-creation flow but contains no token. Secret
  reference can be set to "" to disable bearer auth (only valid for
  upstreams running --no-auth).
- Recreate strategy + replicaCount: 1 (multiple replicas would race
  the cache directory and double the upstream walker traffic).
- TCP-socket probes instead of HTTP — HTTP probes against / would
  fail when both upstream is unreachable AND the cache is empty
  (the cache layer returns 503 + offline header in that state),
  causing crashloops. TCP verifies process liveness without depending
  on upstream reachability or cache contents.
- Mounts a separate cache PVC (operator-provided, like the master's
  data PVC). Sized to the working set you expect to mirror; can be
  much smaller than the master's data volume.

Existing prod and dev charts gain optional ZDDC_NO_AUTH wired from
zddc.env.noAuth (default false → no change to existing rendered
manifests). Useful for trusted-LAN or genuinely-public master
deployments.

Updated docs: helm/README.md gains the cache row in the chart table,
the cache-install quickstart with the secret-creation flow, and the
cache-specific structural notes (Recreate / TCP probes / single-
instance). CLAUDE.md and ARCHITECTURE.md updated to reflect three
charts instead of two.

Verified with helm template rendering: ZDDC_NO_AUTH only renders
when noAuth: true; ZDDC_MIRROR_SUBTREE / ZDDC_MIRROR_MIN_INTERVAL
only render when mode: mirror; bearer volume + ZDDC_BEARER_FILE
only render when bearer.secretName is non-empty.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 08:33:01 -05:00