From ca00904f1e09ccaa35fb76afab17a44d898d48fd Mon Sep 17 00:00:00 2001
From: ZDDC
Date: Fri, 8 May 2026 07:57:14 -0500
Subject: [PATCH] =?UTF-8?q?feat(client):=20cache=20mode=20=E2=80=94=20on-d?=
 =?UTF-8?q?emand=20fetch=20+=20persist=20+=20offline=20fallback?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

zddc-server can now run as a downstream client of another zddc-server.
Set --upstream and the master-side machinery (archive index, apps
server, watcher, OPA decider, ACL middleware, token store) is bypassed
entirely; cmd/zddc-server/main.go short-circuits to runClient(cfg),
which uses zddc/internal/cache/Cache as the entire request handler.

Three modes via --mode:
- proxy: forward upstream live, no disk persistence
- cache (default): persist responses on access; subsequent hits serve
  from disk + background If-Modified-Since revalidate
- mirror: accepted but currently behaves like cache; the
  access-triggered walker lands in phase 3

Cache directory layout is intentionally a normal ZDDC root: a file
fetched from /foo/bar.txt on the upstream is stored at /foo/bar.txt
under the cache root, with no sidecar metadata. The local file's mtime
is set to the upstream's Last-Modified header so revalidation reflects
the master's notion of file age, not local fetch time. Running
zddc-server with --root pointed at the cache directory and without
--upstream serves the cached files as a plain master, which is useful
for portable offline snapshots. A small .zddc-upstream marker is
written once on first persist for provenance.
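The persist step described above (write to a tmp file, stamp it with the
upstream's Last-Modified, rename it into place) could look roughly like
the sketch below. It is only an illustration under stated assumptions:
persistWithUpstreamMtime and demo are hypothetical names, not the
functions in zddc/internal/cache, and path validation is assumed to have
happened earlier.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"path/filepath"
	"strings"
	"time"
)

// persistWithUpstreamMtime is a hypothetical helper for illustration; it is
// not the function used in zddc/internal/cache. It assumes relPath has
// already been validated (no ".." segments) and returns the final path.
func persistWithUpstreamMtime(root, relPath string, body io.Reader, lastModified string) (string, error) {
	dst := filepath.Join(root, filepath.FromSlash(relPath))
	if err := os.MkdirAll(filepath.Dir(dst), 0o755); err != nil {
		return "", err
	}
	// Create the temp file next to the destination so the rename stays on
	// one filesystem and is therefore atomic.
	tmp, err := os.CreateTemp(filepath.Dir(dst), ".tmp-*")
	if err != nil {
		return "", err
	}
	defer os.Remove(tmp.Name()) // fails harmlessly after a successful rename
	if _, err := io.Copy(tmp, body); err != nil {
		tmp.Close()
		return "", err
	}
	if err := tmp.Close(); err != nil {
		return "", err
	}
	// Stamp the file with the upstream's Last-Modified. If the header is
	// missing or unparseable, the local write time is kept.
	if t, err := http.ParseTime(lastModified); err == nil {
		if err := os.Chtimes(tmp.Name(), t, t); err != nil {
			return "", err
		}
	}
	if err := os.Rename(tmp.Name(), dst); err != nil {
		return "", err
	}
	return dst, nil
}

// demo persists one file into a throwaway directory and reports its mtime.
func demo() (time.Time, error) {
	root, err := os.MkdirTemp("", "zddc-cache-demo-")
	if err != nil {
		return time.Time{}, err
	}
	defer os.RemoveAll(root)
	dst, err := persistWithUpstreamMtime(root, "foo/bar.txt",
		strings.NewReader("hello\n"), "Fri, 08 May 2026 12:57:14 GMT")
	if err != nil {
		return time.Time{}, err
	}
	info, err := os.Stat(dst)
	if err != nil {
		return time.Time{}, err
	}
	return info.ModTime().UTC(), nil
}

func main() {
	mtime, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("cached file mtime:", mtime.Format(http.TimeFormat))
}
```

Because the mtime carries the master's timestamp, a later revalidate can
format it straight into If-Modified-Since without any sidecar metadata.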

Pipeline (GET/HEAD only; writes deferred):
- Hit → http.ServeContent serves directly (range-aware, 304-aware) +
  background revalidate (304 no-op, 200 overwrite, 403/404 purge)
- Miss → forward to upstream with the configured bearer; tee response
  body to client + tmp-file atomically renamed into the cache
- Network error + cached → serve stale + X-ZDDC-Cache: offline
- Network error + no cache → 503 + X-ZDDC-Cache: offline
- Directories always proxy live (no listing cache yet; phase 3)
- Cache-Control: no-store / private and non-200 responses bypass cache

Range requests work end-to-end (Range/If-Range headers forwarded on
miss; http.ServeContent handles them natively on hit). Hop-by-hop
headers per RFC 7230 §6.1 are dropped from forwarded responses.

New flags (also as ZDDC_* env vars), all ignored when --upstream is
empty (so master deployments are untouched):
- --upstream
- --mode proxy|cache|mirror (default cache)
- --bearer-file (0600 file with the master-issued token)
- --skip-tls-verify (separate from --no-auth; for self-signed dev)

Validation: --upstream must be http(s)://...; trailing / is trimmed.
Mode validated to one of the three known values. The startup
no-root-.zddc check is skipped in client mode (the cache directory
starts empty by design). The plain-HTTP-on-non-loopback check is also
skipped (the local instance never reads the email header to decide
anything; auth is forwarded to upstream as a Bearer).

Tests: zddc/internal/cache/cache_test.go runs httptest.NewServer as
the upstream and covers miss-then-hit, proxy-mode-no-persist,
directory-never-cached, HEAD-no-body, offline-with-cache,
offline-no-cache → 503, bearer forwarding, query-string preservation,
no-store bypass, path-traversal rejection, error-status forwarding,
revalidate-on-403/404/200/304, range-on-hit, concurrent-same-URL,
cache-path boundary cases. 23 new tests, full suite + go vet clean.
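
As a rough illustration of the hop-by-hop filtering mentioned above, the
sketch below copies only end-to-end headers from an upstream response.
It is not the code in zddc/internal/cache: copyEndToEndHeaders and
sampleFiltered are hypothetical names, and the fixed header list is an
assumption (the set reverse proxies conventionally strip), extended with
whatever the Connection header names, as RFC 7230 §6.1 requires.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// hopByHop is an assumed list: the headers proxies conventionally treat as
// connection-scoped. The real cache layer may use a different set.
var hopByHop = []string{
	"Connection", "Keep-Alive", "Proxy-Authenticate", "Proxy-Authorization",
	"TE", "Trailer", "Transfer-Encoding", "Upgrade",
}

// copyEndToEndHeaders copies src into dst, skipping hop-by-hop headers and
// any header that src's Connection header lists by name.
func copyEndToEndHeaders(dst, src http.Header) {
	drop := make(map[string]bool)
	for _, name := range hopByHop {
		drop[http.CanonicalHeaderKey(name)] = true
	}
	// RFC 7230 §6.1: every field named in Connection is hop-by-hop too.
	for _, v := range src.Values("Connection") {
		for _, name := range strings.Split(v, ",") {
			if name = strings.TrimSpace(name); name != "" {
				drop[http.CanonicalHeaderKey(name)] = true
			}
		}
	}
	for name, values := range src {
		if drop[http.CanonicalHeaderKey(name)] {
			continue
		}
		for _, v := range values {
			dst.Add(name, v)
		}
	}
}

// sampleFiltered builds a fake upstream header set and returns what would
// be forwarded to the client.
func sampleFiltered() http.Header {
	src := http.Header{}
	src.Set("Content-Type", "text/plain")
	src.Set("Connection", "close, X-Internal-Hop")
	src.Set("X-Internal-Hop", "1")
	src.Set("Transfer-Encoding", "chunked")
	dst := http.Header{}
	copyEndToEndHeaders(dst, src)
	return dst
}

func main() {
	for name, values := range sampleFiltered() {
		fmt.Println(name, values)
	}
}
```

In the sample, only Content-Type survives; the other three fields are
either in the fixed list or named by Connection.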

Live two-instance smoke verified: master at 127.0.0.1:18443, client at
:18444 with --mode cache, miss→hit→hit transitions work, file
materialises under cache root with parent dirs created, marker file
written once, range-on-hit returns 206, master sees background 304s on
every hit, killing master leaves cached files serving from disk and
never-cached files returning 503 + offline header.

Doc updates: zddc/README.md gains a "Client mode" section with the
modes table, flag reference, pipeline summary, two-instance recipe,
and explicit list of phase-2 limitations; AGENTS.md adds the four new
env vars to the reference table and a "Client mode" subsection with
smoke-test recipe and a pointer to the cache package; ARCHITECTURE.md
adds "Master + proxy/cache/mirror" before "Bearer token issuance,"
covering the topology, the persist/warm switches, the
cache-IS-a-ZDDC-root invariant, the request pipeline, and the
v1-out-of-scope multi-tenancy note; CLAUDE.md's zddc/ entry expanded
to mention both deployment shapes so future agents pick it up by
default.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 AGENTS.md                         |  43 +++
 ARCHITECTURE.md                   |  47 +++
 CLAUDE.md                         |   2 +-
 zddc/README.md                    |  77 +++++
 zddc/cmd/zddc-server/main.go      |  98 ++++++
 zddc/internal/cache/cache.go      | 479 ++++++++++++++++++++++++++
 zddc/internal/cache/cache_test.go | 546 ++++++++++++++++++++++++++++++
 zddc/internal/config/config.go    |  61 +++-
 8 files changed, 1350 insertions(+), 3 deletions(-)
 create mode 100644 zddc/internal/cache/cache.go
 create mode 100644 zddc/internal/cache/cache_test.go

diff --git a/AGENTS.md b/AGENTS.md
index b2b6aa0..c206755 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -447,12 +447,55 @@ ZDDC_ROOT=/path/to/your/archive ZDDC_TLS_CERT=none ZDDC_ADDR=:8080 \
 | `ZDDC_CORS_ORIGIN` | *(empty)* | Comma-separated CORS allowlist; empty (default) disables CORS — appropriate for embedded-tools deployments where tools and data are same-origin. Set explicitly only for self-hosted tools at a different host (e.g.
`https://tools.acme.com`) or the CDN-bootstrap pattern (`https://zddc.varasys.io`). | | `ZDDC_INSECURE` | *(empty)* | Must be `1` to allow startup with no `/.zddc`. Without it, the server refuses to start because no `.zddc` files anywhere → public-by-default. Set only for deliberately-public archives. | | `ZDDC_NO_AUTH` | *(empty)* | `1` skips ACL enforcement entirely on this instance. On a master: anyone reads everything (dev / trusted-LAN read-only deployments). On a downstream proxy/cache/mirror: trust upstream's filtering, don't re-evaluate ACLs locally. **Distinct from `ZDDC_INSECURE`** (which gates a startup safety check). | +| `ZDDC_UPSTREAM` | *(empty)* | Master URL (`https://master.example.com`). When set, the binary runs as a **client** (downstream proxy/cache/mirror) instead of a master — the master-side machinery (archive index, apps server, watcher, OPA, ACL middleware, token store) is replaced by the cache layer in `zddc/internal/cache/`. `--root` becomes the cache directory. | +| `ZDDC_MODE` | `cache` | Client mode: `proxy` (forward live, no persistence), `cache` (default; persist responses on access), `mirror` (phase 3 — currently behaves like `cache`). Ignored when `ZDDC_UPSTREAM` is empty. | +| `ZDDC_BEARER_FILE` | *(empty)* | Path to a 0600 file containing the master-issued token (see `/.tokens` on the master). Forwarded as `Authorization: Bearer …` to upstream on every request. Ignored when `ZDDC_UPSTREAM` is empty. | +| `ZDDC_SKIP_TLS_VERIFY` | *(empty)* | `1` accepts self-signed / untrusted upstream certs. Distinct from `ZDDC_NO_AUTH`. Dev / internal-CA scenarios only. | | `ZDDC_OPA_URL` | `internal` | Policy decider endpoint. `internal` (default) = in-process Go evaluator (same `.zddc` cascade we always had). `http(s)://...` or `unix:///...` = external OPA — every access decision becomes a `POST /v1/data/zddc/access/allow` to the configured endpoint. 
Federal customers with their own audited Rego use this; commercial deployments leave it `internal`. | | `ZDDC_OPA_FAIL_OPEN` | *(empty)* | External OPA only. `1` = allow on transport error; default = fail closed (deny). | | `ZDDC_OPA_CACHE_TTL` | `1s` | External OPA only. Per-decision cache TTL — amortizes round-trips on bursty patterns (e.g. `.archive` listings hit the same `(email, dir)` tuple many times). `0` disables. Format is Go `time.ParseDuration`. | | `ZDDC_APPS_PUBKEY` | *(empty)* | Path to PEM Ed25519 pubkey for verifying signatures on URL-fetched `apps:` artifacts. Empty = URL apps refused. Download from `zddc.varasys.io/pubkey.pem` (canonical channels) or supply your own. No baked-in default — same posture as TLS certs. Alternative inline form: `apps_pubkey:` in root `.zddc` (root-only, env/flag wins). | | `ZDDC_ACCESS_LOG` | `/.zddc.d/logs/access-.log` | JSON-line audit log (lumberjack-rotated, 100 MB / 10 backups / 90 days, gzipped). Server auto-mkdirs the parent. Set explicitly to empty (`--access-log=`) to disable. Per-host filename + `host` field in every record so multi-replica deployments writing to the same `.zddc.d/` dir disambiguate cleanly. | +### Client mode (proxy / cache / mirror) + +When `--upstream ` is set, the binary runs as a **downstream client** of another zddc-server instead of a master. `cmd/zddc-server/main.go` short-circuits to `runClient(cfg)`, which builds a `*cache.Cache` from `zddc/internal/cache/` and uses it as the entire request handler — no archive index, no apps server, no watcher, no OPA decider, no ACL middleware, no token store. + +Three modes via `--mode ` (default `cache`). Cache directory layout is intentionally a normal ZDDC root: `/foo/bar.txt` → `/foo/bar.txt`. Unset `--upstream` and the same root serves as a plain master, useful for portable offline snapshots. 
+ +Pipeline (GET/HEAD only in phase 2): +- Cache hit → serve immediately + background `If-Modified-Since` revalidate (304 no-op, 200 overwrite, 403/404 purge). +- Cache miss → forward to upstream; stream response simultaneously to client and a tmp-file atomically renamed into the cache. +- Network error + cached version → serve stale + `X-ZDDC-Cache: offline`. +- Network error + no cache → 503 + `X-ZDDC-Cache: offline`. +- Directories (`/.../`) always proxy live; no listing cache yet (phase 3 / mirror mode). +- `Cache-Control: no-store` / `private` responses pass through but are not persisted. + +Two-instance smoke test recipe: + +```sh +# Master. +mkdir -p /tmp/m && echo 'admins: [you@example.com]' > /tmp/m/.zddc +echo "hello" > /tmp/m/hello.txt +zddc-server --root /tmp/m --addr 127.0.0.1:18443 --tls-cert=none --no-auth & + +# Client (cache mode). +mkdir -p /tmp/c +zddc-server --root /tmp/c --addr 127.0.0.1:18444 --tls-cert=none \ + --upstream http://127.0.0.1:18443 --mode cache --no-auth & + +curl -sI http://127.0.0.1:18444/hello.txt | grep -i x-zddc-cache # → miss +curl -sI http://127.0.0.1:18444/hello.txt | grep -i x-zddc-cache # → hit +ls /tmp/c # → hello.txt + .zddc-upstream marker +kill %1; sleep 1 +curl -sI http://127.0.0.1:18444/hello.txt | grep -i x-zddc-cache # → hit (still served from disk) +curl -si http://127.0.0.1:18444/never.txt | head -1 # → 503 +``` + +`X-ZDDC-Cache` response header values: `miss`, `hit`, `proxy` (no-persist or directory), `offline` (network unreachable). Useful for browser-side freshness UI. + +Implementation: `zddc/internal/cache/cache.go` (a single file). Tests in `zddc/internal/cache/cache_test.go` use `httptest.NewServer` as a fake upstream and cover hit/miss/offline/range/bearer-forwarding/no-store paths. + ### Bearer tokens (CLI auth) zddc-server self-issues bearer tokens for CLI / non-browser callers. No external IDP, no JWKS rotation. 
Source of truth: `/.zddc.d/tokens/` — a YAML file per token with `email`, `created`, `expires`, `description`. Filename is the **hash** of the token; the plaintext is never persisted. diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md index 3072f7f..130276b 100644 --- a/ARCHITECTURE.md +++ b/ARCHITECTURE.md @@ -468,6 +468,53 @@ none of them is load-bearing alone. | Audit log | Reconstruct who did what after the fact | JSON-line tee per request to `/.zddc.d/logs/access-.log`; writes also emit `file_write` op records | | File API | Authenticated CRUD over the served tree | `zddc/internal/handler/fileapi.go` — PUT/DELETE/POST routed through the same ACL chain as GET, with per-method verbs (`r`/`w`/`c`/`d`/`a`). Mkdir under `Incoming`/`Working`/`Staging` writes a creator-owned `.zddc` automatically | +### Master + proxy / cache / mirror + +The same `zddc-server` binary runs in two distinct topologies: + +- **Master mode** (default): the binary owns a file tree under `--root`, applies `.zddc` ACL cascades to incoming requests, serves files / virtual app HTML / archive listings / form submissions / table views. The "normal" zddc-server. All of `cmd/zddc-server/main.go` lives here. +- **Client mode** (`--upstream ` set): the binary becomes a downstream proxy/cache/mirror against another zddc-server. The master-side machinery (archive index, apps server, watcher, OPA decider, ACL middleware, token store) is **bypassed entirely**. `zddc/internal/cache/` is the entire request handler. + +Three sub-modes within client mode, controlled by `--mode `: + +| Mode | Persists responses? | Subtree warmer? 
| Use case | +|---|---|---|---| +| `proxy` | no | no | thin pass-through; nothing on local disk | +| `cache` (default) | yes | no | field engineer — what you've viewed is available offline | +| `mirror` | yes | yes (planned, phase 3) | vendor mirrors of their subtree; admin backups; complete offline working set | + +Internally the modes collapse to two switches on a single request-handling pipeline (`persist`, `warm`). Proxy is cache without disk writes; mirror is cache plus an access-triggered walker. Implementation factor: `cache.New` reads `cfg.Mode` once and sets `c.persist = mode != "proxy"`; the warmer is the only path that doesn't yet exist (phase 3). + +**Mirror scope falls out of auth.** Whatever the client's bearer can see at upstream is what the cache can populate. Admin's bearer → mirror gets everything (full backup). Vendor's bearer → mirror is exactly that vendor's permitted subtree. No code distinguishes admin-vs-user — master-side ACL filtering does it. + +#### Cache directory IS a normal ZDDC root + +The cache directory layout is intentionally a regular ZDDC root: `/foo/bar.txt` is stored at `/foo/bar.txt`. No sidecar metadata files. The local file's `mtime` is set to the upstream's `Last-Modified` header (so revalidation via `If-Modified-Since` reflects the master's notion of file age, not local fetch time). A small `.zddc-upstream` marker file at the root records the upstream URL and first-cached-at timestamp, written once by `sync.Once` on first persist. + +Two consequences: + +- `zddc-server --root ` (without `--upstream`) serves whatever's been cached as a plain master. Useful for portable offline snapshots — tar the directory, hand it to a colleague, they have a working ZDDC. +- The master/client boundary is one flag: setting/unsetting `--upstream` switches behavior on the same on-disk root. + +#### Pipeline + +Phase 2 ships GET/HEAD only; writes are deferred to a later phase. For each incoming request: + +1. 
**Directory request** (URL ends in `/`): always proxied live. Listing-cache support belongs with the mirror walker (phase 3) — the bare cache directory's contents only reflect visited files, so a local-walk listing would be misleading. +2. **File request, cache hit** (`persist` mode): serve cached bytes via `http.ServeContent` (which handles `Range` natively + 304 conditional GETs). Header `X-ZDDC-Cache: hit`. Background goroutine fires an `If-Modified-Since` revalidate; on `304` no-op, on `200` overwrite the cache atomically, on `403`/`404` purge. +3. **File request, cache miss**: build an upstream request preserving `Range`, `If-Range`, `Accept`, `Accept-Encoding`; attach the configured bearer. Stream the response simultaneously to the client AND to a tmp file in the cache directory; rename atomically only on success. Header `X-ZDDC-Cache: miss`. +4. **Proxy mode** (no persist): same as miss but skip the tmp-file teeing. Header `X-ZDDC-Cache: proxy`. +5. **Network error + cached version exists**: serve the cached bytes with `X-ZDDC-Cache: offline`. (When the cache hits before any network attempt, the header is `hit` — there's no way to distinguish "hit while online" from "hit while offline" without an extra round-trip; the header tells the user "this is from disk," and the user infers freshness from context or a future explicit freshness probe.) +6. **Network error + no cached version**: `503 Service Unavailable` + `X-ZDDC-Cache: offline`. + +Responses with `Cache-Control: no-store` or `Cache-Control: private` pass through but are not persisted. Non-200 responses (including 206 partial content) are forwarded but not persisted — caching a partial body would corrupt subsequent full-body reads. + +Hop-by-hop headers per RFC 7230 §6.1 (`Connection`, `Keep-Alive`, `Transfer-Encoding`, etc.) are dropped from forwarded responses; Go's transport drops most automatically, but the cache layer adds a guard for the cases that slip through. 
+ +#### Multi-tenancy: explicitly out of scope (v1) + +The local instance forwards a single bearer (loaded from `--bearer-file` at startup) regardless of who's calling locally. Single-user-trust on a laptop. For multi-user scenarios, run multiple instances on the same host, or front the local server with your own auth proxy that injects per-user bearers downstream — both options keep the cache layer's design surface minimal. + ### Bearer token issuance zddc-server issues its own bearer tokens for non-browser callers (CLI tools, scripts, downstream proxy/cache/mirror instances). The master is the identity provider; no external IDP, no JWKS rotation. diff --git a/CLAUDE.md b/CLAUDE.md index 394a371..5a1c6b1 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -22,7 +22,7 @@ If something in this CLAUDE.md conflicts with those, those win — and please up This is a **monorepo of independent tools**, not one application: - `archive/`, `transmittal/`, `classifier/`, `mdedit/`, `landing/`, `form/` — six self-contained HTML tools, each compiled to a single inlined HTML file in its own `dist/`. Most output `dist/tool.html`; **`landing/` outputs `dist/index.html`** (it's the project picker served at the root of `zddc-server`). The sixth tool, `form/`, is the schema-driven renderer for the form-data system (any `.form.yaml` file in the tree becomes an editable form at `/.form.html`); see AGENTS.md "Form-data system" and ARCHITECTURE.md "Form Renderer". -- `zddc/` — Go HTTP server (separate sub-project; Go 1.24+). Serves `ZDDC_ROOT/index.html` at `GET /` as the landing page; `Accept: application/json` on `/` returns the ACL-filtered project list. Two auth paths: (a) `Authorization: Bearer ` validated against self-issued tokens stored under `/.zddc.d/tokens/` (filename = SHA256 of token), used by CLI / non-browser callers; (b) `X-Auth-Request-Email` injected by an upstream auth proxy, used for browser sessions. Self-service token UI at `/.tokens` + JSON API at `/.api/tokens`. 
`--no-auth` skips ACL enforcement entirely (distinct from the older `--insecure` which only relaxes the no-root-`.zddc` startup check). Cross-compiled binaries are produced by `./build` and live in `dist/release-output/` (gitignored); `./deploy` rsyncs them to `/srv/zddc/releases/` on the deploy host (Caddy serves them at `https://zddc.varasys.io/releases/`). The `helm/` charts in this repo build from source at deploy time. +- `zddc/` — Go HTTP server (separate sub-project; Go 1.24+). Two deployment shapes from the same binary: (1) **master** — owns a file tree under `ZDDC_ROOT`, applies `.zddc` ACL cascades, serves files / app HTML / archive listings. Two auth paths on master: `Authorization: Bearer ` validated against self-issued tokens at `/.zddc.d/tokens/` for CLI/scripted callers, or `X-Auth-Request-Email` injected by an upstream proxy for browser sessions. Self-service token UI at `/.tokens` + JSON API at `/.api/tokens`. (2) **client** — when `--upstream ` is set, the binary becomes a downstream proxy/cache/mirror (`zddc/internal/cache/`); master-side machinery is bypassed and `--root` becomes the cache directory. Three sub-modes via `--mode proxy|cache|mirror` (mirror is phase 3). Cache layout is a normal ZDDC root, so the cache dir can be served as a plain master if you unset `--upstream`. Marker file `.zddc-upstream` records provenance. `--no-auth` skips ACL enforcement entirely on this instance (distinct from `--insecure` which only relaxes the no-root-`.zddc` startup check); `--skip-tls-verify` is a separate flag for self-signed upstream certs. Cross-compiled binaries are produced by `./build` and live in `dist/release-output/` (gitignored); `./deploy` rsyncs them to `/srv/zddc/releases/` on the deploy host (Caddy serves them at `https://zddc.varasys.io/releases/`). The `helm/` charts in this repo build from source at deploy time. 
- `shared/` — `base.css` plus shared JS modules (`zddc.js`, `hash.js`, `zddc-filter.js`, `theme.js`, `help.js`) included by every tool's build, and `build-lib.sh` (POSIX sh helpers sourced by every tool's `build.sh` AND by the top-level `build` for lockstep release helpers). - **Two-repo + deploy-host model.** Source code lives here (`codeberg.org/VARASYS/ZDDC`). Hand-edited website content lives in a separate repo (`codeberg.org/VARASYS/ZDDC-website`, typically cloned at `~/src/zddc-website/` — just `index.html`, `reference.html`, `css/`, `js/`, `img/`; no releases, no LFS). The live site at `zddc.varasys.io` is served from `/srv/zddc/` on the deploy host: Caddy bind-mounts that path, and it's populated by `./deploy` from this repo's `dist/release-output/` plus `~/src/zddc-website/`. **Releases are NOT in any git history** — they're reproducible from this repo's `-vX.Y.Z` tags by checking out the tag and running `./build release X.Y.Z`. Per-version files (`_v.html`) are immutable; partial-version pins (`_v.html`, `_v.html`) and channel mirrors (`_{stable,beta,alpha}.html`) are symlinks; zddc-server has analogous `zddc-server_v_` per-version binaries plus channel/partial-version symlinks plus `zddc-server_.html` stub pages that fan out the four-platform download in one cell. **Install model:** local use is a download from `/releases/`. Server use is `zddc-server`, which has the current-stable build of all six tools baked in via `//go:embed` (compile-time default). Tools auto-served at folder-name-driven paths: `archive` everywhere, `classifier` in `Incoming`/`Working`/`Staging` subtrees, `mdedit` in `Working` subtrees, `transmittal` in `Staging` subtrees, `landing` only at root. Override via `.zddc apps:` cascade entry (channel/version/URL/path) — fetched once, cached at `/_app/`. Drop a real `.html` file at any path to override. - `helm/` — example Helm charts for zddc-server (`zddc-server-prod/`, `zddc-server-dev/`). Both compile from source via init container. 
Operators copy `values.yaml.example` and customize. No secrets in repo. diff --git a/zddc/README.md b/zddc/README.md index b05996a..5094dc7 100644 --- a/zddc/README.md +++ b/zddc/README.md @@ -202,6 +202,83 @@ JSON API for automation (same auth as the page): A user can only see and revoke their own tokens. Revoking another user's token returns 404 to avoid leaking ownership. +## Client mode (proxy / cache / mirror) + +The same `zddc-server` binary can run as a downstream client of another +zddc-server. Set `--upstream ` and the master-side machinery +(archive index, apps server, watcher, OPA decider, ACL middleware, +token store) is replaced by a thin caching HTTP layer that forwards to +the master and (optionally) persists responses under `--root`. + +Three modes via `--mode`: + +| Mode | Persists responses? | Subtree warmer? | Use case | +|---|---|---|---| +| `proxy` | no | no | thin pass-through; nothing on local disk | +| `cache` (default) | yes | no | field engineer — what you've viewed is available offline | +| `mirror` | yes | yes (phase 3) | vendor mirrors, admin backups, complete offline working set | + +The cache directory layout is a normal ZDDC root: `/foo/bar.txt` +is stored at `/foo/bar.txt`. No sidecar metadata. Running +`zddc-server --root ` (without `--upstream`) serves the +cached files as a plain master — useful for portable offline snapshots. + +A small marker file `.zddc-upstream` is written to the cache root on +first persist, recording the upstream URL and first-cached-at timestamp. +Prevents accidentally pointing master mode at a cache directory and +provides ops provenance. + +### Flags + +| Flag / env | Purpose | +|---|---| +| `--upstream ` / `ZDDC_UPSTREAM` | Master URL (e.g. `https://master.example.com`). Setting this enables client mode. | +| `--mode ` / `ZDDC_MODE` | Default `cache`. Ignored when `--upstream` is empty. 
| +| `--bearer-file ` / `ZDDC_BEARER_FILE` | Path to a 0600 file with a master-issued token (see `/.tokens` on the master). Forwarded as `Authorization: Bearer …` on every upstream request. | +| `--skip-tls-verify` / `ZDDC_SKIP_TLS_VERIFY` | Accept self-signed / untrusted upstream certs. Distinct from `--no-auth`. Dev / internal-CA scenarios only. | +| `--no-auth` / `ZDDC_NO_AUTH` | Skip ACL enforcement on incoming requests to the local instance. The common case for personal field-engineer / cache deployments where the laptop is single-user-trust and the master already filtered. | + +### Pipeline + +For each incoming `GET` (writes are not yet supported in client mode): + +1. **Directory request** (URL ends in `/`): always proxied live. No listing cache yet (phase 3 / mirror mode). +2. **File request, cache hit**: serve cached bytes immediately with `X-ZDDC-Cache: hit`. Kick off a background `If-Modified-Since` revalidate; on `304` no-op, on `200` overwrite the cache, on `403`/`404` purge. +3. **File request, cache miss**: forward to upstream with the configured bearer. On `200` stream simultaneously to the client and a tmp-file that's atomically renamed into the cache. Header `X-ZDDC-Cache: miss`. +4. **Network error and a cached version exists**: serve cached + `X-ZDDC-Cache: offline`. +5. **Network error and no cached version**: `503 Service Unavailable` with `X-ZDDC-Cache: offline`. + +Range requests (`Range: bytes=...`) work end-to-end: forwarded to upstream on miss, served via `http.ServeContent` from disk on hit (which handles `Range` natively). + +Responses with `Cache-Control: no-store` or `Cache-Control: private` are forwarded but not persisted. + +### Two-instance dev recipe + +```sh +# Master (your normal zddc-server). Pick any --root with a .zddc. +zddc-server --root /srv/zddc --addr :8443 + +# Client (any port; doesn't need TLS for local dev). 
+mkdir -p /tmp/zddc-mirror +zddc-server \ + --upstream http://master.example.com:8443 \ + --root /tmp/zddc-mirror \ + --mode cache \ + --bearer-file ~/.config/zddc/token \ + --addr 127.0.0.1:8444 \ + --tls-cert=none \ + --no-auth +``` + +Browse `http://localhost:8444/`. Files you visit appear under `/tmp/zddc-mirror/` mirroring the master's path layout. Disconnect, refresh — previously-visited files keep working. Reconnect — background revalidates run on every cache hit, picking up master-side changes the next time you reload. + +### What client mode is NOT, yet + +- **No write path**: `PUT`/`POST`/`DELETE` return `405`. The offline write outbox lands in a later phase. +- **No mirror walker**: `--mode mirror` is accepted but currently behaves like `cache` (no proactive prefetching). Phase 3 adds the access-triggered walk scheduler. +- **No listing cache**: directories always proxy live, so offline browsing of a directory you didn't visit while online won't show anything. Mirror mode + listing caching is phase 3. +- **No multi-tenancy**: the local instance forwards a single bearer to upstream regardless of who's calling locally. For multi-user deployments, run multiple instances or front the local server with your own auth proxy. 
+ ## Access control: the `.zddc` cascade > ⚠️ **zddc-server refuses to start without a root `.zddc`.** A `ZDDC_ROOT` containing diff --git a/zddc/cmd/zddc-server/main.go b/zddc/cmd/zddc-server/main.go index 7585c77..5823947 100644 --- a/zddc/cmd/zddc-server/main.go +++ b/zddc/cmd/zddc-server/main.go @@ -17,6 +17,7 @@ import ( "codeberg.org/VARASYS/ZDDC/zddc/internal/apps" "codeberg.org/VARASYS/ZDDC/zddc/internal/archive" "codeberg.org/VARASYS/ZDDC/zddc/internal/auth" + "codeberg.org/VARASYS/ZDDC/zddc/internal/cache" "codeberg.org/VARASYS/ZDDC/zddc/internal/config" "codeberg.org/VARASYS/ZDDC/zddc/internal/handler" "codeberg.org/VARASYS/ZDDC/zddc/internal/policy" @@ -74,6 +75,18 @@ func main() { "addr", cfg.Addr, "embedded_apps", embeddedVersionsForLog(embedded)) + // Client mode short-circuit: when cfg.Upstream is set, this binary + // runs as a downstream proxy/cache/mirror rather than a master. + // The master-side machinery below (archive index, watcher, apps + // server, policy decider, ACL middleware, token store) is all + // skipped — every request flows through the cache layer, which + // forwards to upstream and (in cache/mirror modes) persists the + // response under cfg.Root. + if cfg.Upstream != "" { + runClient(cfg) + return + } + // Build archive index slog.Info("building archive index...") start := time.Now() @@ -255,6 +268,91 @@ func main() { slog.Info("stopped") } +// runClient is the entry point when cfg.Upstream is set — a separate +// lifecycle from the master-side main(), with no archive index, no +// apps server, no watcher, no policy decider, no ACL middleware, no +// token store. The cache layer (zddc/internal/cache) is the entire +// request handler; AccessLog + HSTS + gzip wrap it the same way they +// wrap dispatch in master mode. 
+func runClient(cfg config.Config) { + cacheLayer, err := cache.New(cfg) + if err != nil { + slog.Error("client mode init failed", "err", err) + os.Exit(1) + } + slog.Info("client mode active", + "upstream", cacheLayer.Upstream(), + "mode", cacheLayer.Mode(), + "no_auth", cfg.NoAuth, + "skip_tls_verify", cfg.SkipTLSVerify) + if cfg.NoAuth { + slog.Warn("--no-auth enabled: incoming requests are not ACL-checked locally; trusting upstream's filtering.") + } + + tlsCfg, useTLS, err := tlsutil.TLSConfig(cfg) + if err != nil { + slog.Error("failed to configure TLS", "err", err) + os.Exit(1) + } + + ctx, cancel := signal.NotifyContext(context.Background(), syscall.SIGTERM, syscall.SIGINT) + defer cancel() + + auditLogger := setupAccessAuditLog(cfg.AccessLog) + + var inner http.Handler = cacheLayer + inner = handler.CORSMiddleware(cfg, inner) + if useTLS { + inner = handler.HSTSMiddleware(inner) + } + inner = handler.AccessLogMiddleware(auditLogger, inner) + + mux := http.NewServeMux() + mux.Handle("/", inner) + + gzWrapper, err := newGzipWrapper() + if err != nil { + slog.Error("gzhttp wrapper init", "err", err) + os.Exit(1) + } + srv := &http.Server{ + Addr: cfg.Addr, + Handler: gzWrapper(mux), + TLSConfig: tlsCfg, + ReadHeaderTimeout: 10 * time.Second, + ReadTimeout: 60 * time.Second, + WriteTimeout: 60 * time.Second, + IdleTimeout: 120 * time.Second, + } + + if useTLS { + go func() { + slog.Info("listening", "addr", cfg.Addr, "tls", true, "client_mode", true) + if err := srv.ListenAndServeTLS("", ""); err != nil && err != http.ErrServerClosed { + slog.Error("server error", "err", err) + cancel() + } + }() + } else { + go func() { + slog.Info("listening", "addr", cfg.Addr, "tls", false, "client_mode", true) + if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed { + slog.Error("server error", "err", err) + cancel() + } + }() + } + + <-ctx.Done() + slog.Info("shutting down...") + shutdownCtx, shutdownCancel := context.WithTimeout(context.Background(), 
30*time.Second) + defer shutdownCancel() + if err := srv.Shutdown(shutdownCtx); err != nil { + slog.Error("shutdown error", "err", err) + } + slog.Info("stopped") +} + // setupAccessAuditLog constructs a slog.Logger writing JSON lines to a // size-rotated file at the operator-configured path. Returns nil if no // path is configured (operator opted out via --access-log=) — diff --git a/zddc/internal/cache/cache.go b/zddc/internal/cache/cache.go new file mode 100644 index 0000000..07658d2 --- /dev/null +++ b/zddc/internal/cache/cache.go @@ -0,0 +1,479 @@ +// Package cache implements zddc-server's client mode: a downstream +// proxy/cache/mirror that runs the same binary against a master. +// Configured via cfg.Upstream (in main.go), the cache layer replaces +// the master-side dispatcher entirely — every incoming request is +// forwarded to the master with the local instance's bearer token, and +// (in cache or mirror mode) the response body is persisted under +// cfg.Root so subsequent requests serve from disk. +// +// The cache directory layout is intentionally a normal ZDDC root: a +// file fetched from `/foo/bar.txt` is stored at `/foo/ +// bar.txt`. No sidecar metadata. The local file's mtime is set to the +// upstream's Last-Modified header so revalidation via +// If-Modified-Since reflects the master's notion of the file's age, +// not when the local cache happened to fetch it. Running +// `zddc-server --root ` without --upstream serves the +// cached files as a regular ZDDC — useful for portable offline +// snapshots and sanity-check inspection. +// +// Phase 2 scope: GET/HEAD only. Range requests, stale-while- +// revalidate, and offline-fallback are supported. Directory listings +// are always proxied live (no listing cache yet); writes (PUT / POST / +// DELETE) and the mirror walker land in later phases. 
+package cache + +import ( + "crypto/tls" + "fmt" + "io" + "log/slog" + "net/http" + "net/url" + "os" + "path/filepath" + "strings" + "sync" + "time" + + "codeberg.org/VARASYS/ZDDC/zddc/internal/config" +) + +// MarkerFile records the upstream URL and first-cached-at timestamp +// in the cache root. Prevents accidentally pointing master mode at a +// cache directory and provides provenance for ops/users. +const MarkerFile = ".zddc-upstream" + +// HeaderName is the response header that surfaces cache state to the +// client (and the browser-side UI). Values: hit, revalidated, miss, +// proxy, offline. +const HeaderName = "X-ZDDC-Cache" + +// Cache is the request handler installed in main.go when cfg.Upstream +// is non-empty. It is safe for concurrent ServeHTTP calls. +type Cache struct { + root string // local cache directory (== cfg.Root in client mode) + upstream string // upstream master URL, no trailing slash + bearer string // forwarded as Authorization: Bearer to upstream; "" disables + mode string // "proxy" | "cache" | "mirror" + persist bool // mode != "proxy" — write responses to disk + client *http.Client + + markerOnce sync.Once +} + +// New constructs a Cache from the loaded configuration. Validates +// upstream URL, reads the bearer-file (if configured), prepares the +// HTTP client honoring SkipTLSVerify, and ensures the cache root +// exists. 
+func New(cfg config.Config) (*Cache, error) { + if cfg.Upstream == "" { + return nil, fmt.Errorf("cache.New: cfg.Upstream is empty") + } + upstream := strings.TrimRight(cfg.Upstream, "/") + if _, err := url.Parse(upstream); err != nil { + return nil, fmt.Errorf("cache.New: invalid upstream %q: %w", upstream, err) + } + + bearer := "" + if cfg.BearerFile != "" { + b, err := os.ReadFile(cfg.BearerFile) + if err != nil { + return nil, fmt.Errorf("cache.New: read bearer file: %w", err) + } + bearer = strings.TrimSpace(string(b)) + if bearer == "" { + return nil, fmt.Errorf("cache.New: bearer file %q is empty", cfg.BearerFile) + } + } + + transport := &http.Transport{ + MaxIdleConns: 10, + IdleConnTimeout: 30 * time.Second, + TLSHandshakeTimeout: 10 * time.Second, + ResponseHeaderTimeout: 30 * time.Second, + } + if cfg.SkipTLSVerify { + // G402 / CWE-295: deliberate. Documented operator opt-in for + // dev/internal-CA scenarios; never the default. + transport.TLSClientConfig = &tls.Config{InsecureSkipVerify: true} //nolint:gosec + slog.Warn("--skip-tls-verify enabled: upstream TLS certificates will NOT be validated") + } + + if err := os.MkdirAll(cfg.Root, 0o755); err != nil { + return nil, fmt.Errorf("cache.New: create cache root %q: %w", cfg.Root, err) + } + + mode := cfg.Mode + if mode == "" { + mode = "cache" + } + + return &Cache{ + root: cfg.Root, + upstream: upstream, + bearer: bearer, + mode: mode, + persist: mode != "proxy", + client: &http.Client{ + Transport: transport, + Timeout: 60 * time.Second, + // Don't follow redirects automatically — pass them through to + // the client so the browser can update its address bar + // (e.g. master's no-trailing-slash → trailing-slash 301). + CheckRedirect: func(req *http.Request, via []*http.Request) error { + return http.ErrUseLastResponse + }, + }, + }, nil +} + +// Mode returns the configured mode label for diagnostics. 
+func (c *Cache) Mode() string { return c.mode } + +// Upstream returns the upstream master URL for diagnostics. +func (c *Cache) Upstream() string { return c.upstream } + +// ServeHTTP is the cache layer's HTTP entry point. Replaces the +// master-side dispatcher in client mode. +func (c *Cache) ServeHTTP(w http.ResponseWriter, r *http.Request) { + // Phase 2: read-only. Writes are deferred to the outbox phase. + // Forward HEAD as GET-without-body to keep the response shape + // consistent with what http.ServeContent would do. + if r.Method != http.MethodGet && r.Method != http.MethodHead { + w.Header().Set("Allow", "GET, HEAD") + http.Error(w, "Method Not Allowed: writes are not yet supported in client mode", http.StatusMethodNotAllowed) + return + } + + // Directory listings are always proxied live in v1. The cache + // directory's actual filesystem listing would be inaccurate (it + // only contains visited files), and full listing-cache support + // belongs with the mirror walker in phase 3. + if strings.HasSuffix(r.URL.Path, "/") { + c.proxy(w, r, false /* writeToCache */) + return + } + + // File request — try cache first when persisting. + if c.persist { + if path, ok := c.cachePathFor(r.URL.Path); ok { + info, err := os.Stat(path) + if err == nil && !info.IsDir() { + c.serveFromDisk(w, r, path, info, "hit") + // Background revalidate; never block the user response. + go c.revalidate(r.URL.Path, info.ModTime()) + return + } + } + } + + // Miss (or proxy mode) → forward to upstream and (optionally) + // persist on the way through. + c.proxy(w, r, c.persist) +} + +// proxy forwards the request to upstream and serves the response back +// to the client. When writeToCache is true and the response is a +// cacheable 200, the body is also persisted under cfg.Root. 
+func (c *Cache) proxy(w http.ResponseWriter, r *http.Request, writeToCache bool) { + upReq, err := c.buildUpstreamRequest(r) + if err != nil { + http.Error(w, "Bad Request: "+err.Error(), http.StatusBadRequest) + return + } + + resp, err := c.client.Do(upReq) + if err != nil { + // Network error. If we have a cached copy, serve it stale. + if writeToCache && r.Method == http.MethodGet { + if path, ok := c.cachePathFor(r.URL.Path); ok { + if info, sErr := os.Stat(path); sErr == nil && !info.IsDir() { + c.serveFromDisk(w, r, path, info, "offline") + return + } + } + } + slog.Warn("upstream fetch failed", "url", upReq.URL.String(), "err", err) + w.Header().Set(HeaderName, "offline") + http.Error(w, "Service Unavailable: upstream unreachable", http.StatusServiceUnavailable) + return + } + defer resp.Body.Close() + + // Forward upstream response headers. Skip hop-by-hop headers (RFC + // 7230 §6.1) — Go's transport already drops most, but Connection + // and Transfer-Encoding can sneak through and confuse the client. + for k, vv := range resp.Header { + if isHopByHop(k) { + continue + } + for _, v := range vv { + w.Header().Add(k, v) + } + } + + cacheable := writeToCache && resp.StatusCode == http.StatusOK && c.responseCacheable(resp) + if cacheable { + w.Header().Set(HeaderName, "miss") + } else { + w.Header().Set(HeaderName, "proxy") + } + + w.WriteHeader(resp.StatusCode) + + if r.Method == http.MethodHead || resp.StatusCode == http.StatusNotModified { + return + } + + if !cacheable { + _, _ = io.Copy(w, resp.Body) + return + } + + // Stream body to client AND to a tmp file in the cache; rename + // atomically only on success. + if err := c.streamAndPersist(w, resp, r.URL.Path); err != nil { + // Mid-stream error: the client got a partial body (HTTP-normal), + // and we already abandoned the cache write. Just log. 
+ slog.Debug("stream-and-persist error", "url", r.URL.Path, "err", err) + } else { + c.maybeWriteMarker() + } +} + +// buildUpstreamRequest constructs the outbound request preserving the +// path, query, Range, and Accept headers. Adds the bearer if configured. +func (c *Cache) buildUpstreamRequest(r *http.Request) (*http.Request, error) { + target := c.upstream + r.URL.RequestURI() + upReq, err := http.NewRequestWithContext(r.Context(), r.Method, target, nil) + if err != nil { + return nil, err + } + // Preserve the Range header for resumable / partial transfers. + if v := r.Header.Get("Range"); v != "" { + upReq.Header.Set("Range", v) + } + if v := r.Header.Get("If-Range"); v != "" { + upReq.Header.Set("If-Range", v) + } + if v := r.Header.Get("Accept"); v != "" { + upReq.Header.Set("Accept", v) + } + if v := r.Header.Get("Accept-Encoding"); v != "" { + upReq.Header.Set("Accept-Encoding", v) + } + upReq.Header.Set("User-Agent", "zddc-server-cache/0.1") + if c.bearer != "" { + upReq.Header.Set("Authorization", "Bearer "+c.bearer) + } + return upReq, nil +} + +// responseCacheable reports whether the response body should be +// persisted. Honors Cache-Control: no-store / private and refuses to +// cache responses without a content body (ranges, 204, etc.). +func (c *Cache) responseCacheable(resp *http.Response) bool { + cc := resp.Header.Get("Cache-Control") + low := strings.ToLower(cc) + if strings.Contains(low, "no-store") || strings.Contains(low, "private") { + return false + } + // Don't cache partial-content responses — the server returned 206 + // for a Range request, which means the body covers only part of + // the file. Caching that partial body would corrupt subsequent + // non-range fetches. + if resp.StatusCode != http.StatusOK { + return false + } + return true +} + +// streamAndPersist writes resp.Body simultaneously to the client and +// to a temp file in the cache. Renames the temp atomically on success. 
+// Sets the local file's mtime to upstream's Last-Modified (if +// present) so subsequent revalidations send If-Modified-Since with a +// timestamp upstream can compare against its own state. +func (c *Cache) streamAndPersist(w http.ResponseWriter, resp *http.Response, urlPath string) error { + finalPath, ok := c.cachePathFor(urlPath) + if !ok { + _, err := io.Copy(w, resp.Body) + return err + } + if err := os.MkdirAll(filepath.Dir(finalPath), 0o755); err != nil { + _, copyErr := io.Copy(w, resp.Body) + if copyErr != nil { + return copyErr + } + return err + } + tmp, err := os.CreateTemp(filepath.Dir(finalPath), ".zddc-cache-tmp-*") + if err != nil { + _, copyErr := io.Copy(w, resp.Body) + if copyErr != nil { + return copyErr + } + return err + } + tmpName := tmp.Name() + mw := io.MultiWriter(tmp, w) + if _, err := io.Copy(mw, resp.Body); err != nil { + _ = tmp.Close() + _ = os.Remove(tmpName) + return err + } + if err := tmp.Close(); err != nil { + _ = os.Remove(tmpName) + return err + } + if lm := resp.Header.Get("Last-Modified"); lm != "" { + if t, err := http.ParseTime(lm); err == nil { + _ = os.Chtimes(tmpName, t, t) + } + } + if err := os.Rename(tmpName, finalPath); err != nil { + // Don't leave the temp file behind in the cache directory. + _ = os.Remove(tmpName) + return err + } + return nil +} + +// serveFromDisk serves a cached file via http.ServeContent (which +// handles Range requests, If-Modified-Since, and conditional GETs +// natively). cacheState is the X-ZDDC-Cache value to surface. +func (c *Cache) serveFromDisk(w http.ResponseWriter, r *http.Request, path string, info os.FileInfo, cacheState string) { + f, err := os.Open(path) + if err != nil { + http.Error(w, "Internal Server Error", http.StatusInternalServerError) + return + } + defer f.Close() + w.Header().Set(HeaderName, cacheState) + http.ServeContent(w, r, filepath.Base(path), info.ModTime(), f) +} + +// revalidate fires a conditional GET against upstream after a cache +// hit. 304 = no-op (cache is fresh). 200 = update cache. 403/404 = +// purge (ACL revoked or upstream deleted). 
Network errors are +// swallowed — staleness via offline is the documented behavior. +func (c *Cache) revalidate(urlPath string, mtime time.Time) { + target := c.upstream + urlPath + req, err := http.NewRequest(http.MethodGet, target, nil) + if err != nil { + return + } + if !mtime.IsZero() { + req.Header.Set("If-Modified-Since", mtime.UTC().Format(http.TimeFormat)) + } + if c.bearer != "" { + req.Header.Set("Authorization", "Bearer "+c.bearer) + } + resp, err := c.client.Do(req) + if err != nil { + return + } + defer resp.Body.Close() + switch resp.StatusCode { + case http.StatusNotModified: + return + case http.StatusOK: + if !c.responseCacheable(resp) { + return + } + if err := c.persistOnly(resp, urlPath); err != nil { + slog.Debug("revalidate persist error", "url", urlPath, "err", err) + } + case http.StatusForbidden, http.StatusNotFound: + if path, ok := c.cachePathFor(urlPath); ok { + _ = os.Remove(path) + slog.Info("purged cached entry after upstream 4xx", "url", urlPath, "status", resp.StatusCode) + } + } +} + +// persistOnly writes resp.Body to the cache without forwarding it +// anywhere. Used by revalidate (the user's request was already served +// from disk; we just refresh the cache in the background). 
+func (c *Cache) persistOnly(resp *http.Response, urlPath string) error { + finalPath, ok := c.cachePathFor(urlPath) + if !ok { + _, _ = io.Copy(io.Discard, resp.Body) + return nil + } + if err := os.MkdirAll(filepath.Dir(finalPath), 0o755); err != nil { + _, _ = io.Copy(io.Discard, resp.Body) + return err + } + tmp, err := os.CreateTemp(filepath.Dir(finalPath), ".zddc-cache-tmp-*") + if err != nil { + _, _ = io.Copy(io.Discard, resp.Body) + return err + } + tmpName := tmp.Name() + if _, err := io.Copy(tmp, resp.Body); err != nil { + _ = tmp.Close() + _ = os.Remove(tmpName) + return err + } + if err := tmp.Close(); err != nil { + _ = os.Remove(tmpName) + return err + } + if lm := resp.Header.Get("Last-Modified"); lm != "" { + if t, err := http.ParseTime(lm); err == nil { + _ = os.Chtimes(tmpName, t, t) + } + } + if err := os.Rename(tmpName, finalPath); err != nil { + // Don't leave the temp file behind in the cache directory. + _ = os.Remove(tmpName) + return err + } + return nil +} + +// cachePathFor maps a URL path to a local filesystem path under the +// cache root. Returns ok=false on inputs that would escape the root, +// name the reserved marker file, or otherwise be unsafe to write. +func (c *Cache) cachePathFor(urlPath string) (string, bool) { + if urlPath == "" || urlPath == "/" { + return "", false + } + if strings.Contains(urlPath, "..") { + return "", false + } + clean := filepath.FromSlash(strings.TrimPrefix(urlPath, "/")) + abs := filepath.Join(c.root, clean) + if !strings.HasPrefix(abs, c.root+string(filepath.Separator)) && abs != c.root { + return "", false + } + // Don't let URLs collide with internal markers. + if filepath.Base(abs) == MarkerFile { + return "", false + } + return abs, true +} + +// maybeWriteMarker writes the .zddc-upstream provenance file once, +// the first time the cache stores anything. Best-effort: an error +// here doesn't fail the request. 
+func (c *Cache) maybeWriteMarker() { + c.markerOnce.Do(func() { + marker := filepath.Join(c.root, MarkerFile) + if _, err := os.Stat(marker); err == nil { + return + } + body := fmt.Sprintf("upstream: %s\nfirst_cached: %s\nmode: %s\n", + c.upstream, time.Now().UTC().Format(time.RFC3339), c.mode) + _ = os.WriteFile(marker, []byte(body), 0o644) + }) +} + +// isHopByHop reports whether a header name is hop-by-hop per RFC 7230 +// §6.1 — these must not be forwarded by a proxy. +func isHopByHop(name string) bool { + switch http.CanonicalHeaderKey(name) { + case "Connection", + "Keep-Alive", + "Proxy-Authenticate", + "Proxy-Authorization", + "Te", + "Trailer", + "Transfer-Encoding", + "Upgrade": + return true + } + return false +} diff --git a/zddc/internal/cache/cache_test.go b/zddc/internal/cache/cache_test.go new file mode 100644 index 0000000..f3e12f0 --- /dev/null +++ b/zddc/internal/cache/cache_test.go @@ -0,0 +1,546 @@ +package cache + +import ( + "io" + "net/http" + "net/http/httptest" + "os" + "path/filepath" + "strings" + "sync" + "sync/atomic" + "testing" + "time" + + "codeberg.org/VARASYS/ZDDC/zddc/internal/config" +) + +// newTestCache spins up an httptest server as the upstream and +// returns the cache + the upstream's URL. The upstream's behavior is +// the caller's to define. 
+func newTestCache(t *testing.T, mode string, upstreamHandler http.HandlerFunc) (*Cache, *httptest.Server) { + t.Helper() + upstream := httptest.NewServer(upstreamHandler) + t.Cleanup(upstream.Close) + root := t.TempDir() + c, err := New(config.Config{ + Root: root, + Upstream: upstream.URL, + Mode: mode, + }) + if err != nil { + t.Fatalf("New: %v", err) + } + return c, upstream +} + +func TestNew_RequiresUpstream(t *testing.T) { + if _, err := New(config.Config{Root: t.TempDir()}); err == nil { + t.Error("expected error for empty upstream") + } +} + +func TestNew_StripsTrailingSlash(t *testing.T) { + c, err := New(config.Config{ + Root: t.TempDir(), + Upstream: "http://example.com/", + }) + if err != nil { + t.Fatalf("New: %v", err) + } + if got := c.Upstream(); got != "http://example.com" { + t.Errorf("Upstream() = %q, want trailing slash stripped", got) + } +} + +func TestNew_BearerFile(t *testing.T) { + dir := t.TempDir() + tokenPath := filepath.Join(dir, "token") + if err := os.WriteFile(tokenPath, []byte(" abc123\n"), 0o600); err != nil { + t.Fatalf("write token: %v", err) + } + c, err := New(config.Config{ + Root: t.TempDir(), + Upstream: "http://example.com", + BearerFile: tokenPath, + }) + if err != nil { + t.Fatalf("New: %v", err) + } + if c.bearer != "abc123" { + t.Errorf("bearer = %q, want abc123 (whitespace trimmed)", c.bearer) + } +} + +func TestNew_BearerFileEmptyRejected(t *testing.T) { + dir := t.TempDir() + empty := filepath.Join(dir, "empty") + _ = os.WriteFile(empty, []byte("\n\n"), 0o600) + if _, err := New(config.Config{ + Root: t.TempDir(), + Upstream: "http://example.com", + BearerFile: empty, + }); err == nil { + t.Error("expected error for empty bearer file") + } +} + +func TestServeHTTP_RejectsWriteMethods(t *testing.T) { + c, _ := newTestCache(t, "cache", func(w http.ResponseWriter, r *http.Request) { + t.Errorf("upstream should not be called for write methods") + }) + for _, method := range []string{http.MethodPut, http.MethodPost, 
http.MethodDelete} { + rec := httptest.NewRecorder() + r := httptest.NewRequest(method, "/foo", nil) + c.ServeHTTP(rec, r) + if rec.Code != http.StatusMethodNotAllowed { + t.Errorf("%s = %d, want 405", method, rec.Code) + } + if got := rec.Header().Get("Allow"); got != "GET, HEAD" { + t.Errorf("%s Allow = %q", method, got) + } + } +} + +func TestServeHTTP_MissThenHit(t *testing.T) { + var hits int32 + c, upstream := newTestCache(t, "cache", func(w http.ResponseWriter, r *http.Request) { + atomic.AddInt32(&hits, 1) + if r.URL.Path != "/foo.txt" { + t.Errorf("upstream got %q, want /foo.txt", r.URL.Path) + } + w.Header().Set("Content-Type", "text/plain") + w.Header().Set("Last-Modified", "Mon, 02 Jan 2006 15:04:05 GMT") + _, _ = w.Write([]byte("hello")) + }) + _ = upstream + + // First request: miss. + rec := httptest.NewRecorder() + r := httptest.NewRequest(http.MethodGet, "/foo.txt", nil) + c.ServeHTTP(rec, r) + if rec.Code != http.StatusOK { + t.Fatalf("first GET = %d", rec.Code) + } + if got := rec.Header().Get(HeaderName); got != "miss" { + t.Errorf("first cache header = %q, want miss", got) + } + if got := rec.Body.String(); got != "hello" { + t.Errorf("body = %q", got) + } + + // Cache file should exist. + cached := filepath.Join(c.root, "foo.txt") + if _, err := os.Stat(cached); err != nil { + t.Fatalf("expected cached file: %v", err) + } + + // Second request: hit. No wait is needed: the first request wrote + // the cache file and the marker before ServeHTTP returned. + rec2 := httptest.NewRecorder() + r2 := httptest.NewRequest(http.MethodGet, "/foo.txt", nil) + c.ServeHTTP(rec2, r2) + if rec2.Code != http.StatusOK { + t.Fatalf("second GET = %d", rec2.Code) + } + if got := rec2.Header().Get(HeaderName); got != "hit" { + t.Errorf("second cache header = %q, want hit", got) + } + if got := rec2.Body.String(); got != "hello" { + t.Errorf("second body = %q", got) + } + + // Marker file should be present. 
+ marker := filepath.Join(c.root, MarkerFile) + mb, err := os.ReadFile(marker) + if err != nil { + t.Fatalf("marker missing: %v", err) + } + if !strings.Contains(string(mb), "upstream:") { + t.Errorf("marker contents unexpected: %s", string(mb)) + } +} + +func TestServeHTTP_ProxyModeDoesNotPersist(t *testing.T) { + c, _ := newTestCache(t, "proxy", func(w http.ResponseWriter, r *http.Request) { + _, _ = w.Write([]byte("payload")) + }) + rec := httptest.NewRecorder() + r := httptest.NewRequest(http.MethodGet, "/foo.txt", nil) + c.ServeHTTP(rec, r) + if rec.Code != http.StatusOK { + t.Fatalf("status = %d", rec.Code) + } + if got := rec.Header().Get(HeaderName); got != "proxy" { + t.Errorf("cache header = %q, want proxy", got) + } + cached := filepath.Join(c.root, "foo.txt") + if _, err := os.Stat(cached); !os.IsNotExist(err) { + t.Errorf("proxy mode wrote to cache: %v", err) + } + // Marker also shouldn't exist (no caching happened). + if _, err := os.Stat(filepath.Join(c.root, MarkerFile)); !os.IsNotExist(err) { + t.Errorf("marker file written in proxy mode") + } +} + +func TestServeHTTP_DirectoriesAreNeverCached(t *testing.T) { + c, _ := newTestCache(t, "cache", func(w http.ResponseWriter, r *http.Request) { + w.Header().Set("Content-Type", "text/html") + _, _ = w.Write([]byte("listing")) + }) + rec := httptest.NewRecorder() + r := httptest.NewRequest(http.MethodGet, "/Project/", nil) + c.ServeHTTP(rec, r) + if rec.Code != http.StatusOK { + t.Fatalf("status = %d", rec.Code) + } + if got := rec.Header().Get(HeaderName); got != "proxy" { + t.Errorf("cache header = %q, want proxy (directories don't cache)", got) + } + // No file or directory should have been created at the URL location. 
+ if entries, _ := os.ReadDir(c.root); len(entries) > 0 { + t.Errorf("directory request created cache entries: %v", entries) + } +} + +func TestServeHTTP_HEAD_HitDoesNotReturnBody(t *testing.T) { + c, _ := newTestCache(t, "cache", func(w http.ResponseWriter, r *http.Request) { + _, _ = w.Write([]byte("hello")) + }) + // Seed the cache via GET. + rec := httptest.NewRecorder() + c.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/foo.txt", nil)) + if rec.Code != http.StatusOK { + t.Fatalf("seed: %d", rec.Code) + } + + // HEAD: should be a hit, no body. + rec2 := httptest.NewRecorder() + c.ServeHTTP(rec2, httptest.NewRequest(http.MethodHead, "/foo.txt", nil)) + if rec2.Code != http.StatusOK { + t.Fatalf("HEAD: %d", rec2.Code) + } + if rec2.Body.Len() != 0 { + t.Errorf("HEAD body length = %d, want 0", rec2.Body.Len()) + } +} + +func TestServeHTTP_OfflineServesStale(t *testing.T) { + root := t.TempDir() + // Pre-seed a cached file. + if err := os.WriteFile(filepath.Join(root, "stale.txt"), []byte("stale-content"), 0o644); err != nil { + t.Fatalf("seed: %v", err) + } + c, err := New(config.Config{ + Root: root, + Upstream: "http://127.0.0.1:1", // unreachable port + Mode: "cache", + }) + if err != nil { + t.Fatalf("New: %v", err) + } + // Speed up the timeout so the test doesn't hang. + c.client.Timeout = 200 * time.Millisecond + + rec := httptest.NewRecorder() + r := httptest.NewRequest(http.MethodGet, "/stale.txt", nil) + c.ServeHTTP(rec, r) + if rec.Code != http.StatusOK { + t.Fatalf("offline-with-cache = %d, want 200", rec.Code) + } + if got := rec.Header().Get(HeaderName); got != "hit" { + // On hit we don't even hit the network. That's expected. 
+ t.Logf("first attempt was %q (likely cache hit before any network)", got) + } + if got := rec.Body.String(); got != "stale-content" { + t.Errorf("body = %q", got) + } +} + +func TestServeHTTP_OfflineMissReturns503(t *testing.T) { + root := t.TempDir() + c, err := New(config.Config{ + Root: root, + Upstream: "http://127.0.0.1:1", + Mode: "cache", + }) + if err != nil { + t.Fatalf("New: %v", err) + } + c.client.Timeout = 200 * time.Millisecond + + rec := httptest.NewRecorder() + r := httptest.NewRequest(http.MethodGet, "/never-cached.txt", nil) + c.ServeHTTP(rec, r) + if rec.Code != http.StatusServiceUnavailable { + t.Errorf("offline-no-cache = %d, want 503", rec.Code) + } + if got := rec.Header().Get(HeaderName); got != "offline" { + t.Errorf("cache header = %q, want offline", got) + } +} + +func TestServeHTTP_BearerForwarded(t *testing.T) { + dir := t.TempDir() + tokenPath := filepath.Join(dir, "token") + _ = os.WriteFile(tokenPath, []byte("secrettoken"), 0o600) + var seenAuth string + upstream := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + seenAuth = r.Header.Get("Authorization") + _, _ = w.Write([]byte("ok")) + })) + defer upstream.Close() + + c, err := New(config.Config{ + Root: t.TempDir(), + Upstream: upstream.URL, + Mode: "cache", + BearerFile: tokenPath, + }) + if err != nil { + t.Fatalf("New: %v", err) + } + rec := httptest.NewRecorder() + c.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/foo.txt", nil)) + if seenAuth != "Bearer secrettoken" { + t.Errorf("Authorization = %q, want Bearer secrettoken", seenAuth) + } +} + +func TestServeHTTP_PreservesQuery(t *testing.T) { + var seenURL string + c, _ := newTestCache(t, "cache", func(w http.ResponseWriter, r *http.Request) { + seenURL = r.URL.RequestURI() + w.Header().Set("Cache-Control", "no-store") // no-cache the JSON response + _, _ = w.Write([]byte(`{}`)) + }) + rec := httptest.NewRecorder() + c.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, 
"/foo.txt?q=bar", nil)) + if seenURL != "/foo.txt?q=bar" { + t.Errorf("upstream saw %q, want /foo.txt?q=bar", seenURL) + } +} + +func TestServeHTTP_HonorsNoStore(t *testing.T) { + c, _ := newTestCache(t, "cache", func(w http.ResponseWriter, r *http.Request) { + w.Header().Set("Cache-Control", "no-store") + _, _ = w.Write([]byte("ephemeral")) + }) + rec := httptest.NewRecorder() + c.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/dynamic.json", nil)) + if rec.Code != http.StatusOK { + t.Fatalf("status: %d", rec.Code) + } + if got := rec.Header().Get(HeaderName); got != "proxy" { + t.Errorf("cache header = %q, want proxy (no-store should bypass cache)", got) + } + cached := filepath.Join(c.root, "dynamic.json") + if _, err := os.Stat(cached); !os.IsNotExist(err) { + t.Errorf("no-store response was cached") + } +} + +func TestServeHTTP_PathTraversalRejected(t *testing.T) { + called := false + c, _ := newTestCache(t, "cache", func(w http.ResponseWriter, r *http.Request) { + called = true + _, _ = w.Write([]byte("data")) + }) + rec := httptest.NewRecorder() + c.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/../etc/passwd", nil)) + // The upstream may still be called (the proxy doesn't gatekeep), but + // we MUST NOT cache to a path that escapes the root. 
+ _ = called + root := c.root + parent := filepath.Dir(root) + if _, err := os.Stat(filepath.Join(parent, "etc", "passwd")); !os.IsNotExist(err) { + t.Error("path traversal wrote outside cache root") + } +} + +func TestServeHTTP_ForwardsErrorStatus(t *testing.T) { + c, _ := newTestCache(t, "cache", func(w http.ResponseWriter, r *http.Request) { + http.Error(w, "Forbidden", http.StatusForbidden) + }) + rec := httptest.NewRecorder() + c.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/secret.txt", nil)) + if rec.Code != http.StatusForbidden { + t.Errorf("status = %d, want 403", rec.Code) + } + cached := filepath.Join(c.root, "secret.txt") + if _, err := os.Stat(cached); !os.IsNotExist(err) { + t.Error("403 response was cached") + } +} + +func TestRevalidate_PurgesOn403(t *testing.T) { + root := t.TempDir() + if err := os.WriteFile(filepath.Join(root, "victim.txt"), []byte("cached"), 0o644); err != nil { + t.Fatalf("seed: %v", err) + } + upstream := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + http.Error(w, "Forbidden", http.StatusForbidden) + })) + defer upstream.Close() + c, err := New(config.Config{Root: root, Upstream: upstream.URL, Mode: "cache"}) + if err != nil { + t.Fatalf("New: %v", err) + } + c.revalidate("/victim.txt", time.Now()) + if _, err := os.Stat(filepath.Join(root, "victim.txt")); !os.IsNotExist(err) { + t.Error("revalidate did not purge after 403") + } +} + +func TestRevalidate_PurgesOn404(t *testing.T) { + root := t.TempDir() + if err := os.WriteFile(filepath.Join(root, "gone.txt"), []byte("cached"), 0o644); err != nil { + t.Fatalf("seed: %v", err) + } + upstream := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + http.NotFound(w, r) + })) + defer upstream.Close() + c, err := New(config.Config{Root: root, Upstream: upstream.URL, Mode: "cache"}) + if err != nil { + t.Fatalf("New: %v", err) + } + c.revalidate("/gone.txt", time.Now()) + if _, err := 
os.Stat(filepath.Join(root, "gone.txt")); !os.IsNotExist(err) { + t.Error("revalidate did not purge after 404") + } +} + +func TestRevalidate_NoPurgeOn200ButRefreshes(t *testing.T) { + root := t.TempDir() + old := []byte("old-content") + if err := os.WriteFile(filepath.Join(root, "fresh.txt"), old, 0o644); err != nil { + t.Fatalf("seed: %v", err) + } + // Set the file's mtime to an hour ago. + hourAgo := time.Now().Add(-time.Hour) + _ = os.Chtimes(filepath.Join(root, "fresh.txt"), hourAgo, hourAgo) + + upstream := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + _, _ = w.Write([]byte("new-content")) + })) + defer upstream.Close() + c, err := New(config.Config{Root: root, Upstream: upstream.URL, Mode: "cache"}) + if err != nil { + t.Fatalf("New: %v", err) + } + c.revalidate("/fresh.txt", hourAgo) + got, _ := os.ReadFile(filepath.Join(root, "fresh.txt")) + if string(got) != "new-content" { + t.Errorf("revalidate did not refresh: got %q", string(got)) + } +} + +func TestRevalidate_NoOpOn304(t *testing.T) { + root := t.TempDir() + original := []byte("original") + if err := os.WriteFile(filepath.Join(root, "still.txt"), original, 0o644); err != nil { + t.Fatalf("seed: %v", err) + } + upstream := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + // Always return 304; assume client sent If-Modified-Since. 
+ if r.Header.Get("If-Modified-Since") == "" { + t.Errorf("revalidate did not send If-Modified-Since") + } + w.WriteHeader(http.StatusNotModified) + })) + defer upstream.Close() + c, err := New(config.Config{Root: root, Upstream: upstream.URL, Mode: "cache"}) + if err != nil { + t.Fatalf("New: %v", err) + } + c.revalidate("/still.txt", time.Now()) + got, _ := os.ReadFile(filepath.Join(root, "still.txt")) + if string(got) != "original" { + t.Errorf("304 caused content change: got %q", string(got)) + } +} + +func TestRangeRequest_Hit(t *testing.T) { + c, _ := newTestCache(t, "cache", func(w http.ResponseWriter, r *http.Request) { + w.Header().Set("Content-Type", "text/plain") + _, _ = w.Write([]byte("0123456789")) + }) + // Seed cache. + rec := httptest.NewRecorder() + c.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/data.txt", nil)) + if rec.Code != http.StatusOK { + t.Fatalf("seed: %d", rec.Code) + } + + // Range request. + rec2 := httptest.NewRecorder() + r2 := httptest.NewRequest(http.MethodGet, "/data.txt", nil) + r2.Header.Set("Range", "bytes=2-5") + c.ServeHTTP(rec2, r2) + if rec2.Code != http.StatusPartialContent { + t.Fatalf("range = %d, want 206", rec2.Code) + } + if rec2.Body.String() != "2345" { + t.Errorf("range body = %q", rec2.Body.String()) + } + if got := rec2.Header().Get("Content-Range"); !strings.HasPrefix(got, "bytes 2-5/") { + t.Errorf("Content-Range = %q", got) + } +} + +func TestServeHTTP_ConcurrentRequestsForSameURL(t *testing.T) { + // Stress the marker-once and tmpfile path with parallel misses. 
+	var hits int32
+	c, _ := newTestCache(t, "cache", func(w http.ResponseWriter, r *http.Request) {
+		atomic.AddInt32(&hits, 1)
+		_, _ = io.WriteString(w, "concurrent")
+	})
+	var wg sync.WaitGroup
+	for i := 0; i < 8; i++ {
+		wg.Add(1)
+		go func() {
+			defer wg.Done()
+			rec := httptest.NewRecorder()
+			c.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/c.txt", nil))
+			if rec.Code != http.StatusOK {
+				t.Errorf("status = %d", rec.Code)
+			}
+			if rec.Body.String() != "concurrent" {
+				t.Errorf("body = %q", rec.Body.String())
+			}
+		}()
+	}
+	wg.Wait()
+
+	// File should exist with the right content.
+	got, err := os.ReadFile(filepath.Join(c.root, "c.txt"))
+	if err != nil {
+		t.Fatalf("read: %v", err)
+	}
+	if string(got) != "concurrent" {
+		t.Errorf("cached body = %q", string(got))
+	}
+}
+
+func TestCachePathFor_Boundaries(t *testing.T) {
+	c, _ := newTestCache(t, "cache", func(w http.ResponseWriter, r *http.Request) {})
+	cases := []struct {
+		urlPath string
+		ok bool
+	}{
+		{"", false},
+		{"/", false},
+		{"/../etc/passwd", false},
+		{"/foo/../bar", false},
+		{"/foo/bar.txt", true},
+		{"/" + MarkerFile, false},
+		{"/Project/foo.txt", true},
+	}
+	for _, tc := range cases {
+		_, ok := c.cachePathFor(tc.urlPath)
+		if ok != tc.ok {
+			t.Errorf("cachePathFor(%q) ok=%v, want %v", tc.urlPath, ok, tc.ok)
+		}
+	}
+}
diff --git a/zddc/internal/config/config.go b/zddc/internal/config/config.go
index 62f9541..1422643 100644
--- a/zddc/internal/config/config.go
+++ b/zddc/internal/config/config.go
@@ -28,6 +28,16 @@ type Config struct {
 	AccessLog string // --access-log / ZDDC_ACCESS_LOG — file path for tee'd JSON access log; empty = stderr only
 	Insecure bool // --insecure / ZDDC_INSECURE=1 — opt out of safety checks (currently: allow start without a root .zddc, leaving the tree publicly accessible)
 	NoAuth bool // --no-auth / ZDDC_NO_AUTH=1 — skip ACL enforcement entirely. This instance is NOT the security boundary; on master = "open" (anyone reads everything), on a client = "trust upstream's filtering, don't re-evaluate ACLs locally."
+
+	// Client-mode flags. When Upstream is non-empty, this binary runs
+	// as a downstream proxy/cache/mirror against the named master.
+	// Root then becomes the cache directory rather than the served
+	// data root. Master-mode flags (apps, archive, opa, etc.) are
+	// ignored in client mode — see cmd/zddc-server/main.go.
+	Upstream string // --upstream / ZDDC_UPSTREAM — master URL (https://master.example.com); empty = run as master
+	Mode string // --mode / ZDDC_MODE — "proxy" (no disk persistence), "cache" (default; persist on access), "mirror" (cache + access-triggered subtree warmer; phase 3)
+	BearerFile string // --bearer-file / ZDDC_BEARER_FILE — path to a 0600 file containing the master-issued token to forward upstream
+	SkipTLSVerify bool // --skip-tls-verify / ZDDC_SKIP_TLS_VERIFY=1 — accept self-signed / untrusted upstream certs. Distinct from --no-auth; intended for dev/internal CA scenarios only.
 	OPAURL string // --opa-url / ZDDC_OPA_URL — policy decider endpoint: "internal" (default), "http(s)://..." (real OPA via HTTP), or "unix:///..." (OPA via Unix socket)
 	OPAFailOpen bool // --opa-fail-open / ZDDC_OPA_FAIL_OPEN=1 — when external OPA is unreachable, allow instead of deny (default: fail closed)
 	OPACacheTTL time.Duration // --opa-cache-ttl / ZDDC_OPA_CACHE_TTL — external mode only: per-decision cache TTL. Default 1s. Set 0s to disable.
@@ -89,6 +99,14 @@ func Load(args []string) (Config, error) {
 		"Allow startup with no root .zddc file (the tree is then publicly accessible). Default: refuse to start.")
 	noAuthFlag := fs.Bool("no-auth", os.Getenv("ZDDC_NO_AUTH") == "1",
 		"Skip ACL enforcement entirely. On master: anyone reads everything (dev / trusted-LAN / public-read deployments). On client: trust upstream's filtering. Distinct from --insecure (which gates startup-without-.zddc). Default: enforce ACLs.")
+	upstreamFlag := fs.String("upstream", os.Getenv("ZDDC_UPSTREAM"),
+		"Master URL (e.g. https://master.example.com). When set, this binary runs as a downstream proxy/cache/mirror against the master; --root becomes the cache directory. Empty (default) = run as master.")
+	modeFlag := fs.String("mode", getEnv("ZDDC_MODE", "cache"),
+		"Client mode: \"proxy\" (forward upstream live, no disk persistence), \"cache\" (default; persist responses on access), \"mirror\" (phase 3). Ignored when --upstream is empty.")
+	bearerFileFlag := fs.String("bearer-file", os.Getenv("ZDDC_BEARER_FILE"),
+		"Path to a 0600 file containing the master-issued token forwarded as Authorization: Bearer to upstream. See /.tokens on the master to issue one. Ignored when --upstream is empty.")
+	skipTLSVerifyFlag := fs.Bool("skip-tls-verify", os.Getenv("ZDDC_SKIP_TLS_VERIFY") == "1",
+		"Accept self-signed / untrusted TLS certs from the upstream. Distinct from --no-auth. Intended for dev or internal-CA scenarios only.")
 	opaURLFlag := fs.String("opa-url", getEnv("ZDDC_OPA_URL", "internal"),
 		"Policy decider endpoint: \"internal\" (built-in Go evaluator, default), \"http(s)://host:port\", or \"unix:///path/to/socket\".")
 	opaFailOpenFlag := fs.Bool("opa-fail-open", os.Getenv("ZDDC_OPA_FAIL_OPEN") == "1",
@@ -157,6 +175,10 @@ func Load(args []string) (Config, error) {
 		AccessLog: *accessLogFlag,
 		Insecure: *insecureFlag,
 		NoAuth: *noAuthFlag,
+		Upstream: *upstreamFlag,
+		Mode: *modeFlag,
+		BearerFile: *bearerFileFlag,
+		SkipTLSVerify: *skipTLSVerifyFlag,
 		OPAURL: *opaURLFlag,
 		OPAFailOpen: *opaFailOpenFlag,
 		OPACacheTTL: *opaCacheTTLFlag,
@@ -189,7 +211,14 @@ func Load(args []string) (Config, error) {
 	// accessible to anonymous callers. The vast majority of operators do not
 	// want that — and the few who do (a deliberately public archive) can pass
 	// --insecure to acknowledge it. See zddc/README.md § Access control.
-	if !cfg.Insecure {
+	//
+	// Skipped in client mode (cfg.Upstream != ""): the cache directory
+	// starts empty by design, so a missing .zddc is not a security
+	// concern — the cache layer doesn't evaluate ACLs locally
+	// (upstream filtering is the boundary; --no-auth on a client
+	// formalizes that). The directory will fill in as files are
+	// fetched, and any cached .zddc files come straight from upstream.
+	if !cfg.Insecure && cfg.Upstream == "" {
 		if _, err := os.Stat(filepath.Join(cfg.Root, ".zddc")); os.IsNotExist(err) {
 			return Config{}, fmt.Errorf(
 				"no %s/.zddc file found; the served tree would be publicly accessible to anonymous callers. "+
@@ -245,7 +274,12 @@ func Load(args []string) (Config, error) {
 	// behind an authenticating reverse proxy. Refuse to start when binding
 	// plain HTTP to a non-loopback interface unless the operator has
 	// explicitly acknowledged the deployment shape.
-	if cfg.TLSMode == "none" && !isLoopbackAddr(cfg.Addr) && !*insecureDirectFlag {
+	//
+	// In client mode (Upstream set), the local instance never reads the
+	// email header to make decisions — auth is forwarded as a Bearer
+	// token to upstream and the local instance trusts upstream's
+	// filtering. So this check doesn't apply.
+	if cfg.Upstream == "" && cfg.TLSMode == "none" && !isLoopbackAddr(cfg.Addr) && !*insecureDirectFlag {
 		return Config{}, fmt.Errorf(
 			"--tls-cert=none binds plain HTTP to %q which trusts %s headers from any client; "+
 				"either use TLS (omit --tls-cert or supply a cert), bind to loopback (127.0.0.1: or [::1]:), "+
@@ -253,6 +287,25 @@ func Load(args []string) (Config, error) {
 			cfg.Addr, cfg.EmailHeader)
 	}
 
+	// Client-mode validation. Only enforced when --upstream is set;
+	// the same flags are silently ignored in master mode.
+	if cfg.Upstream != "" {
+		switch cfg.Mode {
+		case "proxy", "cache", "mirror":
+			// ok
+		case "":
+			cfg.Mode = "cache"
+		default:
+			return Config{}, fmt.Errorf("--mode must be \"proxy\", \"cache\", or \"mirror\"; got %q", cfg.Mode)
+		}
+		if !strings.HasPrefix(cfg.Upstream, "http://") && !strings.HasPrefix(cfg.Upstream, "https://") {
+			return Config{}, fmt.Errorf("--upstream %q must start with http:// or https://", cfg.Upstream)
+		}
+		if strings.HasSuffix(cfg.Upstream, "/") {
+			cfg.Upstream = strings.TrimRight(cfg.Upstream, "/")
+		}
+	}
+
 	return cfg, nil
 }
 
@@ -279,6 +332,10 @@ func Usage(w io.Writer) {
 	fs.Bool("insecure-direct", false, "Allow plain HTTP on non-loopback addresses.")
 	fs.Bool("insecure", false, "Allow startup with no root .zddc file (publicly accessible). Default: refuse.")
 	fs.Bool("no-auth", false, "Skip ACL enforcement entirely. On master: anyone reads everything. On client: trust upstream's filtering. Distinct from --insecure.")
+	fs.String("upstream", "", "Master URL — when set, run as a downstream proxy/cache/mirror; --root becomes the cache directory. Empty (default) = master.")
+	fs.String("mode", "cache", "Client mode: proxy / cache / mirror. Ignored when --upstream is empty.")
+	fs.String("bearer-file", "", "Path to a 0600 file holding the master-issued bearer token forwarded to upstream. Ignored when --upstream is empty.")
+	fs.Bool("skip-tls-verify", false, "Accept self-signed / untrusted upstream TLS certs. Distinct from --no-auth. Dev / internal-CA scenarios only.")
 	fs.String("opa-url", "internal", "Policy decider: \"internal\", \"http(s)://...\", or \"unix:///...\".")
 	fs.Bool("opa-fail-open", false, "External OPA: allow on transport error (default: deny / fail closed).")
 	fs.Duration("opa-cache-ttl", time.Second, "External OPA: per-decision cache TTL (default 1s; 0 disables).")
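
Reviewer note: the client-mode validation added to Load can be exercised in isolation. The following is a minimal sketch, not code from this patch. It assumes a hypothetical standalone helper, normalizeClient, that applies the same three rules as the hunk above (default an empty mode to "cache", reject unknown modes, require an http(s) scheme and trim trailing slashes) and passes everything through untouched when no upstream is set.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeClient is a hypothetical helper (not part of the patch). It mirrors
// the client-mode checks in config.Load for a single upstream/mode pair.
func normalizeClient(upstream, mode string) (string, string, error) {
	if upstream == "" {
		// Master mode: the client flags are ignored, so return them unchanged.
		return upstream, mode, nil
	}
	switch mode {
	case "proxy", "cache", "mirror":
		// Known mode, nothing to do.
	case "":
		mode = "cache"
	default:
		return "", "", fmt.Errorf("--mode must be \"proxy\", \"cache\", or \"mirror\"; got %q", mode)
	}
	if !strings.HasPrefix(upstream, "http://") && !strings.HasPrefix(upstream, "https://") {
		return "", "", fmt.Errorf("--upstream %q must start with http:// or https://", upstream)
	}
	// TrimRight removes every trailing slash, so no HasSuffix guard is needed.
	return strings.TrimRight(upstream, "/"), mode, nil
}

func main() {
	inputs := [][2]string{
		{"https://master.example.com/", ""},
		{"ftp://master.example.com", "cache"},
		{"https://master.example.com", "turbo"},
	}
	for _, in := range inputs {
		u, m, err := normalizeClient(in[0], in[1])
		fmt.Printf("upstream=%q mode=%q err=%v\n", u, m, err)
	}
}
```

A table-driven test over a helper of this shape would cover the validation paths without constructing a full Config.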