Phase 3 + 4 live two-instance smoke tests against the synthetic
~/zddc-test-data fixture surfaced three real bugs that the unit
tests missed. All three are fixed in this commit.
1. walker: filenames with spaces/parens land on disk percent-encoded
walkSubtree was passing the URL-encoded child URL (built via
url.PathEscape) to fetchFileIfNeeded → cachePathFor, so a file
named "Foo (IFI) - Bar.md" landed at <root>/.../Foo%20%28IFI%29
%20-%20Bar.md on disk. Then purgeOrphans iterated os.ReadDir
(which sees the encoded names) and compared against upstreamNames
(decoded names from the listing JSON). Every fetched file was
classified as an orphan and immediately deleted: a 180-file walk
produced "fetched=180 purged=111" with only 70 files remaining.
Fix: walker now maintains two parallel path strings — dirURL
(URL-encoded for HTTP requests) and dirPath (decoded for disk
keys). fetchFileIfNeeded, fetchListing, persistOnly, and
purgeOrphans all take the decoded path. listingCachePathFor
gets dirPath too. Smoke confirmed: dirs=29 files=180 fetched=179
purged=0 (one file already cached from the user's GET that
triggered the walk).
2. outbox: replay loop sleeps 5min after eager startup pass
RunReplayLoop's idle-poll interval is 5min. After the eager
startup pass with 0 entries, the loop sleeps 5min — even if a
PUT-while-offline arrives 1 second later, replay won't fire for
~5 min. The cache returned 202 promptly but the queued write sat
on disk until either a 5min nap elapsed or another PUT happened.
Fix: Outbox gains a wake chan (buffered=1, drop-on-full).
Enqueue posts to it after writing meta.json. RunReplayLoop selects
on wake alongside the timer, so a new offline write triggers an
immediate replay attempt. Smoke confirmed: PUT queued at T+0,
master back at T+3, replay completes at T+3 (was previously a
30s wait through the timer-based poll).
3. master: PUT/DELETE didn't honor If-Unmodified-Since
The cache's outbox sends If-Unmodified-Since: <cached-mtime> on
replay so the master can reject conflicting writes with 412. The
master's checkIfMatch only evaluated If-Match (ETag-based), so
the cache's mtime-based precondition was silently ignored. Result:
an offline PUT staged before an external mod would clobber the
newer external content on replay — silent data loss in the exact
scenario the outbox is designed to detect.
Fix: checkIfMatch now also evaluates If-Unmodified-Since per
RFC 7232 §3.4, returning 412 when the file's current mtime is
strictly later than the header value (1-second resolution to
match HTTP-Date precision). Smoke confirmed: cache GET → external
mod via direct file write → cache offline PUT → master back →
replay sends IUS → master 412 → outbox entry renamed to
<id>.conflict-<RFC3339>/ → master content preserved (the
external mod, not the stale offline write).
Also added an info-level "outbox: replay attempt" log to tryReplay
so an operator watching the cache logs sees the replay loop is
alive even when every entry defers (transport error). Previously
the loop was silent unless a replay actually completed (200) or
conflicted (412).
go vet + go test ./... + go test -race ./internal/{cache,auth,handler}/...
all green. Synthetic ~/zddc-test-data fixture (553 files, 144 PDFs)
exercises the walker against realistic ZDDC filenames including
spaces, parens, and accented characters that the unit tests'
"a.txt" / "b.txt" inputs never hit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PUT / POST / DELETE in client mode now work end-to-end. Online: the
cache layer forwards to upstream and (on success) drops any cached
entry for the path so the next read fetches fresh. PUT/DELETE include
If-Unmodified-Since derived from the cached file's mtime so the master
can reject conflicting writes with 412 Precondition Failed.
When upstream is unreachable, the request is captured in the outbox
at <root>/.zddc-outbox/<id>/ — directory per queued write, mode 0700,
containing meta.json (method, RawURI, Content-Type, base mtime,
queued-at) and body.bin (request body, capped at 256 MiB). The client
gets 202 Accepted + X-ZDDC-Cache: queued and a JSON envelope.
A background replay loop started by runClient processes the queue:
- 2xx → delete entry; drop cached path so next read fetches fresh
- 412 → rename to <id>.conflict-<RFC3339>/ for manual reconciliation
(body + meta intact for inspection or re-submit)
- 4xx other → drop (retry won't help; logged at WARN)
- 5xx / transport error → leave for next pass
Replay schedule: eager at startup, then 30s while pending falling
back to 5min while idle. Loop honors graceful-shutdown context.
Disabled in --mode=proxy (proxy persists nothing by design — offline
writes return 503 instead of queueing).
Outbox IDs are <unix-nano-base16>-<hex-random> so lex-sort = queue
order; concurrent enqueues never collide. Conflict-rename appends a
4-char random suffix on the unlikely same-second collision.
The local cache is intentionally not updated for offline writes:
until upstream confirms the user reads still see the upstream-cached
version (or 503 if uncached). Trade-off: no "did my queued write
actually win?" ambiguity, at the cost of not seeing one's own
offline edits immediately. Phase 5 will surface .conflict-<ts>/
directories in browse views.
Tests (20 new in outbox_test.go, 5 new in cache_test.go covering
the write path): NewOutbox creates 0700 dir, Enqueue persists meta
+ body, Pending returns lex-sorted entries excluding conflicts,
Replay deletes on 2xx / renames on 412 / leaves on transport error
/ leaves on 5xx / drops on 4xx-other, IUS sent only for PUT/DELETE
with base mtime, query string preserved, ServeHTTP online write
forwards + evicts cache, ServeHTTP offline write queues with 202,
ServeHTTP offline + no outbox returns 503, ServeHTTP PUT sends IUS
from cached mtime, oversize body rejected, IDs lex-sortable,
RunReplayLoop stops on context cancel, concurrent Enqueue 30×
no collisions. Full suite + go vet clean.
Doc updates: zddc/README.md gains a "Writes (online + offline
outbox)" subsection covering both paths and replay outcomes;
"What client mode is NOT, yet" now lists only conflict UI and
multi-tenancy. AGENTS.md client-mode pipeline gains writes +
mirror-mode bullets. ARCHITECTURE.md adds a "Writes: outbox +
offline replay" subsection with the trade-off rationale and the
phase-5-deferred conflict UI hand-off.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
--mode mirror layers an access-triggered walker on top of the cache
pipeline. When an incoming request's URL falls under one of the
configured --mirror-subtree paths, the scheduler kicks off a recursive
walk of that subtree iff (a) no walk for that subtree is in flight and
(b) now - last_walk_at >= --mirror-min-interval (default 1h). Walks
run in a goroutine; the user's request never blocks on scheduling.
Why access-triggered: a naive "walk on a fixed timer" would produce
thundering-herd polls on a master from many vendor mirrors most of
which are idle most of the time. Demand-triggering means idle mirrors
generate zero upstream traffic until someone hits them; active
mirrors stay current as a side effect of normal use.
The walk:
1. Recursively fetches JSON listings under the subtree, persisting
each at <dir>/.zddc-listing.json so directory browsing works
offline for walked subtrees.
2. For each file, fires a conditional If-Modified-Since GET (bounded
parallelism; default 4 concurrent) — 304 no-op, 200 overwrites,
403/404 purges the local cache.
3. After enumeration, per-directory orphan purge: local files absent
from upstream's filtered listing are removed (handles upstream
deletes + ACL revocations).
State persists at <root>/.zddc-mirror-state.json as
{subtrees: {<path>: {last_walk_at}}}. In-flight tracking is in-memory
only — a crash mid-walk lets the next access retry without manual
cleanup. Subtree path matching is longest-prefix-wins; "/" is a
catch-all (full mirror, the default when --mode=mirror is set without
explicit --mirror-subtree).
The cache layer also gained directory-listing caching (independent of
mirror mode but enabled by it). Directories are now stored at
<dir>/.zddc-listing.<html|json> sidecars, varied by Accept header.
Hit/miss/offline semantics mirror the file pipeline. Phase 2's
limitation that directories always proxied live (no offline browse)
is now resolved for any directory the user has visited or that mirror
mode has walked.
Mirror scope falls out of auth: the walker uses the local instance's
bearer, so it sees exactly what the user can see at upstream. Admin
bearer → full mirror; vendor bearer → vendor's permitted subtree;
no code distinguishes the cases.
New flags (also as ZDDC_* env vars), ignored when --mode != mirror:
- --mirror-subtree <csv> — repeatable subtrees (comma-separated);
empty + --mode=mirror = "/" (full mirror)
- --mirror-min-interval <duration> — default 1h
Tests (15 new in walker_test.go, 3 new in cache_test.go): subtree
normalization, longest-prefix matching, root-as-catch-all, walk
fetches all files in scope, out-of-scope URLs are no-op, rate-
limiting prevents double-walks within min-interval, walks re-fire
after interval elapses, orphan purge removes local-only files,
state file survives restart, concurrent triggers don't double-walk,
end-to-end ServeHTTP-kicks-mirror-on-access, listing format varies
by Accept, listing offline serves stale, persisted state atomic
write + corrupt-input handling. Full suite + go vet clean.
Doc updates: zddc/README.md flags table gains the two new entries
plus a "Mirror mode (access-triggered subtree walker)" subsection
with trigger semantics and properties; the "What client mode is NOT,
yet" list shrinks accordingly. AGENTS.md env-var table gains the
two new entries. ARCHITECTURE.md "Master + proxy/cache/mirror"
section now documents the walker scheduler / walk algorithm / state
file in a "Mirror walker (access-triggered)" subsection.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
zddc-server can now run as a downstream client of another zddc-server.
Set --upstream <url> and the master-side machinery (archive index, apps
server, watcher, OPA decider, ACL middleware, token store) is bypassed
entirely; cmd/zddc-server/main.go short-circuits to runClient(cfg)
which uses zddc/internal/cache/Cache as the entire request handler.
Three modes via --mode <proxy|cache|mirror>:
- proxy: forward upstream live, no disk persistence
- cache (default): persist responses on access; subsequent hits serve
from disk + background If-Modified-Since revalidate
- mirror: accepted but currently behaves like cache; the access-
triggered walker lands in phase 3
Cache directory layout is intentionally a normal ZDDC root: a file
fetched from <master>/foo/bar.txt is stored at <root>/foo/bar.txt with
no sidecar metadata. The local file's mtime is set to the upstream's
Last-Modified header so revalidation reflects the master's notion of
file age, not local fetch time. Running zddc-server --root <cache-dir>
without --upstream serves the cached files as a plain master — useful
for portable offline snapshots. A small .zddc-upstream marker is
written once on first persist for provenance.
Pipeline (GET/HEAD only — writes deferred):
- Hit → http.ServeContent serves directly (range-aware, 304-aware) +
background revalidate (304 no-op, 200 overwrite, 403/404 purge)
- Miss → forward to upstream with the configured bearer; tee response
body to client + tmp-file atomically renamed into the cache
- Network error + cached → serve stale + X-ZDDC-Cache: offline
- Network error + no cache → 503 + X-ZDDC-Cache: offline
- Directories always proxy live (no listing cache yet — phase 3)
- Cache-Control: no-store / private and non-200 responses bypass cache
Range requests work end-to-end (Range/If-Range headers forwarded on
miss; http.ServeContent handles them natively on hit). Hop-by-hop
headers per RFC 7230 §6.1 are dropped from forwarded responses.
New flags (also as ZDDC_* env vars), all ignored when --upstream is
empty (so master deployments are untouched):
- --upstream <url>
- --mode proxy|cache|mirror (default cache)
- --bearer-file <path> (0600 file with the master-issued token)
- --skip-tls-verify (separate from --no-auth; for self-signed dev)
Validation: --upstream must be http(s)://...; trailing / is trimmed.
Mode validated to one of the three known values. The startup
no-root-.zddc check is skipped in client mode (the cache directory
starts empty by design). The plain-HTTP-on-non-loopback check is also
skipped (the local instance never reads the email header to decide
anything; auth is forwarded to upstream as a Bearer).
Tests: zddc/internal/cache/cache_test.go runs httptest.NewServer as
the upstream and covers miss-then-hit, proxy-mode-no-persist,
directory-never-cached, HEAD-no-body, offline-with-cache,
offline-no-cache → 503, bearer forwarding, query-string preservation,
no-store bypass, path-traversal rejection, error-status forwarding,
revalidate-on-403/404/200/304, range-on-hit, concurrent-same-URL,
cache-path boundary cases. 23 new tests, full suite + go vet clean.
Live two-instance smoke verified: master at 127.0.0.1:18443, client
at :18444 with --mode cache, miss→hit→hit transitions work, file
materialises under cache root with parent dirs created, marker file
written once, range-on-hit returns 206, master sees background 304s
on every hit, killing master leaves cached files serving from disk
and never-cached files returning 503 + offline header.
Doc updates: zddc/README.md gains a "Client mode" section with the
modes table, flag reference, pipeline summary, two-instance recipe,
and explicit list of phase-2 limitations; AGENTS.md adds the four
new env vars to the reference table and a "Client mode" subsection
with smoke-test recipe and a pointer to the cache package;
ARCHITECTURE.md adds "Master + proxy/cache/mirror" before "Bearer
token issuance," covering the topology, the persist/warm switches,
the cache-IS-a-ZDDC-root invariant, the request pipeline, and the
v1-out-of-scope multi-tenancy note; CLAUDE.md's zddc/ entry
expanded to mention both deployment shapes so future agents pick it
up by default.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>