Phase 3 + 4 live two-instance smoke tests against the synthetic
~/zddc-test-data fixture surfaced three real bugs that the unit
tests missed. All three are fixed in this commit.
1. walker: filenames with spaces/parens land on disk percent-encoded
walkSubtree was passing the URL-encoded child URL (built via
url.PathEscape) to fetchFileIfNeeded → cachePathFor, so a file
named "Foo (IFI) - Bar.md" landed at <root>/.../Foo%20%28IFI%29
%20-%20Bar.md on disk. Then purgeOrphans iterated os.ReadDir
(which sees the encoded names) and compared against upstreamNames
(decoded names from the listing JSON). Every fetched file was
classified as an orphan and immediately deleted: a 180-file walk
produced "fetched=180 purged=111" with only 70 files remaining.
Fix: walker now maintains two parallel path strings — dirURL
(URL-encoded for HTTP requests) and dirPath (decoded for disk
keys). fetchFileIfNeeded, fetchListing, persistOnly, and
purgeOrphans all take the decoded path. listingCachePathFor
gets dirPath too. Smoke confirmed: dirs=29 files=180 fetched=179
purged=0 (one file already cached from the user's GET that
triggered the walk).
2. outbox: replay loop sleeps 5min after eager startup pass
RunReplayLoop's idle-poll interval is 5min. After the eager
startup pass with 0 entries, the loop sleeps 5min — even if a
PUT-while-offline arrives 1 second later, replay won't fire for
~5 min. The cache returned 202 promptly but the queued write sat
on disk until either a 5min nap elapsed or another PUT happened.
Fix: Outbox gains a wake chan (buffered=1, drop-on-full).
Enqueue posts to it after writing meta.json. RunReplayLoop selects
on wake alongside the timer, so a new offline write triggers an
immediate replay attempt. Smoke confirmed: PUT queued at T+0,
master back at T+3, replay completes at T+3 (was previously a
30s wait through the timer-based poll).
3. master: PUT/DELETE didn't honor If-Unmodified-Since
The cache's outbox sends If-Unmodified-Since: <cached-mtime> on
replay so the master can reject conflicting writes with 412. The
master's checkIfMatch only evaluated If-Match (ETag-based), so
the cache's mtime-based precondition was silently ignored. Result:
an offline PUT staged before an external mod would clobber the
newer external content on replay — silent data loss in the exact
scenario the outbox is designed to detect.
Fix: checkIfMatch now also evaluates If-Unmodified-Since per
RFC 7232 §3.4, returning 412 when the file's current mtime is
strictly later than the header value (1-second resolution to
match HTTP-Date precision). Smoke confirmed: cache GET → external
mod via direct file write → cache offline PUT → master back →
replay sends IUS → master 412 → outbox entry renamed to
<id>.conflict-<RFC3339>/ → master content preserved (the
external mod, not the stale offline write).
Also added an info-level "outbox: replay attempt" log to tryReplay
so an operator watching the cache logs sees the replay loop is
alive even when every entry defers (transport error). Previously
the loop was silent unless a replay actually completed (200) or
conflicted (412).
go vet + go test ./... + go test -race ./internal/{cache,auth,handler}/...
all green. Synthetic ~/zddc-test-data fixture (553 files, 144 PDFs)
exercises the walker against realistic ZDDC filenames including
spaces, parens, and accented characters that the unit tests'
"a.txt" / "b.txt" inputs never hit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>