ZDDC/zddc/internal/apps/fetch.go
ZDDC 8b6a2dc3e3 feat(zddc-server): apps fetch+cache subsystem with cascade overrides
Adds internal/apps/ package serving the five tool HTMLs at virtual paths
based on the surrounding folder name convention:

  archive      every directory (multi-project, project, archive, vendor)
  classifier   any Incoming/Working/Staging directory and subtree
  mdedit       any Working directory and subtree
  transmittal  any Staging directory and subtree
  landing      only at deployment root

The current-stable build of every tool is //go:embed'd into the binary
at compile time — that's the default with zero config. Operators
override per-directory via .zddc apps: entries; closer-to-leaf wins.

Spec syntax (in any apps: value):

  stable / beta / alpha / :stable          channel
  v0.0.4 / v0.0 / v0 / :v0.0.4              version
  https://my-mirror/releases                URL prefix only
  https://my-mirror/releases:beta           URL prefix + channel
  https://my-fork/archive.html              terminal full URL
  ./local.html / /abs/path.html             terminal local path
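
The spec forms above could be classified roughly as follows. This is a
minimal sketch, not the package's parser: the Spec type, ParseSpec, and
both regular expressions are hypothetical names, and the rule for
splitting a trailing :<selector> off a URL prefix is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Spec is a hypothetical parsed form of one apps: value.
type Spec struct {
	Prefix   string // URL prefix, e.g. https://my-mirror/releases
	Selector string // channel or version, e.g. beta or v0.0.4
	Terminal string // full .html URL or local path; used as-is
}

var (
	channelRe = regexp.MustCompile(`^(stable|beta|alpha)$`)
	versionRe = regexp.MustCompile(`^v\d+(\.\d+){0,2}$`)
)

func isSelector(s string) bool {
	return channelRe.MatchString(s) || versionRe.MatchString(s)
}

func isURL(s string) bool {
	return strings.HasPrefix(s, "http://") || strings.HasPrefix(s, "https://")
}

// ParseSpec classifies raw into one of the six forms listed above.
func ParseSpec(raw string) (Spec, error) {
	switch {
	case raw == "":
		return Spec{}, fmt.Errorf("empty spec")
	case strings.HasPrefix(raw, "./") || strings.HasPrefix(raw, "/"):
		return Spec{Terminal: raw}, nil // terminal local path
	case isURL(raw):
		if strings.HasSuffix(raw, ".html") {
			return Spec{Terminal: raw}, nil // terminal full URL
		}
		// Assumption: a ":<selector>" after the last slash is a channel
		// or version; anything else (such as a port) stays in the prefix.
		if i := strings.LastIndex(raw, ":"); i > strings.LastIndex(raw, "/") && isSelector(raw[i+1:]) {
			return Spec{Prefix: raw[:i], Selector: raw[i+1:]}, nil
		}
		return Spec{Prefix: raw}, nil
	}
	// Bare channel or version, with an optional leading colon.
	sel := strings.TrimPrefix(raw, ":")
	if !isSelector(sel) {
		return Spec{}, fmt.Errorf("unrecognized spec %q", raw)
	}
	return Spec{Selector: sel}, nil
}

func main() {
	for _, raw := range []string{":stable", "v0.0", "https://my-mirror/releases:beta", "./local.html"} {
		s, err := ParseSpec(raw)
		fmt.Printf("%-34s %+v %v\n", raw, s, err)
	}
}
```

Note that this lenient split accepts a mistyped selector on a URL as
part of the prefix; a real validator would probably reject it.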

The special apps.default key provides a baseline URL prefix and channel
inherited by any app not overridden per-name. Per-axis cascade: a deeper
.zddc can override the URL, the channel, or both.

Cascade walks root→leaf; default applies first at each level, then the
per-app entry. Terminal sources (paths and full .html URLs) short-circuit
composition; deeper non-terminal entries override parent terminals.
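
One way to read the cascade rules above, as a sketch only. Entry, Level,
Resolved, and Resolve are hypothetical names, entries are assumed to be
parsed already, and the handling of a terminal that is later overridden
(the previously accumulated prefix and channel are kept) is an
assumption rather than confirmed behaviour.

```go
package main

import "fmt"

// Entry is a hypothetical pre-parsed apps: value.
type Entry struct {
	Prefix   string // URL prefix axis ("" means not set at this level)
	Selector string // channel or version axis ("" means not set)
	Terminal string // local path or full .html URL
}

// Level is the apps: map of one .zddc file, keyed by app name or "default".
type Level map[string]Entry

// Resolved is the effective source after the cascade. The zero value
// stands for the embedded stable build in this sketch.
type Resolved struct {
	Prefix, Selector, Terminal string
}

func (r *Resolved) apply(e Entry) {
	if e.Terminal != "" {
		// A terminal source short-circuits prefix/channel composition.
		r.Terminal = e.Terminal
		return
	}
	if e.Prefix == "" && e.Selector == "" {
		return
	}
	// A non-terminal entry overrides any terminal set by a parent,
	// then updates only the axes it names.
	r.Terminal = ""
	if e.Prefix != "" {
		r.Prefix = e.Prefix
	}
	if e.Selector != "" {
		r.Selector = e.Selector
	}
}

// Resolve walks levels root-first; at each level "default" applies
// before the per-app entry, so the per-app entry wins on conflicts.
func Resolve(levels []Level, app string) Resolved {
	var r Resolved
	for _, lvl := range levels {
		if e, ok := lvl["default"]; ok {
			r.apply(e)
		}
		if e, ok := lvl[app]; ok {
			r.apply(e)
		}
	}
	return r
}

func main() {
	levels := []Level{
		{"default": {Prefix: "https://my-mirror/releases", Selector: "stable"}},
		{"archive": {Selector: "beta"}},
	}
	fmt.Printf("%+v\n", Resolve(levels, "archive"))
	fmt.Printf("%+v\n", Resolve(levels, "mdedit"))
}
```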

URL sources fetch once on first request and cache forever in
<ZDDC_ROOT>/_app/<host>/<path> — different upstreams with the same
filename stay distinct. No background refresh, no SHA-256 verification:
operators delete the cache file to force a refetch. Concurrent misses
for the same source dedupe via a 30-line hand-rolled singleflight.
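
For readers unfamiliar with the pattern, a hand-rolled singleflight of
about that size might look like the sketch below. The group and call
types are hypothetical stand-ins for the package's singleflightGroup,
whose actual implementation is not shown here; the URL in main is
illustrative only.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// call tracks one in-flight invocation and its eventual result.
type call struct {
	wg  sync.WaitGroup
	val any
	err error
}

// group deduplicates concurrent calls that share a key.
type group struct {
	mu sync.Mutex
	m  map[string]*call
}

// Do runs fn at most once per key at a time. Callers that arrive while
// a call for the same key is in flight wait for it and share its result.
// This sketch does not recover from a panic in fn.
func (g *group) Do(key string, fn func() (any, error)) (any, error) {
	g.mu.Lock()
	if g.m == nil {
		g.m = make(map[string]*call)
	}
	if c, ok := g.m[key]; ok {
		g.mu.Unlock()
		c.wg.Wait() // join the in-flight call
		return c.val, c.err
	}
	c := &call{}
	c.wg.Add(1)
	g.m[key] = c
	g.mu.Unlock()

	c.val, c.err = fn()
	c.wg.Done()

	// Forget the key so a later call runs fn again.
	g.mu.Lock()
	delete(g.m, key)
	g.mu.Unlock()
	return c.val, c.err
}

func main() {
	var (
		g       group
		fetches atomic.Int32
		wg      sync.WaitGroup
	)
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			g.Do("https://my-mirror/releases/archive.html", func() (any, error) {
				fetches.Add(1)
				time.Sleep(50 * time.Millisecond) // stand-in for the upstream round trip
				return []byte("<html>"), nil
			})
		}()
	}
	wg.Wait()
	fmt.Println("upstream fetches:", fetches.Load())
}
```

Because the key is forgotten once the call completes, deduplication only
covers overlapping requests; later requests are expected to hit the
cache instead.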

Per-request override: any user can append ?v=<spec> to a tool URL
(e.g. ?v=beta, ?v=v0.0.4, ?v=:alpha, ?v=https://mirror/releases:beta)
to ask for a different build for one request. Security: ?v= serves
ONLY versions already in the cache (cache miss returns 404; path
sources are rejected outright with 400). Users cannot trigger
arbitrary upstream fetches via crafted URLs.
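
The ?v= rules above reduce to a small decision, sketched here under
stated assumptions: overrideStatus is a hypothetical helper, and the
cached callback stands in for "the build this spec resolves to is
already in the cache", which the real resolver would answer.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// overrideStatus returns the HTTP status a ?v= override would receive.
// It never triggers an upstream fetch: an uncached build is a 404.
func overrideStatus(v string, cached func(spec string) bool) int {
	switch {
	case v == "":
		return http.StatusOK // no override; normal cascade resolution applies
	case strings.HasPrefix(v, "/") || strings.HasPrefix(v, "./"):
		return http.StatusBadRequest // path sources are rejected outright
	case !cached(v):
		return http.StatusNotFound // cache miss: do not fetch on the user's behalf
	}
	return http.StatusOK
}

func main() {
	// Pretend only the beta build has been fetched so far.
	inCache := func(spec string) bool { return spec == "beta" }
	for _, v := range []string{"beta", "v0.0.4", "./local.html"} {
		fmt.Println(v, overrideStatus(v, inCache))
	}
}
```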

Failed URL fetches (network down, 5xx) fall back to embedded with a
one-time WARN log. The X-ZDDC-Source response header reports what
served: fetch:URL / cache:URL / path:/abs / embedded:<app>@<build>.

Wire-in (cmd/zddc-server/main.go): dispatch routes <dir>/<app>.html
through apps.MatchAppHTML + AppAvailableAt + apps.Server.Serve when
no real file exists. Direct URL access to /_app/... is blocked at
the dispatch layer — cached files must go through the apps resolver
so they get correct Content-Type and ACL gating.

Schema (internal/zddc/file.go): ZddcFile gains Apps map[string]string
for cascade overrides. Validator (internal/zddc/validate.go) accepts
the special "default" key alongside the five canonical app names and
all spec forms.

Removes ZDDC_APPS_* env vars (no admin UI, no refresh interval, no
upstream allow-list — the simpler model has fewer knobs).

40+ unit tests across the new package: parser shapes, cascade
resolution with default+per-app interactions, terminal short-circuit
semantics, ?v= cache-only enforcement, embedded fallback, atomic
cache writes, singleflight dedup. Plus end-to-end dispatch tests in
cmd/zddc-server/main_test.go.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:25:25 -05:00


package apps

import (
	"context"
	"fmt"
	"io"
	"log/slog"
	"net/http"
	"sync"
	"time"
)

// Fetcher pulls URL sources once, caches the body forever, and serves
// from cache on subsequent calls. Path sources don't go through here —
// the handler reads the file directly.
//
// Concurrent calls for the same URL dedupe via singleflight. There is no
// background refresh, no conditional GET, no SHA-256 verification.
type Fetcher struct {
	Cache  *Cache
	Client *http.Client
	Logger *slog.Logger

	sf            singleflightGroup
	embeddedFails sync.Map // url → struct{} (rate-limit "fell back to embedded" warnings)
}

// NewFetcher returns a Fetcher with sensible defaults: 10s timeout, no
// redirects (ops must point at the final URL).
func NewFetcher(cache *Cache, logger *slog.Logger) *Fetcher {
	if logger == nil {
		logger = slog.Default()
	}
	return &Fetcher{
		Cache:  cache,
		Logger: logger,
		Client: &http.Client{
			Timeout: 10 * time.Second,
			// Return the 3xx response itself instead of following it;
			// fetchOnce then rejects it as a non-2xx status.
			CheckRedirect: func(*http.Request, []*http.Request) error {
				return http.ErrUseLastResponse
			},
		},
	}
}

// Fetch returns the body for url. If the cache already has it, returns
// the cached bytes immediately. Otherwise fetches, caches, and returns.
// All concurrent requests for the same URL share one outbound fetch.
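func (f *Fetcher) Fetch(ctx context.Context, urlStr string) ([]byte, error) {
	if f.Cache != nil {
		// Any read error is treated as a cache miss and falls through
		// to a fetch.
		if body, err := f.Cache.Read(urlStr); err == nil {
			return body, nil
		}
	}

	// The shared fetch runs under the ctx of the caller whose closure
	// executes, so cancelling that request fails every waiter as well.
	val, err := f.sf.Do(urlStr, func() (any, error) {
		return f.fetchOnce(ctx, urlStr)
	})
	if err != nil {
		return nil, err
	}
	return val.([]byte), nil
}
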
// fetchOnce performs a single GET for urlStr, rejects non-2xx responses
// and bodies larger than maxBytes, and writes the body to the cache. A
// failed cache write is logged but does not fail the request.
func (f *Fetcher) fetchOnce(ctx context.Context, urlStr string) ([]byte, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, urlStr, nil)
	if err != nil {
		return nil, err
	}
	resp, err := f.Client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	if resp.StatusCode < 200 || resp.StatusCode >= 300 {
		return nil, fmt.Errorf("upstream %s returned HTTP %d", urlStr, resp.StatusCode)
	}

	// Read one byte past the limit so an oversized body is detected
	// rather than silently truncated.
	const maxBytes = 25 * 1024 * 1024
	body, err := io.ReadAll(io.LimitReader(resp.Body, maxBytes+1))
	if err != nil {
		return nil, err
	}
	if int64(len(body)) > maxBytes {
		return nil, fmt.Errorf("response from %s exceeds %d bytes", urlStr, maxBytes)
	}

	if f.Cache != nil {
		if err := f.Cache.Write(urlStr, body); err != nil {
			f.Logger.Warn("cache write failed; serving from response anyway",
				"url", urlStr, "err", err)
		}
	}
	return body, nil
}

// LogEmbeddedFallback emits a one-time warning when the embedded fallback
// is used for a particular source URL. Rate-limited per URL.
func (f *Fetcher) LogEmbeddedFallback(app, urlStr string, reason error) {
	if _, loaded := f.embeddedFails.LoadOrStore(urlStr, struct{}{}); loaded {
		return
	}
	f.Logger.Warn("serving embedded fallback for app HTML",
		"app", app, "url", urlStr, "reason", reason)
}