ZDDC/zddc/internal/handler/directory.go
ZDDC a01315fd00 feat(server): reference Rego, parity test, decision cache, listing ETags
Phase 2 enhancements to the policy decider, plus listing-level ETags
that benefit every deployment regardless of decider mode.

Reference Rego policy
---------------------
internal/policy/rego/access.rego mirrors InternalDecider's semantics
exactly — bottom-up walk, deny-first within a level, default-deny when
HasAnyFile=true, glob matching with @-boundary semantics (special-cased
bare "*" because OPA's glob.match treats empty delimiters
inconsistently for that pattern).

Embedded into the binary via go:embed; --print-rego dumps it to stdout
so federal customers standing up an external OPA can use it as a
parity-tested baseline:

    zddc-server --print-rego > /etc/opa/policies/zddc-access.rego

Parity test runner
------------------
parity_test.go imports the OPA Go module as a TEST-ONLY dependency
(github.com/open-policy-agent/opa@v0.70.0). Every fixture from the
internal Go evaluator's test set runs through both implementations;
any divergence fails CI. The test-only import means production
binaries (built by `go build ./cmd/zddc-server`) stay OPA-free —
release-flag binary size unchanged at ~13 MB.

The parity test caught a real bug on first run: bare "*" patterns
didn't match through OPA's glob.match with empty delimiters. Fixed
in access.rego with a special-case rule. This is exactly the kind of
subtle drift the parity guard exists to catch.

External-mode decision cache
----------------------------
HTTPDecider is now wrapped in a cachingDecider with a default 1s TTL.
Bursty patterns like .archive listings (one OPA round-trip per entry
before, one per (email, decision-input) tuple per TTL window after)
amortize cleanly. Verified: 20 identical /D/ requests produce 1 OPA
hit with cache, 40 hits without (each listing makes 2 ACL queries).

ZDDC_OPA_CACHE_TTL knob (default 1s) lets operators tune. 0 disables.
1s matches the fsnotify watcher debounce window — staleness is
bounded the same way other policy-edit propagation already is.
Internal mode unchanged; the in-process Go evaluator is already
cheaper than a cache lookup would be.

Listing ETags
-------------
GET / (project list) and GET /<dir>/ (directory listing JSON) now
carry content-hash ETag + Cache-Control: private, max-age=0,
must-revalidate. SHA-256 of the rendered JSON, truncated to 16 hex
chars (64 bits — collision risk on a listing of any realistic size
is vanishingly small).

Server-side caching deliberately not added: it would require
mtime-based invalidation, and Azure Files SMB mounts (a common
deployment substrate) don't support fsnotify reliably. The
content-hash ETag delivers the bandwidth savings (304 on identical
fetches) without depending on watcher correctness — the hash is the
actual response, so it can't lie about staleness regardless of
underlying watcher behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 17:46:24 -05:00

173 lines
6.2 KiB
Go

package handler
import (
"crypto/sha256"
"encoding/hex"
"encoding/json"
"log/slog"
"net/http"
"os"
"path/filepath"
"strings"
"codeberg.org/VARASYS/ZDDC/zddc/internal/apps"
"codeberg.org/VARASYS/ZDDC/zddc/internal/config"
appfs "codeberg.org/VARASYS/ZDDC/zddc/internal/fs"
"codeberg.org/VARASYS/ZDDC/zddc/internal/policy"
"codeberg.org/VARASYS/ZDDC/zddc/internal/zddc"
)
// listingETag returns a hex-encoded SHA-256 prefix of the rendered JSON
// listing body. Truncated to 16 chars (64 bits) — collisions on a
// listing of any realistic size are vanishingly unlikely, and the short
// header keeps the wire footprint trim.
func listingETag(body []byte) string {
h := sha256.Sum256(body)
return hex.EncodeToString(h[:8])
}
// safeJoin joins fsRoot and relPath, then verifies the result is under fsRoot.
// Returns ("", false) if relPath would escape fsRoot.
func safeJoin(fsRoot, relPath string) (string, bool) {
abs := filepath.Join(fsRoot, filepath.FromSlash(relPath))
if !strings.HasPrefix(abs, fsRoot+string(filepath.Separator)) && abs != fsRoot {
return "", false
}
return abs, true
}
// ServeDirectory handles a request for a directory path.
// If Accept: application/json → return Caddy-compatible JSON listing.
// Otherwise → return minimal HTML.
func ServeDirectory(cfg config.Config, w http.ResponseWriter, r *http.Request) {
urlPath := r.URL.Path
if !strings.HasSuffix(urlPath, "/") {
http.Redirect(w, r, urlPath+"/", http.StatusMovedPermanently)
return
}
email := EmailFromContext(r)
decider := DeciderFromContext(r)
ctx := r.Context()
// Compute relative dir path (no leading or trailing slash)
dirPath := strings.TrimPrefix(urlPath, "/")
dirPath = strings.TrimSuffix(dirPath, "/")
// ACL check on this directory itself.
// Bypassed at the root path: the landing page is a public project
// picker. Per-project filtering inside fs.ListDirectory still hides
// directories the caller can't reach.
absDir, ok := safeJoin(cfg.Root, dirPath)
if !ok {
http.Error(w, "Not Found", http.StatusNotFound)
return
}
chain, err := zddc.EffectivePolicy(cfg.Root, absDir)
if err != nil {
slog.Warn("ACL policy error", "path", absDir, "err", err)
}
isRoot := dirPath == ""
if !isRoot {
if allowed, _ := policy.AllowFromChain(ctx, decider, chain, email, urlPath); !allowed {
http.Error(w, "Forbidden", http.StatusForbidden)
return
}
}
accept := r.Header.Get("Accept")
// For HTML requests, serve index.html if present (landing page convention)
if !strings.Contains(accept, "application/json") {
indexPath := filepath.Join(absDir, "index.html")
if info, err := os.Stat(indexPath); err == nil && !info.IsDir() {
ServeFile(w, r, indexPath)
return
}
}
// Build base URL for listing entries
baseURL := urlPath // relative URLs suffice for JSON listings
entries, err := appfs.ListDirectory(ctx, decider, cfg.Root, dirPath, email, baseURL)
if err != nil {
if os.IsNotExist(err) {
http.Error(w, "Not Found", http.StatusNotFound)
} else {
slog.Error("listing directory", "path", dirPath, "err", err)
http.Error(w, "Internal Server Error", http.StatusInternalServerError)
}
return
}
// Vary: Accept is critical — the same URL serves either the JSON
// listing or the embedded browse.html depending on the Accept
// header. Without Vary, browsers/CDNs cache one response and
// serve it for the other Accept value, breaking browse.html's
// auto-detect (which fetches the same URL with Accept: JSON).
w.Header().Set("Vary", "Accept")
if strings.Contains(accept, "application/json") {
// Content-hash ETag on the listing payload. Re-fetched on every
// request (the cascade is walked, ACL filter applied, JSON
// rendered, hashed) — that's the same server work the previous
// no-cache version did. The win is on the *response*: identical
// listings (e.g. the same vendor refreshing their archive page)
// short-circuit to 304 with no body.
//
// Crucially, this scheme tolerates unreliable filesystem
// watching (Azure SMB, network shares with delayed inotify):
// the ETag is the actual response hash, not a watcher-derived
// invalidation token, so it can never lie about content.
body, err := json.Marshal(entries)
if err != nil {
slog.Error("encoding directory listing", "err", err)
http.Error(w, "Internal Server Error", http.StatusInternalServerError)
return
}
etag := `"` + listingETag(body) + `"`
w.Header().Set("Content-Type", "application/json")
w.Header().Set("ETag", etag)
w.Header().Set("Cache-Control", "private, max-age=0, must-revalidate")
if match := r.Header.Get("If-None-Match"); match != "" && match == etag {
w.WriteHeader(http.StatusNotModified)
return
}
_, _ = w.Write(body)
return
}
// Browser HTML fallback: serve the embedded `browse` tool. It's a
// single-file SPA whose autoDetectServerMode loads the JSON listing
// for the current directory and renders it as a sortable, filterable
// tree. Same bytes that get served at /<dir>/browse.html — but at
// the bare directory URL too, so any zddc-served folder presents a
// usable file browser to anyone who navigates to it.
body := apps.EmbeddedBytes("browse")
if len(body) == 0 {
// Bootstrap state: a fresh build hasn't populated browse.html
// into the embed yet. Fall through to JSON for clients that
// will still parse it.
w.Header().Set("Content-Type", "application/json")
w.Header().Set("Cache-Control", "no-cache")
if err := json.NewEncoder(w).Encode(entries); err != nil {
slog.Error("encoding directory listing (no-embed fallback)", "err", err)
}
return
}
// ETag + max-age=0 + must-revalidate: every request re-validates and
// gets a 304 unless the binary has been redeployed (the ETag is a
// content hash, computed once at startup and memoized in apps.embed).
// Saves re-transmitting ~230 KB of browse.html on every page load
// while still picking up redeploys immediately.
etag := `"` + apps.EmbeddedETag("browse") + `"`
w.Header().Set("ETag", etag)
w.Header().Set("Cache-Control", "public, max-age=0, must-revalidate")
w.Header().Set("Content-Type", "text/html; charset=utf-8")
w.Header().Set("X-ZDDC-Source", "embedded:browse")
if match := r.Header.Get("If-None-Match"); match != "" && match == etag {
w.WriteHeader(http.StatusNotModified)
return
}
_, _ = w.Write(body)
}