ZDDC/zddc/internal/config/config.go
ZDDC a01315fd00 feat(server): reference Rego, parity test, decision cache, listing ETags
Phase 2 enhancements to the policy decider, plus listing-level ETags
that benefit every deployment regardless of decider mode.

Reference Rego policy
---------------------
internal/policy/rego/access.rego mirrors InternalDecider's semantics
exactly — bottom-up walk, deny-first within a level, default-deny when
HasAnyFile=true, glob matching with @-boundary semantics (special-cased
bare "*" because OPA's glob.match treats empty delimiters
inconsistently for that pattern).

Embedded into the binary via go:embed; --print-rego dumps it to stdout
so federal customers standing up an external OPA can use it as a
parity-tested baseline:

    zddc-server --print-rego > /etc/opa/policies/zddc-access.rego

Parity test runner
------------------
parity_test.go imports the OPA Go module as a TEST-ONLY dependency
(github.com/open-policy-agent/opa@v0.70.0). Every fixture from the
internal Go evaluator's test set runs through both implementations;
any divergence fails CI. The test-only import means production
binaries (built by `go build ./cmd/zddc-server`) stay OPA-free —
release-flag binary size unchanged at ~13 MB.

The parity test caught a real bug on first run: bare "*" patterns
didn't match through OPA's glob.match with empty delimiters. Fixed
in access.rego with a special-case rule. This is exactly the kind of
subtle drift the parity guard exists to catch.

External-mode decision cache
----------------------------
HTTPDecider is now wrapped in a cachingDecider with a default 1s TTL.
Bursty patterns like .archive listings (one OPA round-trip per entry
before, one per (email, decision-input) tuple per TTL window after)
amortize cleanly. Verified: 20 identical /D/ requests produce 1 OPA
hit with cache, 40 hits without (each listing makes 2 ACL queries).

ZDDC_OPA_CACHE_TTL knob (default 1s) lets operators tune. 0 disables.
1s matches the fsnotify watcher debounce window — staleness is
bounded the same way other policy-edit propagation already is.
Internal mode unchanged; the in-process Go evaluator is already
cheaper than a cache lookup would be.

Listing ETags
-------------
GET / (project list) and GET /<dir>/ (directory listing JSON) now
carry content-hash ETag + Cache-Control: private, max-age=0,
must-revalidate. SHA-256 of the rendered JSON, truncated to 16 hex
chars (64 bits — collision risk on a listing of any realistic size
is vanishingly small).

Server-side caching deliberately not added: it would require
mtime-based invalidation, and Azure Files SMB mounts (a common
deployment substrate) don't support fsnotify reliably. The
content-hash ETag delivers the bandwidth savings (304 on identical
fetches) without depending on watcher correctness — the hash is the
actual response, so it can't lie about staleness regardless of
underlying watcher behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 17:46:24 -05:00

337 lines
15 KiB
Go

package config
import (
"errors"
"flag"
"fmt"
"io"
"net"
"os"
"path/filepath"
"strings"
"time"
)
// Config holds all runtime configuration. Each field can be set via a
// command-line flag (--<name>) or environment variable (ZDDC_<NAME>);
// the flag takes precedence when both are present.
type Config struct {
Root string // --root / ZDDC_ROOT — served file tree (default: CWD)
Addr string // --addr / ZDDC_ADDR — bind address (default :8443)
TLSCert string // --tls-cert / ZDDC_TLS_CERT — PEM cert path; "none" = plain HTTP; empty = self-signed
TLSKey string // --tls-key / ZDDC_TLS_KEY — PEM key path
TLSMode string // computed: none/selfsigned/provided
LogLevel string // --log-level / ZDDC_LOG_LEVEL — debug/info/warn/error (default info)
IndexPath string // --index-path / ZDDC_INDEX_PATH — virtual archive prefix (default .archive)
EmailHeader string // --email-header / ZDDC_EMAIL_HEADER — auth header name (default X-Auth-Request-Email)
CORSOrigins []string // --cors-origin / ZDDC_CORS_ORIGIN — comma-separated allowlist; default empty (CORS disabled); explicit value enables
AccessLog string // --access-log / ZDDC_ACCESS_LOG — file path for tee'd JSON access log; empty = stderr only
Insecure bool // --insecure / ZDDC_INSECURE=1 — opt out of safety checks (currently: allow start without a root .zddc, leaving the tree publicly accessible)
OPAURL string // --opa-url / ZDDC_OPA_URL — policy decider endpoint: "internal" (default), "http(s)://..." (real OPA via HTTP), or "unix:///..." (OPA via Unix socket)
OPAFailOpen bool // --opa-fail-open / ZDDC_OPA_FAIL_OPEN=1 — when external OPA is unreachable, allow instead of deny (default: fail closed)
OPACacheTTL time.Duration // --opa-cache-ttl / ZDDC_OPA_CACHE_TTL — external mode only: per-decision cache TTL. Default 1s. Set 0s to disable.
}
// ErrHelpRequested is returned by Load when --help is passed; the caller
// should print Usage() to stderr and exit 0.
var ErrHelpRequested = errors.New("help requested")
// ErrVersionRequested is returned by Load when --version is passed; the
// caller should print version info and exit 0.
var ErrVersionRequested = errors.New("version requested")
// Load reads configuration from CLI flags + environment variables.
//
// Precedence (highest → lowest): command-line flag, environment variable,
// hard-coded default. Special-cases:
// - --root / ZDDC_ROOT default to the current working directory if both
// are unset, so an operator can `cd /srv/zddc && zddc-server` with
// zero config.
// - --version and --help return distinguished sentinel errors; the caller
// handles printing and exit. Pass nil for args to skip flag parsing
// entirely (used by tests that set state via env vars only).
//
// Standard usage from main.go:
//
// cfg, err := config.Load(os.Args[1:])
// if errors.Is(err, config.ErrHelpRequested) { os.Exit(0) }
// if errors.Is(err, config.ErrVersionRequested) { /* print versions */ ; os.Exit(0) }
// if err != nil { ... }
func Load(args []string) (Config, error) {
fs := flag.NewFlagSet("zddc-server", flag.ContinueOnError)
// Discard flag's own error output; we wrap and return our own.
fs.SetOutput(io.Discard)
rootFlag := fs.String("root", os.Getenv("ZDDC_ROOT"),
"Path to the served file tree. Default: ZDDC_ROOT or the current directory.")
addrFlag := fs.String("addr", getEnv("ZDDC_ADDR", ":8443"),
"Listen address (host:port). Default: ZDDC_ADDR or :8443.")
tlsCertFlag := fs.String("tls-cert", os.Getenv("ZDDC_TLS_CERT"),
"Path to a PEM TLS certificate. \"none\" disables TLS (plain HTTP). Empty means self-signed.")
tlsKeyFlag := fs.String("tls-key", os.Getenv("ZDDC_TLS_KEY"),
"Path to the matching PEM TLS private key.")
logLevelFlag := fs.String("log-level", getEnv("ZDDC_LOG_LEVEL", "info"),
"Log level: debug, info, warn, error.")
indexPathFlag := fs.String("index-path", getEnv("ZDDC_INDEX_PATH", ".archive"),
"URL segment that triggers the virtual archive index (default \".archive\").")
emailHeaderFlag := fs.String("email-header", getEnv("ZDDC_EMAIL_HEADER", "X-Auth-Request-Email"),
"HTTP header carrying the authenticated user's email.")
corsOriginFlag := fs.String("cors-origin", "",
"Comma-separated CORS allowlist. Empty (default) = CORS disabled. Set to your tool-host origin (e.g. https://tools.acme.com) only if browser-loaded pages from that origin call back into this server.")
insecureDirectFlag := fs.Bool("insecure-direct", os.Getenv("ZDDC_INSECURE_DIRECT") == "1",
"Allow plain HTTP on non-loopback addresses (only safe behind an authenticating proxy).")
insecureFlag := fs.Bool("insecure", os.Getenv("ZDDC_INSECURE") == "1",
"Allow startup with no root .zddc file (the tree is then publicly accessible). Default: refuse to start.")
opaURLFlag := fs.String("opa-url", getEnv("ZDDC_OPA_URL", "internal"),
"Policy decider endpoint: \"internal\" (built-in Go evaluator, default), \"http(s)://host:port\", or \"unix:///path/to/socket\".")
opaFailOpenFlag := fs.Bool("opa-fail-open", os.Getenv("ZDDC_OPA_FAIL_OPEN") == "1",
"External OPA only: on unreachable / non-2xx / malformed response, allow the request instead of denying. Default: fail closed.")
opaCacheTTLFlag := fs.Duration("opa-cache-ttl", parseDurationOrDefault(os.Getenv("ZDDC_OPA_CACHE_TTL"), time.Second),
"External OPA only: per-decision cache TTL. Amortizes round-trips on bursts of identical queries (e.g. .archive listing). Default 1s; set 0 to disable.")
accessLogFlag := fs.String("access-log", os.Getenv("ZDDC_ACCESS_LOG"),
"Tee structured access logs to this file (JSON, size-rotated). "+
"Default: <ZDDC_ROOT>/.zddc.d/logs/access-<hostname>.log. "+
"Set explicitly to empty (--access-log=) to disable.")
helpFlag := fs.Bool("help", false, "Print this help and exit.")
versionFlag := fs.Bool("version", false, "Print version info and exit.")
if args != nil {
if err := fs.Parse(args); err != nil {
if errors.Is(err, flag.ErrHelp) {
return Config{}, ErrHelpRequested
}
return Config{}, err
}
}
if *helpFlag {
return Config{}, ErrHelpRequested
}
if *versionFlag {
return Config{}, ErrVersionRequested
}
// CORS + AccessLog both have "unset → default; explicit-empty →
// disabled" semantics. The flag default is "" in both cases so we
// can't tell unset from explicit-empty via the flag alone —
// fs.Visit catches explicit flag use, and os.LookupEnv catches
// explicit env-var use.
corsFlagSet := false
accessLogFlagSet := false
if args != nil {
fs.Visit(func(f *flag.Flag) {
switch f.Name {
case "cors-origin":
corsFlagSet = true
case "access-log":
accessLogFlagSet = true
}
})
}
_, accessLogEnvSet := os.LookupEnv("ZDDC_ACCESS_LOG")
accessLogExplicit := accessLogFlagSet || accessLogEnvSet
cfg := Config{
Root: *rootFlag,
Addr: *addrFlag,
TLSCert: *tlsCertFlag,
TLSKey: *tlsKeyFlag,
LogLevel: *logLevelFlag,
IndexPath: *indexPathFlag,
EmailHeader: *emailHeaderFlag,
CORSOrigins: resolveCORS(corsFlagSet, *corsOriginFlag),
AccessLog: *accessLogFlag,
Insecure: *insecureFlag,
OPAURL: *opaURLFlag,
OPAFailOpen: *opaFailOpenFlag,
OPACacheTTL: *opaCacheTTLFlag,
}
// Default Root to the current working directory.
if cfg.Root == "" {
cwd, err := os.Getwd()
if err != nil {
return Config{}, fmt.Errorf("--root not set and could not determine current directory: %w", err)
}
cfg.Root = cwd
}
info, err := os.Stat(cfg.Root)
if err != nil {
return Config{}, fmt.Errorf("--root %q is not accessible: %w", cfg.Root, err)
}
if !info.IsDir() {
return Config{}, fmt.Errorf("--root %q is not a directory", cfg.Root)
}
// Refuse to start when the served tree has no root .zddc file. With no
// .zddc anywhere in the chain, AllowedWithChain falls through to its
// "HasAnyFile=false → allow" default, so every directory is publicly
// accessible to anonymous callers. The vast majority of operators do not
// want that — and the few who do (a deliberately public archive) can pass
// --insecure to acknowledge it. See zddc/README.md § Access control.
if !cfg.Insecure {
if _, err := os.Stat(filepath.Join(cfg.Root, ".zddc")); os.IsNotExist(err) {
return Config{}, fmt.Errorf(
"no %s/.zddc file found; the served tree would be publicly accessible to anonymous callers. "+
"Create a starter .zddc (at minimum: `admins: [you@yourcompany.com]`) "+
"or pass --insecure (or ZDDC_INSECURE=1) to acknowledge a deliberately-public deployment",
cfg.Root)
} else if err != nil {
return Config{}, fmt.Errorf("could not stat %s/.zddc: %w", cfg.Root, err)
}
}
// Audit-log default: if neither flag nor env was explicitly set,
// default to <Root>/.zddc.d/logs/access-<hostname>.log so the
// server captures an audit trail by default. Setting the flag/env
// to empty (--access-log=) is the explicit opt-out. Hostname is
// in the filename because operators typically run multiple zddc-
// server replicas against the same dataset (the .zddc.d directory
// is shared FS), and per-host filenames keep the JSON streams
// separable for downstream auditors.
if !accessLogExplicit {
host, herr := os.Hostname()
if herr != nil || host == "" {
host = "unknown"
}
cfg.AccessLog = filepath.Join(cfg.Root, ".zddc.d", "logs",
"access-"+host+".log")
}
// Determine TLS mode.
switch {
case cfg.TLSCert == "none":
cfg.TLSMode = "none"
case cfg.TLSCert == "" && cfg.TLSKey == "":
cfg.TLSMode = "selfsigned"
default:
cfg.TLSMode = "provided"
}
if cfg.TLSMode == "provided" && (cfg.TLSCert == "") != (cfg.TLSKey == "") {
return Config{}, errors.New("--tls-cert and --tls-key must both be set or both be empty")
}
// Plain HTTP mode trusts the email header from any client. Only safe
// behind an authenticating reverse proxy. Refuse to start when binding
// plain HTTP to a non-loopback interface unless the operator has
// explicitly acknowledged the deployment shape.
if cfg.TLSMode == "none" && !isLoopbackAddr(cfg.Addr) && !*insecureDirectFlag {
return Config{}, fmt.Errorf(
"--tls-cert=none binds plain HTTP to %q which trusts %s headers from any client; "+
"either use TLS (omit --tls-cert or supply a cert), bind to loopback (127.0.0.1: or [::1]:), "+
"or pass --insecure-direct to confirm an authenticating reverse proxy is in front",
cfg.Addr, cfg.EmailHeader)
}
return cfg, nil
}
// Usage prints the flag list to w (stderr is the conventional caller).
// Returned format mirrors `flag.PrintDefaults` plus a one-line summary.
func Usage(w io.Writer) {
fmt.Fprintln(w, "Usage: zddc-server [flags]")
fmt.Fprintln(w, "")
fmt.Fprintln(w, "Each flag has an equivalent ZDDC_* environment variable; the flag wins on conflict.")
fmt.Fprintln(w, "ZDDC_ROOT defaults to the current working directory.")
fmt.Fprintln(w, "")
fmt.Fprintln(w, "Flags:")
fs := flag.NewFlagSet("zddc-server", flag.ContinueOnError)
fs.SetOutput(w)
// Re-register flags to populate Usage output (we discard the values).
fs.String("root", "", "Path to the served file tree. Default: ZDDC_ROOT or the current directory.")
fs.String("addr", ":8443", "Listen address (host:port). Default: ZDDC_ADDR or :8443.")
fs.String("tls-cert", "", "Path to a PEM TLS certificate. \"none\" disables TLS. Empty = self-signed.")
fs.String("tls-key", "", "Path to the matching PEM TLS private key.")
fs.String("log-level", "info", "Log level: debug, info, warn, error.")
fs.String("index-path", ".archive", "URL segment for the virtual archive index.")
fs.String("email-header", "X-Auth-Request-Email", "HTTP header carrying the authenticated user's email.")
fs.String("cors-origin", "", "Comma-separated CORS allowlist. Empty (default) = CORS disabled.")
fs.Bool("insecure-direct", false, "Allow plain HTTP on non-loopback addresses.")
fs.Bool("insecure", false, "Allow startup with no root .zddc file (publicly accessible). Default: refuse.")
fs.String("opa-url", "internal", "Policy decider: \"internal\", \"http(s)://...\", or \"unix:///...\".")
fs.Bool("opa-fail-open", false, "External OPA: allow on transport error (default: deny / fail closed).")
fs.Duration("opa-cache-ttl", time.Second, "External OPA: per-decision cache TTL (default 1s; 0 disables).")
fs.String("access-log", "", "Tee structured access logs to this file (JSON, size-rotated). Default <ZDDC_ROOT>/.zddc.d/logs/access-<hostname>.log; --access-log= disables.")
fs.Bool("help", false, "Print this help and exit.")
fs.Bool("version", false, "Print version info and exit.")
fs.PrintDefaults()
}
// resolveCORS implements the precedence rules for the CORS allowlist:
// - flag explicitly set → use flag value (empty = disabled)
// - else env var explicitly set → use env value (empty = disabled)
// - else → default to nil (CORS disabled)
//
// Default-empty is intentional: the embedded-tools deployment path (the install
// default) serves tools and data from the same origin, so CORS is unneeded.
// Operators who deliberately load tools from a different origin (e.g. the
// CDN-bootstrap pattern from https://zddc.varasys.io, or self-hosted at
// https://tools.acme.com) opt in by setting the value explicitly. This avoids
// implicit cross-origin trust on third-party domains.
func resolveCORS(flagSet bool, flagValue string) []string {
if flagSet {
return parseCSV(flagValue)
}
if v, ok := os.LookupEnv("ZDDC_CORS_ORIGIN"); ok {
return parseCSV(v)
}
return nil
}
// parseCSV splits a comma-separated list and trims whitespace. Empty
// returns nil (which the middleware treats as "CORS disabled").
func parseCSV(s string) []string {
if s == "" {
return nil
}
parts := strings.Split(s, ",")
out := make([]string, 0, len(parts))
for _, p := range parts {
if t := strings.TrimSpace(p); t != "" {
out = append(out, t)
}
}
return out
}
// isLoopbackAddr reports whether addr binds only to a loopback interface.
// addr is in net.Listen form: "host:port", ":port", or "[ipv6]:port".
// ":port" means all interfaces, so it is NOT loopback.
func isLoopbackAddr(addr string) bool {
host, _, err := net.SplitHostPort(addr)
if err != nil {
return false
}
if host == "" {
return false
}
if host == "localhost" {
return true
}
ip := net.ParseIP(host)
if ip == nil {
return false
}
return ip.IsLoopback()
}
func getEnv(key, fallback string) string {
if v := os.Getenv(key); v != "" {
return v
}
return fallback
}
// parseDurationOrDefault parses a duration string ("1s", "500ms", "0", etc.).
// Returns def on empty input or parse error. Used for env-var defaults
// that need a sensible fallback rather than a hard error on typo.
func parseDurationOrDefault(s string, def time.Duration) time.Duration {
if s == "" {
return def
}
if d, err := time.ParseDuration(s); err == nil {
return d
}
return def
}