ZDDC/helm
ZDDC da4754b6ef feat(convert): bwrap engine as production default
Replaces the always-spawn-an-OCI-container model with a per-call
bubblewrap sandbox. Pandoc and chromium binaries are baked into the
zddc-server runtime image; each conversion runs them under bwrap's
Linux-namespace isolation. No daemon, no socket, no privileged outer
container, no OCI image pull at conversion time.

Why: the OCI engine paid ≈ 350 MB image pulls + 400 MB persistent
storage + ~300 ms per-conversion startup, plus required either an
on-host daemon socket (zddc-RCE → host-RCE in one hop) or nested
container privileges. bwrap gets the same sandbox properties
(--unshare-all, ro-bind /usr, tmpfs /tmp, clearenv, no-network) at
~5 ms per call and zero external dependencies. This is the same
primitive Flatpak uses for every app launch — battle-tested at scale
for "untrusted-input, short-lived, isolated."

Runner abstraction:
- `Runner.Run` signature: image string → ToolSpec{Image, Binary}.
  Both fields populated by entry points; whichever engine is
  installed reads the one it needs.
- `bwrapRunner` (new): assembles bwrap argv via `buildBwrapArgs`
  helper (testable in isolation), spawns bwrap with the binary.
- `containerRunner` (renamed conceptually to "legacy fallback"):
  unchanged behavior, still reachable for hosts that prefer OCI
  containers per conversion.

Probe order in health.Probe: bwrap → podman → docker. First hit wins.
Engine kinds in Capabilities: "bwrap" | "podman" | "docker". The
no-engine error message now lists all three.

Config (cmd/zddc-server):
- new --convert-pandoc-binary  / ZDDC_CONVERT_PANDOC_BINARY  (default "pandoc")
- new --convert-chromium-binary / ZDDC_CONVERT_CHROMIUM_BINARY (default "chromium-browser")
- existing --convert-pandoc-image / --convert-chromium-image kept
  for the OCI engine, doc updated to clarify they only apply there.
- --convert-engine helptext lists bwrap first.

Images:
- New `zddc/runtime.Containerfile` — alpine + bubblewrap + pandoc-cli +
  chromium + font-noto. Documents build/publish workflow.
- helm/zddc-server-prod/values.yaml.example: runtimeImage default
  switched to a placeholder for the new bundled runtime image; bare
  alpine NO LONGER works for /.convert (clearly called out in the
  comment).
- bitnest dev: /var/lib/zddc-dev-build/Containerfile mirrors the
  production runtime image. Quadlet at /etc/containers/systemd/
  zddc.container drops the podman-socket mount (no longer needed)
  and sets ZDDC_CONVERT_ENGINE=bwrap explicitly to avoid silent
  downgrades if a stray podman ends up on PATH.

Tests:
- convert_test.go: fakeRunner / recordingRunner now record ToolSpec.
- New TestToolSpecPopulation pins that both Image and Binary are
  filled by every entry point.
- New TestBwrapArgs_SandboxFlagsPresent / MountTranslation /
  RejectsBadMountSpec lock in the bwrap argv shape — a refactor that
  drops a hardening flag or misroutes a mount fails this loud.

Docs:
- AGENTS.md § "Server-side document conversion" rewritten around
  the bwrap-first model with podman/docker as legacy fallbacks.
- ARCHITECTURE.md convert reference updated.
- internal/convert package doc reflects the two-engine probe order.

Verified end-to-end on bitnest: probe reports
  engine=bwrap pandoc_binary=pandoc chromium_binary=chromium-browser
on startup. All 15 Go test packages green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 17:42:28 -05:00
zddc-server-cache fix(client): plug confused-deputy bind in client mode 2026-05-08 10:03:51 -05:00
zddc-server-dev helm: add zddc-server-cache example chart + ZDDC_NO_AUTH on prod/dev 2026-05-08 08:33:01 -05:00
zddc-server-prod feat(convert): bwrap engine as production default 2026-05-18 17:42:28 -05:00
README.md helm: add zddc-server-cache example chart + ZDDC_NO_AUTH on prod/dev 2026-05-08 08:33:01 -05:00

Helm charts

Three example charts for deploying zddc-server on Kubernetes. All three compile zddc-server from source in an init container, so the zddc-server binary is never pulled from a registry or built ahead of time. The init container clones the repo at a configured git ref and runs go build; the main container is plain alpine plus the freshly built static binary. The exception is the prod chart, whose values.yaml.example now sets runtimeImage to a placeholder for the bundled runtime image, because bare alpine no longer works for /.convert.

Charts

| Chart | When to use |
| --- | --- |
| zddc-server-prod/ | Production master. Pin zddc.gitRef to a stable tag (zddc-server-vX.Y.Z). Slower probe cadence; image-pull policy IfNotPresent. Mounts the data PVC directly RW at ZDDC_ROOT. The token system is enabled automatically (tokens persist on the data PVC at <ZDDC_ROOT>/.zddc.d/tokens/); operators visit /.tokens to issue them. |
| zddc-server-dev/ | Development / soak master. Tracks main by default; helm upgrade triggers a pod recreate so each rollout pulls the latest commit. Faster probes; debug-level logging (request headers are logged, which is sensitive). Wraps the data PVC in OverlayFS (lower = PVC mounted RO, upper = ephemeral emptyDir) so dev-side writes never mutate the underlying store. Use this shape when the dev replica points at the same data as prod. |
| zddc-server-cache/ | Downstream client (proxy / cache / mirror) of an upstream master. Set zddc.upstream.url + zddc.upstream.mode; the binary skips master-side machinery and forwards all requests to the master, persisting responses under the cache PVC (in cache or mirror modes). Bearer auth via a separately-created Kubernetes Secret. Use cases: corporate-master → DR-mirror, vendor-scoped mirror in a vendor's own cluster, regional edge cache, dev environment that mirrors prod read-only. Mirror mode adds an access-triggered subtree walker. |

The prod and dev chart values are nearly identical; the differences are encoded as defaults in each chart's values.yaml.example. The dev chart's overlay-isolation layer is a structural difference, not a values-level toggle — see zddc-server-dev/templates/deployment.yaml for the privileged init container and the data-readonly / overlay-scratch / data volume sandwich.
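The volume sandwich can be pictured as a single OverlayFS mount. The following is a minimal sketch, not the chart's actual init script: the mount paths are hypothetical (only the volume names data-readonly, overlay-scratch, and data come from the text above), and it only prints the command rather than running it. OverlayFS itself requires that workdir be an empty directory on the same filesystem as upperdir, which is why both live under the scratch volume here.

```shell
# Sketch only: mount paths are assumptions, not the chart's real ones.
# Builds the option string for an OverlayFS mount with a read-only lower
# layer (the PVC) and an ephemeral upper layer (the emptyDir).
overlay_opts() {
  lower=$1     # read-only PVC mount, used as lowerdir
  scratch=$2   # emptyDir that holds both upper/ and work/
  printf 'lowerdir=%s,upperdir=%s/upper,workdir=%s/work\n' \
    "$lower" "$scratch" "$scratch"
}

# Prints the mount command a privileged init container could run.
overlay_mount_cmd() {
  printf 'mount -t overlay overlay -o %s %s\n' \
    "$(overlay_opts "$1" "$2")" "$3"
}

overlay_mount_cmd /data-readonly /overlay-scratch /data
```

Writes through the merged directory land in the upper layer, so the PVC behind lowerdir is never modified, and the upper layer disappears with the pod.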

The cache chart shares the same source-build pattern but adds client-mode env wiring (ZDDC_UPSTREAM, ZDDC_MODE, ZDDC_BEARER_FILE, ZDDC_NO_AUTH, ZDDC_SKIP_TLS_VERIFY, mirror-mode subtree config), a Recreate strategy (single-instance, because multiple replicas would race on the cache directory), and TCP-socket probes (HTTP probes against / would fail when the upstream is down and the cache is empty).

Quick start

# Pre-requisite: a PersistentVolumeClaim for ZDDC_ROOT data
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: zddc-root
spec:
  accessModes: [ReadWriteMany]    # or RWO if single replica is fine
  resources: { requests: { storage: 100Gi } }
  storageClassName: your-shared-fs   # NFS, CephFS, SMB, etc.
EOF

# Production install
cp helm/zddc-server-prod/values.yaml.example my-prod-values.yaml
$EDITOR my-prod-values.yaml          # set zddc.gitRef, hostnames, etc.
helm install zddc-server-prod helm/zddc-server-prod/ -f my-prod-values.yaml

# Dev install (tracks main HEAD)
cp helm/zddc-server-dev/values.yaml.example my-dev-values.yaml
$EDITOR my-dev-values.yaml
helm install zddc-server-dev helm/zddc-server-dev/ -f my-dev-values.yaml

# Trigger a rebuild from latest main HEAD (dev chart)
helm upgrade zddc-server-dev helm/zddc-server-dev/ -f my-dev-values.yaml

# Cache install (downstream client of an upstream master)
#
#   1) Issue a bearer token on the master at https://<master>/.tokens
#   2) Create the Secret (do NOT put the token in values.yaml):
kubectl create secret generic zddc-cache-bearer \
  --from-literal=token='<paste-token-here>'   # quoted so the shell does not treat < > as redirections

#   3) Create a cache PVC (separate from the master's data PVC; can
#      be smaller — sized to the working set you expect to mirror):
kubectl apply -f - <<'PVC'
apiVersion: v1
kind: PersistentVolumeClaim
metadata: { name: zddc-cache }
spec:
  accessModes: [ReadWriteOnce]
  resources: { requests: { storage: 50Gi } }
  storageClassName: your-block-storage
PVC

#   4) Install the chart, pointing at your master:
cp helm/zddc-server-cache/values.yaml.example my-cache-values.yaml
$EDITOR my-cache-values.yaml      # set zddc.upstream.url, mode, etc.
helm install zddc-server-cache helm/zddc-server-cache/ -f my-cache-values.yaml

What the chart does and doesn't do

Does:

  • Clones the configured zddc.gitRepo at zddc.gitRef in an init container, builds the Go binary, copies it to a shared emptyDir, and starts the main container against that binary.
  • Wires the ZDDC_* environment-variable contract (root path, addr, email header, CORS allowlist, log level, index path).
  • Mounts a caller-supplied PersistentVolumeClaim at ZDDC_ROOT (prod chart) or as the OverlayFS lowerdir behind a merged ZDDC_ROOT (dev chart).
  • Optionally creates an Ingress (ingress.enabled: true).
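The clone-and-build step in the first bullet might look roughly like the sketch below. It is illustrative only: the repo URL, the /src and /shared paths, and the ./cmd/zddc-server package path are assumptions rather than values taken from the chart templates, and the function prints the commands instead of executing them.

```shell
# Sketch only: URL, paths, and package location are assumptions.
# Prints the commands such an init container might run; executes nothing.
build_plan() {
  repo=$1   # value of zddc.gitRepo
  ref=$2    # value of zddc.gitRef
  out=$3    # shared emptyDir mounted into both containers
  printf 'git clone %s /src\n' "$repo"
  printf 'git -C /src checkout %s\n' "$ref"
  # CGO_ENABLED=0 disables cgo, which typically yields a static binary.
  printf 'cd /src && CGO_ENABLED=0 go build -o %s/zddc-server ./cmd/zddc-server\n' "$out"
}

build_plan https://example.invalid/zddc.git zddc-server-vX.Y.Z /shared
```

The main container would then start the binary from the same shared directory.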

Does not:

  • Create the PVC. Operators provision storage themselves; the chart only references it by name.
  • Manage TLS for the pod. zddc-server runs in plain HTTP mode behind whatever ingress / authenticating reverse proxy the cluster already has. ZDDC_TLS_CERT=none and ZDDC_INSECURE_DIRECT=1 are hardcoded in the templates because the chart is opinionated about the TLS-terminated-upstream deployment shape.
  • Authenticate users. zddc-server reads the user's email from a header set by the upstream proxy (X-Auth-Request-Email by default). The chart does not deploy oauth2-proxy / nginx-auth-request / Pomerium / etc. — bring your own.
  • Manage secrets. values.yaml.example contains no secrets and never should. ACL email lists belong in .zddc files inside the data volume; image-pull credentials and TLS certs (if you enable ingress TLS) reference Kubernetes secrets you've created separately.

Why build from source instead of using a registry image

Three reasons:

  1. Reproducibility. The init container's logs show exactly which git ref was built. There's no opaque "what did I deploy" question that a registry tag can introduce.
  2. One distribution channel. Codeberg release-asset binaries already exist for direct downloads; the chart compiles its own binary from the same source git ref so there's nothing extra to maintain (no separate image registry, no image-promotion pipeline).
  3. Smaller blast radius. A compromised build image affects only pods that pull during the compromise window. A compromised registry image stays compromised across rollbacks until the digest is rotated.

The cost: every pod start spends 30-60s on clone + go build instead of pulling a pre-baked image. That is acceptable for the prod and dev audiences: production rollouts are infrequent, and dev rollouts trade build time for tracking-main convenience.
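The reproducibility argument only holds when zddc.gitRef names a stable tag rather than a moving branch. A pre-install check could be sketched as follows; the helper name is hypothetical, the version 1.2.3 is a made-up example, and treating every ref outside the zddc-server-vX.Y.Z shape as unpinned is an assumption of this sketch.

```shell
# Sketch only: is_release_tag is a hypothetical helper, not part of the chart.
# Succeeds when the ref has the zddc-server-vX.Y.Z shape named in the chart table.
is_release_tag() {
  printf '%s\n' "$1" | grep -Eq '^zddc-server-v[0-9]+\.[0-9]+\.[0-9]+$'
}

for ref in zddc-server-v1.2.3 main; do
  if is_release_tag "$ref"; then
    echo "$ref: pinned release tag"
  else
    echo "$ref: moving ref, builds may differ between pod starts"
  fi
done
```

A check like this fits the prod chart; the dev chart tracks main on purpose, so a moving ref is expected there.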

Linting

helm lint helm/zddc-server-prod/
helm lint helm/zddc-server-dev/
helm lint helm/zddc-server-cache/

# Render to inspect (uses default values from values.yaml.example):
helm template test-prod helm/zddc-server-prod/ \
  --values helm/zddc-server-prod/values.yaml.example

helm template test-cache helm/zddc-server-cache/ \
  --values helm/zddc-server-cache/values.yaml.example
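To lint and render all three charts in one pass, the commands above can be wrapped in a loop. This is a small sketch: the HELM variable is a hypothetical override added here so the loop can be dry-run (for example with HELM=echo) on a machine without helm, and it assumes it is run from the repository root.

```shell
# Sketch only: HELM is a hypothetical override, not something the repo defines.
# Lints each chart, then renders it with its example values; stops at the
# first failure and discards the rendered manifests.
lint_all() {
  for chart in zddc-server-prod zddc-server-dev zddc-server-cache; do
    "${HELM:-helm}" lint "helm/$chart/" || return 1
    "${HELM:-helm}" template "test-$chart" "helm/$chart/" \
      --values "helm/$chart/values.yaml.example" >/dev/null || return 1
  done
}
```

Run lint_all to use the real helm binary, or HELM=echo lint_all to list the lint invocations without running them.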