ZDDC/helm
ZDDC cef7188a77 refactor(convert): wrapper-in-image owns the sandbox; Go just exec's binaries
The bwrap engine + OCI engine that lived in internal/convert/runner.go
both leaked isolation policy into Go code. Replaced with a single
image-side wrapper that drop-in-shadows pandoc and chromium-browser on PATH.
zddc-server's only contract with the image is now "exec.Command(name,
args) gets you that tool's behavior" — sandboxing, resource caps, and
namespace setup live entirely in shell scripts shipped by the image.

Architecture:
- zddc/runtime/zddc-cgroup-init runs at container start. cgroup v2's
  "no internal processes" constraint forbids a cgroup from having both
  children and processes; the init script moves PID 1 into a child,
  enables +memory +pids in subtree_control, then exec's zddc-server.
  Best-effort: degrades cleanly to "no resource caps" if cgroupfs
  isn't writable.
- zddc/runtime/zddc-sandbox-exec is the per-call wrapper, symlinked
  from /usr/local/bin/{pandoc,chromium-browser}. Creates a transient
  cgroup v2 (memory.max + pids.max), then bubblewrap-sandboxes the
  real binary at /usr/bin/<name>: --unshare-all, --ro-bind /usr,
  --proc /proc, --tmpfs /tmp, --clearenv. Caller's scratch dir comes
  in via ZDDC_SCRATCH env and is bind-mounted at the SAME path so
  absolute paths round-trip unchanged.
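
The two scripts described above can be sketched roughly as follows. This is
an illustration under stated assumptions, not the shipped zddc-cgroup-init or
zddc-sandbox-exec: the child cgroup name "init", the function names, and the
choice to print the bwrap command instead of exec'ing it are all hypothetical.
Only the bwrap flags and the ZDDC_SCRATCH contract come from the description.

```shell
#!/bin/sh
# Hypothetical sketch; not the scripts shipped in the image.

# Container-start step: move this shell (PID 1 when used as the entrypoint)
# into a child cgroup, then enable controllers for sibling cgroups.
# Best-effort: any failure just means "no resource caps".
cgroup_init() {
  root="${1:-/sys/fs/cgroup}"
  if ! mkdir -p "$root/init" 2>/dev/null; then
    echo "cgroupfs not writable; continuing without resource caps" >&2
    return 0
  fi
  echo "$$" > "$root/init/cgroup.procs" || return 0
  echo "+memory +pids" > "$root/cgroup.subtree_control" || return 0
  # A real init script would now exec zddc-server.
}

# Per-call step: print the bwrap command line for one tool invocation.
# bwrap applies mount options in argument order, so the scratch bind comes
# after --tmpfs /tmp and stays visible even when scratch lives under /tmp.
sandbox_cmd() {
  name="$1"; shift
  if [ -z "${ZDDC_SCRATCH:-}" ]; then
    echo "ZDDC_SCRATCH must be set" >&2
    return 1
  fi
  printf '%s ' bwrap --unshare-all --clearenv \
    --ro-bind /usr /usr --proc /proc --tmpfs /tmp \
    --bind "$ZDDC_SCRATCH" "$ZDDC_SCRATCH" \
    "/usr/bin/$name" "$@"
  echo
}
```

A wrapper symlinked as /usr/local/bin/pandoc could derive the tool name from
its own invocation name, for example sandbox_cmd "$(basename "$0")" "$@".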

Go simplifications (~250 lines net deletion):
- Runner interface: Run(ctx, binary, stdin, scratchDir, cmd) — no
  ToolSpec, no mount list, no engine concept. Single localRunner
  implementation; bwrapRunner + containerRunner both deleted.
- health.Probe just looks up pandoc + chromium on PATH; Capabilities
  drops engine kinds.
- Convert.go: ToHTML/ToPDF write to a per-call scratch dir under
  TMPDIR and pass absolute paths; the wrapper bind-mounts the dir.
  No more "/tpl" / "/pdf" mount-point indirection.
- Config drops --convert-pandoc-image, --convert-chromium-image,
  --convert-engine, --convert-podman-socket (OCI engine gone) and
  --convert-cpus (CPU caps don't apply in the new model — wall-clock
  + memory + pids is the cap set). Defaults raised to match the new
  caps the user authorized: mem 512→1024 MiB, pids 100→256,
  timeout 30→60 s.

Image:
- zddc/runtime.Containerfile builds the production runtime image
  (alpine + bubblewrap + pandoc + chromium + font-noto). Two
  COPY statements pull in the wrapper scripts; ln -s symlinks the
  shadow names.
- bitnest dev image mirrors this layout under /var/lib/zddc-dev-build/.

Container privilege required:
- Nested bwrap needs the outer container to permit user + mount
  namespace creation + MS_SLAVE on root. The default seccomp +
  AppArmor profiles block all of these. Quadlet adds:
    --cap-add=ALL
    --security-opt=seccomp=unconfined
    --security-opt=apparmor=unconfined
    --security-opt=unmask=ALL
  Helm chart sets the equivalent via securityContext (capabilities.
  add: SYS_ADMIN, seccompProfile.type: Unconfined, appArmorProfile.
  type: Unconfined). Trade-off documented in AGENTS.md: zddc-server
  RCE now has near-root power within the container, but the bind-
  mount layout still bounds blast radius; bwrap is the real boundary
  between zddc-server and untrusted markdown.
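
For orientation, the securityContext fields named above would look roughly
like this on the container spec. This is a sketch only: the output file name
is hypothetical, the chart's actual template may structure these differently,
and the appArmorProfile field assumes a recent Kubernetes release.

```shell
#!/bin/sh
# Write an illustrative container-level securityContext fragment to a
# scratch file (hypothetical path) so it can be inspected or diffed against
# the rendered chart output.
out="${TMPDIR:-/tmp}/zddc-securitycontext-sketch.yaml"
cat > "$out" <<'EOF'
securityContext:
  capabilities:
    add: ["SYS_ADMIN"]
  seccompProfile:
    type: Unconfined
  appArmorProfile:
    type: Unconfined
EOF
echo "wrote $out"
```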

Tests: convert_test.go fully rewritten for the new Runner signature.
Drops TestBwrapArgs_* (functionality moved out of Go) and
TestImageTag (no more image refs). All 15 Go test packages green.

Verified live on bitnest: pandoc --version round-trip exits 0
through the wrapper; MD→DOCX produces a valid Word 2007+ file
end-to-end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 07:47:58 -05:00
zddc-server-cache fix(client): plug confused-deputy bind in client mode 2026-05-08 10:03:51 -05:00
zddc-server-dev helm: add zddc-server-cache example chart + ZDDC_NO_AUTH on prod/dev 2026-05-08 08:33:01 -05:00
zddc-server-prod refactor(convert): wrapper-in-image owns the sandbox; Go just exec's binaries 2026-05-19 07:47:58 -05:00
README.md helm: add zddc-server-cache example chart + ZDDC_NO_AUTH on prod/dev 2026-05-08 08:33:01 -05:00

Helm charts

Three example charts for deploying zddc-server on Kubernetes. All three compile zddc-server from source in an init container, so no prebuilt zddc-server image has to be pulled from a registry and no binary has to be built ahead of time. The init container clones the repo at a configured git ref and runs go build; the main container is plain alpine plus the freshly built static binary.
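
The clone-and-build step performed by the init container can be sketched as below. This is a minimal illustration, not the chart's actual init script: the repository URL, the ./cmd/zddc-server package path, the /shared output path, and the DRY_RUN switch are all assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical sketch of the init container's work.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# build_from_ref REPO REF OUT [PKG]
build_from_ref() {
  repo="$1" ref="$2" out="$3" pkg="${4:-./cmd/zddc-server}"
  # --branch accepts a branch or tag name (not a bare commit SHA).
  run git clone --depth 1 --branch "$ref" "$repo" src || return 1
  # CGO_ENABLED=0 usually yields a static binary for pure-Go code.
  # go build -C needs Go 1.20+; OUT should be absolute because -C changes
  # the working directory before the build.
  run env CGO_ENABLED=0 go build -C src -o "$out" "$pkg"
}
```

Running it with DRY_RUN=1, for example DRY_RUN=1 build_from_ref https://example.invalid/zddc.git main /shared/zddc-server, prints the two commands without needing git or a Go toolchain.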

Charts

| Chart | When to use |
|---|---|
| zddc-server-prod/ | Production master. Pin zddc.gitRef to a stable tag (zddc-server-vX.Y.Z). Slower probe cadence; image-pull policy IfNotPresent. Mounts the data PVC directly RW at ZDDC_ROOT. The token system is enabled automatically (tokens persist on the data PVC at <ZDDC_ROOT>/.zddc.d/tokens/); operators visit /.tokens to issue them. |
| zddc-server-dev/ | Development / soak master. Tracks main by default; helm upgrade triggers a pod recreate so each rollout pulls the latest commit. Faster probes; debug-level logging (request headers are logged, which is sensitive). Wraps the data PVC in OverlayFS (lower = PVC mounted RO, upper = ephemeral emptyDir) so dev-side writes never mutate the underlying store. Use this shape when the dev replica points at the same data as prod. |
| zddc-server-cache/ | Downstream client (proxy / cache / mirror) of an upstream master. Set zddc.upstream.url + zddc.upstream.mode; the binary skips master-side machinery and forwards all requests to the master, persisting responses under the cache PVC (in cache or mirror modes). Bearer auth via a separately-created Kubernetes Secret. Use cases: corporate-master → DR-mirror, vendor-scoped mirror in a vendor's own cluster, regional edge cache, dev environment that mirrors prod read-only. Mirror mode adds an access-triggered subtree walker. |

The prod and dev chart values are nearly identical; the differences are encoded as defaults in each chart's values.yaml.example. The dev chart's overlay-isolation layer is a structural difference, not a values-level toggle — see zddc-server-dev/templates/deployment.yaml for the privileged init container and the data-readonly / overlay-scratch / data volume sandwich.
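
As a rough picture of that volume sandwich, the overlay mount issued by the privileged init container might look like the command printed below. This is a sketch under assumptions: the mount paths simply reuse the volume names, and the upper/work subdirectory names are hypothetical; the real template may differ.

```shell
#!/bin/sh
# Hypothetical sketch: print the overlay mount command for a read-only lower
# layer, an ephemeral scratch volume, and the merged mount point.
# OverlayFS requires upperdir and workdir to be on the same filesystem, so
# both are placed inside the scratch volume.
overlay_mount_cmd() {
  lower="$1" scratch="$2" merged="$3"
  echo "mount -t overlay overlay -o lowerdir=$lower,upperdir=$scratch/upper,workdir=$scratch/work $merged"
}

overlay_mount_cmd /data-readonly /overlay-scratch /data
```

Writes made through the merged directory land in the scratch volume's upper layer, which is why the lower PVC stays untouched and the changes vanish with the pod.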

The cache chart shares the same source-build pattern but adds client-mode env wiring (ZDDC_UPSTREAM, ZDDC_MODE, ZDDC_BEARER_FILE, ZDDC_NO_AUTH, ZDDC_SKIP_TLS_VERIFY, mirror-mode subtree config), a Recreate strategy (the chart is single-instance; multiple replicas would race on the cache directory), and TCP-socket probes (HTTP probes against / would fail whenever the upstream is down and the cache is empty).

Quick start

# Pre-requisite: a PersistentVolumeClaim for ZDDC_ROOT data
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: zddc-root
spec:
  accessModes: [ReadWriteMany]    # or RWO if single replica is fine
  resources: { requests: { storage: 100Gi } }
  storageClassName: your-shared-fs   # NFS, CephFS, SMB, etc.
EOF

# Production install
cp helm/zddc-server-prod/values.yaml.example my-prod-values.yaml
$EDITOR my-prod-values.yaml          # set zddc.gitRef, hostnames, etc.
helm install zddc-server-prod helm/zddc-server-prod/ -f my-prod-values.yaml

# Dev install (tracks main HEAD)
cp helm/zddc-server-dev/values.yaml.example my-dev-values.yaml
$EDITOR my-dev-values.yaml
helm install zddc-server-dev helm/zddc-server-dev/ -f my-dev-values.yaml

# Trigger a rebuild from latest main HEAD (dev chart)
helm upgrade zddc-server-dev helm/zddc-server-dev/ -f my-dev-values.yaml

# Cache install (downstream client of an upstream master)
#
#   1) Issue a bearer token on the master at https://<master>/.tokens
#   2) Create the Secret (do NOT put the token in values.yaml):
kubectl create secret generic zddc-cache-bearer \
  --from-literal=token=<paste-token-here>

#   3) Create a cache PVC (separate from the master's data PVC; can
#      be smaller — sized to the working set you expect to mirror):
kubectl apply -f - <<'PVC'
apiVersion: v1
kind: PersistentVolumeClaim
metadata: { name: zddc-cache }
spec:
  accessModes: [ReadWriteOnce]
  resources: { requests: { storage: 50Gi } }
  storageClassName: your-block-storage
PVC

#   4) Install the chart, pointing at your master:
cp helm/zddc-server-cache/values.yaml.example my-cache-values.yaml
$EDITOR my-cache-values.yaml      # set zddc.upstream.url, mode, etc.
helm install zddc-server-cache helm/zddc-server-cache/ -f my-cache-values.yaml

What the chart does and doesn't do

Does:

  • Clones the configured zddc.gitRepo at zddc.gitRef in an init container, builds the Go binary, copies it to a shared emptyDir, and starts the main container against that binary.
  • Wires the ZDDC_* environment-variable contract (root path, addr, email header, CORS allowlist, log level, index path).
  • Mounts a caller-supplied PersistentVolumeClaim at ZDDC_ROOT (prod chart) or as the OverlayFS lowerdir behind a merged ZDDC_ROOT (dev chart).
  • Optionally creates an Ingress (ingress.enabled: true).

Does not:

  • Create the PVC. Operators provision storage themselves; the chart only references it by name.
  • Manage TLS for the pod. zddc-server runs in plain HTTP mode behind whatever ingress / authenticating reverse proxy the cluster already has. ZDDC_TLS_CERT=none and ZDDC_INSECURE_DIRECT=1 are hardcoded in the templates because the chart is opinionated about the TLS-terminated-upstream deployment shape.
  • Authenticate users. zddc-server reads the user's email from a header set by the upstream proxy (X-Auth-Request-Email by default). The chart does not deploy oauth2-proxy / nginx-auth-request / Pomerium / etc. — bring your own.
  • Manage secrets. values.yaml.example contains no secrets and never should. ACL email lists belong in .zddc files inside the data volume; image-pull credentials and TLS certs (if you enable ingress TLS) reference Kubernetes secrets you've created separately.

Why build from source instead of using a registry image

Three reasons:

  1. Reproducibility. The init container's logs show exactly which git ref was built. There's no opaque "what did I deploy" question that a registry tag can introduce.
  2. One distribution channel. Codeberg release-asset binaries already exist for direct downloads; the chart compiles its own binary from the same source git ref so there's nothing extra to maintain (no separate image registry, no image-promotion pipeline).
  3. Smaller blast radius. A compromised build image affects only pods that pull during the compromise window. A compromised registry image stays compromised across rollbacks until the digest is rotated.

The cost: every pod start takes 30-60s to clone and go build instead of pulling a pre-baked image. That is acceptable for the charts' intended audiences: production rollouts are infrequent, and dev rollouts trade build time for tracking-main convenience.

Linting

helm lint helm/zddc-server-prod/
helm lint helm/zddc-server-dev/
helm lint helm/zddc-server-cache/

# Render to inspect (uses default values from values.yaml.example):
helm template test-prod helm/zddc-server-prod/ \
  --values helm/zddc-server-prod/values.yaml.example

helm template test-cache helm/zddc-server-cache/ \
  --values helm/zddc-server-cache/values.yaml.example