fix(convert): pass --userns=host to inner podman so nested invocations don't trip newuidmap

When zddc-server runs inside a Kubernetes pod and shells out to
`podman run`, the inner podman tries to set up its own user namespace
via /usr/bin/newuidmap. The mapping fails inside the pod's namespace
even with privileged: true:

  newuidmap: write to uid_map failed: Invalid argument
  Error: cannot set up namespace using "/usr/bin/newuidmap": exit status 1

Adding --userns=host to the inner `podman run` tells it to reuse the
caller's user namespace instead of creating a new one, so newuidmap
is never invoked. The chart already runs the pod privileged, so reusing
its userns adds no new privilege; --cap-drop=ALL, --network=none,
--read-only and --tmpfs continue to isolate the inner container.

On a bare-metal host invocation, --userns=host means "no userns
remapping at all", which is already the default for rootful podman, so
behavior is identical to before: the bitnest test setup and any
laptop dev runs are unaffected.

Smoke-tested locally with the exact flag set: pandoc/latex:latest in
a --userns=host --read-only container produces valid HTML from
`# Hello world` on stdin.
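
For reference, the flag set and smoke test described above can be
sketched in Go roughly as follows. This is an illustration under stated
assumptions, not the code in this commit: innerPodmanArgs and hasArg
are hypothetical helpers, the position of --cap-drop=ALL and the pandoc
arguments (--from=markdown, --to=html) are guesses, and the real
containerRunner.Run may pass additional flags.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
	"strings"
)

// innerPodmanArgs is a hypothetical helper that assembles the arguments
// for the inner `podman run`. The flags from --rm through --tmpfs mirror
// the diff; --cap-drop=ALL is taken from the commit message, and its
// position here is an assumption.
func innerPodmanArgs(image string, cmd ...string) []string {
	args := []string{
		"run",
		"--rm",
		"--pull=missing",
		"-i",
		"--userns=host", // reuse the caller's userns; newuidmap is not needed
		"--network=none",
		"--read-only",
		"--tmpfs=/tmp:size=128m,exec",
		"--cap-drop=ALL",
		image,
	}
	return append(args, cmd...)
}

// hasArg reports whether want appears verbatim in args.
func hasArg(args []string, want string) bool {
	for _, a := range args {
		if a == want {
			return true
		}
	}
	return false
}

func main() {
	// Assumed pandoc arguments: read Markdown on stdin, write HTML to stdout.
	args := innerPodmanArgs("pandoc/latex:latest", "--from=markdown", "--to=html")

	if _, err := exec.LookPath("podman"); err != nil {
		// No podman on this machine: show the command instead of running it.
		fmt.Println("podman not found; would run: podman", strings.Join(args, " "))
		return
	}

	c := exec.Command("podman", args...)
	c.Stdin = bytes.NewBufferString("# Hello world\n")
	out, err := c.CombinedOutput()
	if err != nil {
		fmt.Printf("podman failed: %v\n%s", err, out)
		return
	}
	fmt.Printf("%s", out)
}
```

Keeping argument construction in a pure function makes the flag set
checkable in a unit test without a container runtime on the test host.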
ZDDC 2026-05-13 12:06:51 -05:00
parent ab552c8c1b
commit dfdd767536


@@ -203,6 +203,20 @@ func (cr *containerRunner) Run(ctx context.Context, image string, stdin []byte,
"--rm",
"--pull=missing",
"-i",
+// --userns=host: reuse the calling process's user namespace
+// instead of creating a new one. Required for the nested-
+// podman case (zddc-server runs inside a Kubernetes pod and
+// invokes podman from there): the kernel won't let the inner
+// podman set up its own userns via newuidmap when /etc/subuid
+// mappings don't resolve through the pod's namespace, even
+// with CAP_SETUID via privileged: true. The chart already
+// runs the pod privileged, so reusing its userns adds no new
+// privilege. On a bare-metal host invocation the
+// outer userns is the host's, so --userns=host means "no
+// userns remapping", which is also fine; --cap-drop=ALL +
+// --network=none + --read-only continue to isolate the
+// inner container's process.
+"--userns=host",
"--network=none",
"--read-only",
"--tmpfs=/tmp:size=128m,exec",