fix(convert): pass --userns=host to inner podman so nested invocations don't trip newuidmap
When zddc-server runs inside a Kubernetes pod and shells out to `podman run`, the inner podman tries to set up its own user namespace via /usr/bin/newuidmap. The mapping fails inside the pod's namespace even with privileged: true:

    newuidmap: write to uid_map failed: Invalid argument
    Error: cannot set up namespace using "/usr/bin/newuidmap": exit status 1

Adding --userns=host to the inner `podman run` tells it to reuse the caller's user namespace instead of creating a new one, so newuidmap isn't invoked. The chart already runs the pod privileged so reusing its userns adds no new privilege; --cap-drop=ALL + --network=none + --read-only + --tmpfs continue to isolate the inner container.

On a bare-metal host invocation, --userns=host means "no userns remapping at all", which is the default for rootful podman and works identically to the prior behavior, so the bitnest test setup and any laptop dev runs are unaffected.

Smoke-tested locally with the exact flag set: pandoc/latex:latest in a --userns=host --read-only container produces valid HTML from `# Hello world` on stdin.
parent ab552c8c1b
commit dfdd767536
1 changed file with 14 additions and 0 deletions
@@ -203,6 +203,20 @@ func (cr *containerRunner) Run(ctx context.Context, image string, stdin []byte,
 		"--rm",
 		"--pull=missing",
 		"-i",
+		// --userns=host: reuse the calling process's user namespace
+		// instead of creating a new one. Required for the nested-
+		// podman case (zddc-server runs inside a Kubernetes pod and
+		// invokes podman from there): the kernel won't let the inner
+		// podman set up its own userns via newuidmap when /etc/subuid
+		// mappings don't resolve through the pod's namespace, even
+		// with CAP_SETUID via privileged: true. The chart already
+		// runs the pod privileged, so reusing its userns adds no new
+		// privilege escalation. On a bare-metal host invocation the
+		// outer userns is the host's, so --userns=host means "no
+		// userns remapping", which is also fine; --cap-drop=ALL +
+		// --network=none + --read-only continue to isolate the
+		// inner container's process.
+		"--userns=host",
 		"--network=none",
 		"--read-only",
 		"--tmpfs=/tmp:size=128m,exec",
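For readers without the surrounding file, the following is a minimal sketch of how an argument list like the one in the hunk might be assembled before being handed to `podman`. It is not the code from `containerRunner.Run`: the helper names `podmanRunArgs` and `indexOf` are hypothetical, the position of `--cap-drop=ALL` is an assumption (the commit message mentions the flag but the hunk does not show it), and the trailing `--to=html` argument is only illustrative.

```go
package main

import (
	"fmt"
	"strings"
)

// podmanRunArgs returns an argument list for a sandboxed `podman run`.
// Hypothetical helper: the flags and their relative order follow the hunk,
// except --cap-drop=ALL, whose position here is an assumption.
func podmanRunArgs(image string, cmd ...string) []string {
	args := []string{
		"run",
		"--rm",
		"--pull=missing",
		"-i",
		// Reuse the caller's user namespace so newuidmap is never invoked.
		"--userns=host",
		"--network=none",
		"--read-only",
		"--tmpfs=/tmp:size=128m,exec",
		"--cap-drop=ALL",
		image,
	}
	// Anything after the image name is passed to the container's entrypoint.
	return append(args, cmd...)
}

// indexOf reports the position of want in args, or -1 if it is absent.
func indexOf(args []string, want string) int {
	for i, a := range args {
		if a == want {
			return i
		}
	}
	return -1
}

func main() {
	args := podmanRunArgs("pandoc/latex:latest", "--to=html")
	// Print the command line instead of executing it, so the sketch runs
	// on machines without podman installed.
	fmt.Println("podman " + strings.Join(args, " "))
}
```

Every option has to come before the image name, because podman treats whatever follows the image as the command for the container. That is why `--userns=host` sits in the flag block and is not appended at the end.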