Today v0.0.19 surfaced a real failure mode: varasys → codeberg push-
mirror is `sync_on_commit: true`, but a transient codeberg 504 mid-
push left 2 of 8 tags un-replicated. BMC chart's Dockerfile fetches
zddc-server-v<X.Y.Z> from codeberg (no egress to git.varasys.io),
so the bumped chart fired BMC pipelines that immediately failed at
`git fetch refs/tags/zddc-server-v0.0.19`. Mirror's next periodic
push (8h default) would self-heal — but by then dev was broken.
Make the stable-cut deterministic: before bumping the chart, force
the push-mirror via the Forgejo API and poll codeberg until all 8
lockstep tags are visible. Fail the job (and skip the chart bump)
if codeberg is genuinely unreachable after 5 min — operator triages
manually rather than triggering downstream builds against a stale
codeberg.
Uses ${{ github.token }} (Forgejo Actions auto-injected) for the
push_mirrors-sync API call. If that token turns out to lack admin
scope on this repo (Forgejo specifics around runner-token perms
vary), the failure will be a clear 401/403 on the curl — switch
to a dedicated CHART_FORGEJO_TOKEN-style secret then.
Local repro:
FORGEJO_TOKEN=$FORGEJO_TOKEN curl -X POST \
-H "Authorization: token $FORGEJO_TOKEN" \
https://git.varasys.io/api/v1/repos/VARASYS/ZDDC/push_mirrors-sync
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the long-standing chart-bump race that required a manual rebase
on every beta cut. Three coordinated changes:
build (top-level): broaden the existing stable-only "fold embedded
artifacts into a release commit" block to also fire on beta cuts.
Same idempotency check; new commit message ("chore(embedded): cut
v<X.Y.Z>-beta") derived via _coordinated_next_stable. Tagging stays
stable-only (channels are mutable mirrors and never get tags). Beta
cuts now produce exactly one commit on main; HEAD always carries
the bytes the binary will serve.
shared/build-lib.sh: drop the SHA from alpha/beta channel labels.
Embedding HEAD's SHA in the bytes the SHA identifies created a
feedback loop — each auto-commit advanced HEAD, which shifted the
SHA in the next run's versions.txt, which triggered another
embedded commit, ad infinitum. Channel labels now read
"v<X.Y.Z>-<channel> · <date>" — version + date is enough; SHA
traceability lives in the chart's appVersion (full SHA) and the
binary's --version output. Plain dev builds keep the timestamp +
-dirty fingerprint since they don't commit. Stable cuts already
use a clean version-only label.
.forgejo/scripts/notify-chart-bump.sh: pin the chart's appVersion
to `git rev-parse HEAD` instead of the SHA in versions.txt. The
build's auto-commit now ensures HEAD == "the commit containing the
embedded bytes the binary will bake," so HEAD is the substantively
correct anchor. The previous versions.txt read pinned one commit
too early (the source-side commit, before the embed refresh
committed) — every beta cut required a manual chart-rebase to
point at the embed commit. With both halves landed, the cycle is
zero-touch: ./build beta + git push → auto-bump CI fires → chart
appVersion at correct SHA → dev image bakes the right bytes.
Verification: ran ./build beta twice on the same source state. First
run produced one commit; second run printed "no embedded changes to
commit (re-run on same source state)" and made no commit. The label
SHA-loop bug is fixed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Forgejo's uploadpack.allowAnySHA1InWant only matches FULL SHAs —
the 7-char short SHA from build-label / versions.txt produces
"couldn't find remote ref ae75855" on `git fetch --depth=1 origin
<short-sha>` (and `git clone --branch <short-sha>` fails the same
way). The chart's downstream Dockerfile uses fetch-by-ref to handle
SHAs as well as named refs, but only full SHAs go through.
Resolve the short SHA to its full form via `git rev-parse` in
notify-chart-bump.sh before writing the chart's appVersion. The
runner has the full git history (actions/checkout@v4 fetch-depth: 0),
so rev-parse works locally; the chart's appVersion becomes the
canonical 40-char SHA, and the BMCD pipeline-dev / pipeline-prod
fetches succeed cleanly.
Manually re-bumped chart develop after this commit:
appVersion: 0.0.16-beta-ae75855 → 0.0.16-beta-ae758550a855f6a9507df08075475cb87cb67086
Chart version was monotonically incrementing PATCH on every chart-bump
— after ~50 betas + a few stable cuts, version was already at 0.2.12.
Triple digits would land within a year of active dev.
Switch the stable-cut branch of notify-chart-bump.sh to bump MINOR
and reset PATCH to 0:
stable: 0.2.X → 0.3.0
beta: 0.3.0 → 0.3.1, 0.3.1 → 0.3.2, ...
next stable: 0.3.X → 0.4.0
Patches stay bounded (≈ betas-per-stable, not all-time). MINOR keeps
incrementing so JFrog chart-repo uploads stay accepted (it rejects
duplicate chart-version numbers — a literal "reset to 0.2.0" cycle
would break uploads on the second stable cut).
Chart version is purely JFrog packaging metadata; the zddc-server
version users actually care about lives in appVersion.
./build beta runs locally at HEAD=N, generates embed/*.html with
build-label SHA=N, then the operator commits the generated bytes
as commit N+1. After push, git HEAD on the runner = N+1 but the
served website's BUILD_LABEL still encodes N (it's baked into the
HTML at build time, before the commit).
Previously the bump script computed BETA_VERSION using
`git rev-parse --short HEAD` = N+1, so the chart's appVersion
(and the dev image's tag, and the kubelet's pull) said N+1 while
the served label said N. Two SHAs for the same bytes — confusing
when triaging "is this image actually the latest?".
Read the SHA from zddc/internal/apps/embedded/versions.txt instead
(third pipe-delimited field of any line). That's the single source
of truth for what bytes were baked, and it lines up with what users
see in every tool's header.
Manually re-bumped chart develop after committing this script:
appVersion: 0.0.16-beta-9a3e4d8 → 0.0.16-beta-8df0def
Both notify-chart-dev.yml and the notify-chart-prod job in
deploy-release.yml were carrying ~80 lines of inline shell each,
duplicating the clone-bump-push flow. Extract to a single script:
.forgejo/scripts/notify-chart-bump.sh <beta|stable> [VERSION]
Three benefits:
1. **Locally testable**. The script is invocable directly:
CHART_FORGEJO_TOKEN=$FORGEJO_TOKEN \
.forgejo/scripts/notify-chart-bump.sh beta
No more "push to main and watch what the runner does" debug loop.
2. **Manual escape hatch**. When CI is broken, the same script is
how we recover. The 0.0.16-beta-1ddd331 chart bump preceding
this commit was performed via this very script.
3. **Runner-quirk-immune**. The previous three commits chased a
Forgejo runner v12.9.0 phantom-SIGPIPE bug that would only
surface under the runner's `bash -e -o pipefail {0}` wrapper.
A real script with its own `#!/usr/bin/env bash` and explicit
error handling sidesteps the wrapper entirely.
The workflow YAMLs shrink to checkout + run-script. No GITHUB_OUTPUT
plumbing, no inline if/else gates, no shell flag overrides. The
behavior is identical to the prior inline versions.
The chart repo (BMCD/tnd-zddc-chart) is mirrored Forgejo→GitHub
one-way (we set this up so the chart matches the same canonical-
on-Forgejo pattern as the public repos). When notify-chart-prod
and notify-chart-dev pushed directly to GitHub, the bump landed
on GitHub but Forgejo never got it — and the next time Forgejo's
push-mirror ran, it force-overwrote GitHub's bump with Forgejo's
older state. Symptom: prod stuck at v0.0.9 even after auto-bump
appeared to succeed; manual investigation showed Chart.yaml
appVersion was actually still 0.0.10 (the previous manual bump
that DID land on Forgejo).
Fix: clone+push to Forgejo (git.varasys.io/BMCD/tnd-zddc-chart)
instead of GitHub. Forgejo's mirror replicates to GitHub on the
next sync — going through the canonical-Forgejo path keeps both
sides in sync. Uses a new CHART_FORGEJO_TOKEN secret (separate
from CHART_GITHUB_TOKEN, which is no longer needed for these
workflows but kept for any future direct-GitHub use case).
Per the bake-in invariant: when no active beta exists, dev tracks
stable. Previously notify-chart-prod only bumped chart's main, so
a stable cut updated BMCD prod but left dev one cut behind until
the next beta or manual dispatch.
Now the job loops through {main, develop} and pushes the same
appVersion bump to both. Each branch push triggers its own BMCD
pipeline (pipeline-prod for main, pipeline-dev for develop), so
prod + dev both rebuild against the new ZDDC stable in parallel.
notify-chart-dev.yml continues to handle the beta-cut path
(advances develop ahead of main between stable cuts).
Multi-line git commit message bodies broke YAML parsing — pipe blocks
end on unindented lines, so the body lines starting at column 0 were
being interpreted by Forgejo's YAML parser as keys, yielding:
yaml: line 158: could not find expected ':'
Switch to repeated `-m` flags (one per paragraph). Same end result
in git log; valid YAML.
Closes the loop on the user-described workflow:
1. Iterate on tools / cut alpha → no chart involvement.
2. `./build beta` → embedded/ commits to ZDDC main →
notify-chart-dev.yml pushes a chart appVersion bump to
burnsmcd/tnd-zddc-chart's develop branch → BMCD pipeline-dev
fires automatically → dev image rebuilt with new beta bytes
baked in.
3. `./build release` → tag pushed → existing deploy-release.yml's
new notify-chart-prod job pushes a chart appVersion bump to
burnsmcd/tnd-zddc-chart's main branch → BMCD pipeline-prod
fires automatically → prod image rebuilt with new stable bytes.
The chart repo IS still committed to (one Chart.yaml line, auto-
generated by either workflow), but no human ever touches it for
routine ZDDC releases. The chart commits are idempotent (skip if
appVersion already at target) and clearly marked as bot-generated.
The truly chart-commit-free version would require either (a)
BMCD's private helm-deploy-latest reusable to accept --set overrides
we'd compute, or (b) bypassing it entirely with our own helm step.
Both are deeper changes than this PR; this is the simplest reliable
solution within the existing reusable.
Auth: a new repo-scoped Forgejo Actions secret CHART_GITHUB_TOKEN
holds the classic GitHub PAT (already provisioned for the
Forgejo→GitHub mirror; same token, repo+workflow scopes,
SAML-SSO authorized for burnsmcd). The bot identity is
'ZDDC Release Bot <noreply@zddc.varasys.io>'.
Tested behavior:
- Workflow files are added by THIS commit. Pushing this commit
does not fire either workflow (notify-chart-prod requires a
tag; notify-chart-dev requires changes under
zddc/internal/apps/embedded/). Safe to land before testing.
- First real test fires on the next ZDDC stable cut or beta cut.
Runner now runs in a quadlet container on caddy-net, so 127.0.0.1
is the runner's own loopback. Reach the Caddy container by name
('caddy') with --connect-to keeping SNI/Host as the public hostname
so the right vhost matches.
Also adds the tag trigger: push of zddc-server-v[0-9]+.[0-9]+.[0-9]+
auto-cuts a stable release. The lockstep set pushes six tags at once;
filtering on zddc-server-v* gives exactly one workflow run per cut.
Re-cutting at the tagged commit is safe — _promote_stable in
shared/build-lib.sh is idempotent re: tag creation.
Forgejo Actions workflow that runs ./build alpha|beta|release [version]
followed by ./deploy --releases. Uses the host-mode runner so the
behavior is identical to manual cuts. Tag-trigger added later once
the dispatch path is exercised.