# Rust port — staged plan
This is the plan for the native-Rust direction first discussed alongside the A/B-bootloader
idea. **What changed since then:** the track format is now formally specced
(`docs/track-format.md`) with a golden-vector conformance suite (`tests/run.mjs`). That suite is
the thing that makes a port *safe* — any Rust engine has a precise, executable definition of
"correct" to validate against, the same one `engine.js` and `app.py` already pass.
## The core idea
Port from the **inside out**, lowest-risk first. The pure logic (track codec, then the
scheduler) is host-testable with zero hardware and is gated by the existing golden vectors. Only
once that's proven do we touch drivers, A/B, and the actual firmware. We do **not** rip out
CircuitPython until the Rust engine passes the vectors *and* the drivers are proven on hardware.
## Architecture: one firmware core, modular drivers per form factor
Trying many form factors (Kit, Explorer, **Grid**/Scroll Pack, …) is how we *discover the line
between core and driver*. In Rust that line is enforced by the type system instead of copied by
hand — today each CircuitPython form factor is its own ~1,500-line `app.py` clone; the Rust build
is one core crate plus a thin per-board binary.
**`pm-core` — the core (`no_std`, zero hardware):**
- the track-format codec (`rust/track-format`, Stage 1) and the scheduler/clock (Stage 2, already
`no_std` and building for RP2350),
- playback-flow (rep/end/continue, segment seams), app state, set-list model,
- the **USB-MIDI / live-sync / firmware-update protocol** logic (the SysEx opcode handling, which
is form-factor-independent).
It is host-testable and gated by the golden vectors — the same suite `engine.js` and `app.py`
pass. **This is "core."**
**Driver traits — what the core is generic over (the swappable part):** define small project
traits — `Display` (or render straight to an `embedded-graphics` `DrawTarget`), `Inputs` (yields
button / touch events), `Clicker` (audio out), `Indicator` (RGB) — and write each concrete driver
against **`embedded-hal`** bus traits (`I2c`, `SpiBus`, `OutputPin`, `DelayNs`). The core's UI code
then doesn't care whether the target is a 17×7 mono matrix or a 320×480 colour TFT.
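
A minimal host-side sketch of that split, under stated assumptions: the trait methods (`set_pixel`, `flush`, `poll`), the `Event` enum, and the `Core`, `MockMatrix`, and `Script` types are all hypothetical names for illustration, not the project's actual API. Only two of the four traits are shown. A real driver would implement the same traits on top of `embedded-hal` bus traits; here a mock stands in so the idea can be exercised without hardware.

```rust
/// Hypothetical driver traits; names and methods are illustrative only.
pub trait Display {
    fn size(&self) -> (u16, u16);
    fn set_pixel(&mut self, x: u16, y: u16, level: u8);
    fn flush(&mut self);
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Event { PlayStop, TempoUp, TempoDown }

pub trait Inputs {
    /// Return the next pending event, if any.
    fn poll(&mut self) -> Option<Event>;
}

/// The core is generic over its drivers and owns no hardware itself.
pub struct Core<D: Display, I: Inputs> {
    pub display: D,
    pub inputs: I,
    pub bpm: u16,
    pub playing: bool,
}

impl<D: Display, I: Inputs> Core<D, I> {
    pub fn new(display: D, inputs: I) -> Self {
        Core { display, inputs, bpm: 120, playing: false }
    }

    /// Drain pending input events, then redraw a trivial status pixel.
    pub fn step(&mut self) {
        while let Some(ev) = self.inputs.poll() {
            match ev {
                Event::PlayStop => self.playing = !self.playing,
                Event::TempoUp => self.bpm = self.bpm.saturating_add(1),
                Event::TempoDown => self.bpm = self.bpm.saturating_sub(1),
            }
        }
        let level = if self.playing { 255 } else { 0 };
        self.display.set_pixel(0, 0, level);
        self.display.flush();
    }
}

/// Host-side mock of a 17x7 mono matrix.
pub struct MockMatrix { pub frame: [[u8; 17]; 7], pub flushes: u32 }

impl Display for MockMatrix {
    fn size(&self) -> (u16, u16) { (17, 7) }
    fn set_pixel(&mut self, x: u16, y: u16, level: u8) {
        if (x as usize) < 17 && (y as usize) < 7 {
            self.frame[y as usize][x as usize] = level;
        }
    }
    fn flush(&mut self) { self.flushes += 1; }
}

/// Scripted input source for host tests.
pub struct Script(pub Vec<Event>);

impl Inputs for Script {
    fn poll(&mut self) -> Option<Event> {
        if self.0.is_empty() { None } else { Some(self.0.remove(0)) }
    }
}
```

A per-board binary would then only construct its concrete drivers and call `Core::new(display, inputs)`; the core's logic never names a bus or a pin.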
**Per-board binary crates — `pm-kit`, `pm-explorer`, `pm-grid`:** a thin `main.rs` BSP that
instantiates the right concrete drivers and hands them to the generic core:
- **Grid** (Scroll Pack): IS31FL3731 over I²C (a `DrawTarget` for a 17×7 mono frame) + 4 GPIO buttons.
- **Kit:** ST7796 320×480 over **SPI** — driven by a **custom `St7796` struct** (direct port of
`pico/main.py`), **not** mipidsi, which fought the panel's geometry/CS (see [[rust-st7796-cs-gotcha]]);
UI still renders through an `embedded-graphics` framebuffer. GT911 touch over I²C; WS2812 via
`ws2812-pio`; I²S to the PCM5102A via PIO.
- **Explorer:** ST7789V 320×240 over an **8-bit parallel (8080) bus** via `mipidsi`'s
`ParallelInterface` — a different driver story from the Kit's SPI (see the matrix below).
### Display driver matrix (researched 2026-06-03)
Displays are not the gate — every controller has a real Rust path; the buses differ:
| Form factor | Controller | Bus | Rust driver | Status |
|---|---|---|---|---|
| Kit (`pm-kit`) | ST7796 320×480 | SPI | **custom `St7796`** (port of `pico/main.py`) + `embedded-graphics` framebuffer; **mipidsi dropped** | ✅ on hardware — **but tearing** (see below) |
| Explorer (`pm-explorer`) | ST7789V 320×240 | 8-bit parallel 8080 | `mipidsi` `ParallelInterface` *(start here; be ready to port directly if geometry fights, as on the Kit)* | path proven upstream; not yet built |
| Grid (`pm-grid`) | IS31FL3731 17×7 mono | I²C | **vendored bulk-framebuffer driver** (port of `pico-scroll/app.py`'s `Matrix`); `is31fl3731` crate *not* used | ✅ **built + compiles** (LED-first milestone) — see below |
| Kit touch | GT911 | I²C | `gt911` or `gt9x` crate | ✅ mature (blocking + async, 5-point) |
**ST7796 (Kit) — only *partially* working: tearing.** Pixels are correct and the panel boots, but the
image tears badly. The cause is structural, not a bug: `mipidsi` has **no TE-pin / vsync / partial-update
/ double-buffer support** (confirmed against the upstream repo `github.com/almindor/mipidsi` — it offers
only optional draw *batching*), and this ST7796 module doesn't break out the TE (tearing-effect) line to
sync writes against scan-out. So writes race the panel's refresh. Mitigations available to us, none from
the crate: (a) redraw only changed full-width row-bands to shrink the tear window — already done; (b) DMA
each band as one tight burst; (c) sync to a TE GPIO *only if* a module that exposes that pin is sourced.
Treat tearing as an **open hardware/firmware item**, not "display done." See [[rust-st7796-cs-gotcha]].
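
Mitigation (a) can be sketched on the host as a comparison of two framebuffers. This is an illustration only: the function name `dirty_bands` and the band height parameter `band_h` are assumptions, and the real firmware may track changes differently.

```rust
/// Return the indices of the full-width row bands whose pixels differ
/// between `prev` and `next`. Both buffers are row-major RGB565 pixels.
/// `band_h` (rows per band) is an assumed tuning parameter.
pub fn dirty_bands(prev: &[u16], next: &[u16], width: usize, band_h: usize) -> Vec<usize> {
    assert_eq!(prev.len(), next.len());
    assert!(width > 0 && band_h > 0);
    let band_len = width * band_h;
    prev.chunks(band_len)
        .zip(next.chunks(band_len))
        .enumerate()
        .filter(|(_, (a, b))| a != b) // keep only bands that changed
        .map(|(i, _)| i)
        .collect()
}
```

Only the returned bands would need to be pushed to the panel, which shortens the window in which a write can race scan-out but does not remove it.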
**Explorer parallel bus — correctness and performance are decoupled.** `mipidsi`'s `ParallelInterface`
drives the data pins through an `OutputBus` *trait*. The shipped `Generic8BitBus` is plain GPIO bit-bang
(`embedded-hal` `OutputPin`s) — works immediately, just CPU-bound. For speed, implement `OutputBus` over
**RP2350 PIO** (the PIO supports 8080/6800 bus timing; the C/TFT_eSPI world hits ~4 ms for a full 320×480
clear this way) — a drop-in swap that leaves mipidsi + `embedded-graphics` + `pm-ui` untouched. Worst case
is "slow but functional," never "impossible," so the bit-bang fallback de-risks the whole Explorer bring-up.
**The honest caveat (what the Grid prototype is teaching us):** a 17×7 mono grid and a 320×480
touch TFT are too different for *one* pixel-identical UI. So the clean split is **core engine +
protocol + state = fully shared; the *view* = per-display-class.** The Grid is the most extreme,
minimal display in the lineup, which makes it the best forcing-function for finding exactly where
that boundary falls before we commit drivers to Rust. The CircuitPython `pico-scroll/` build exists
to nail that UI down on real hardware first.
## Stages
### Stage 0 — toolchain in a container
Add a Rust toolchain image (mirroring `hardware/eda/`): a `Containerfile` with `rustup`, the
`thumbv8m.main-none-eabihf` target (RP2350 is Cortex-M33), `flip-link`, `probe-rs`, `elf2uf2`.
Driven by a `run.sh` like the EDA one. **Never on the host.**
### Stage 1 — `track-format` crate ✅ DONE (`rust/track-format/`)
Implemented and **passing**: `./rust/run.sh` builds the container and runs `cargo test`, which
validates the crate against `tests/fixtures/track-format.json` (conformance + idempotency). The
Rust codec agrees with `engine.js` and `app.py` on every vector — and carries `vol`/`cd`, so it's
the most spec-complete of the three. Original scope below.
#### (original) Stage 1 — `track-format` crate ← the concrete first PR
A pure, `no_std`-compatible crate: `parse(&str) -> Track` and `serialize(&Track) -> String`,
plus a `normalize()` that emits the neutral structure from `docs/track-format.md` §5. Then a
`cargo test` that reads `tests/fixtures/track-format.json` and asserts each case's `norm` and
round-trip — i.e. a **third adapter alongside `js_adapter.mjs` / `py_adapter.py`**. When this is
green, the Rust engine provably agrees with web + device on every groove, euclid, swing, ghost,
polymeter, and the playback-flow tokens. No hardware, fully testable in the container.
This is the highest-value slice: small, gated by work already done, and it proves the toolchain.
### Stage 2 — scheduler/engine ✅ DONE (`rust/track-format/src/schedule.rs`)
Ported the look-ahead step scheduler (the `durs` math from `app.py` `tick`/`_prepare_next`).
`render(track, bars)` produces the deterministic click timeline; `tests/schedule.rs` asserts the
timings — quarter-note spacing, subdivisions, swing 2/3:1/3, polymeter 5:4, accents/ghosts, mute,
multi-bar looping. All green on the host, no hardware. The real-time firmware loop will just play
this timeline against the wall clock.
**Also done:** the crate is now `#![no_std]` + `alloc` and **builds for the RP2350 target**
(`cargo build --lib --target thumbv8m.main-none-eabihf`) — the codec + scheduler are firmware-ready.
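
The kind of arithmetic those timing tests pin down can be sketched as follows. The function names (`beat_ns`, `even_durs`, `swing_pair`) are hypothetical and are not the crate's `durs` code; the sketch also assumes, for illustration, that two lanes in a 5:4 relationship divide the same bar span into 5 and 4 equal steps.

```rust
/// Nanoseconds in one quarter-note beat at `bpm`.
pub fn beat_ns(bpm: u32) -> u64 {
    60_000_000_000u64 / bpm as u64
}

/// Split a span evenly into `steps` durations. Any rounding remainder is
/// spread over the first steps so the durations always sum to `span_ns`.
pub fn even_durs(span_ns: u64, steps: u32) -> Vec<u64> {
    if steps == 0 {
        return Vec::new();
    }
    let n = steps as u64;
    let base = span_ns / n;
    let rem = span_ns % n;
    (0..n).map(|i| base + if i < rem { 1 } else { 0 }).collect()
}

/// Swing a pair of eighths inside one beat: long = 2/3, short = 1/3.
pub fn swing_pair(beat: u64) -> (u64, u64) {
    let short = beat / 3;
    (beat - short, short)
}
```

At 120 BPM a 4/4 bar lasts 2 s, so a 5-step lane gets 400 ms per step and a 4-step lane 500 ms per step, and both lanes meet again at the bar line.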
### Stage 3 — drivers (hardware) 🔧 IN PROGRESS (`rust/pm-kit/`)
**✅ Milestone 1 (boot) — confirmed on Pico 2:** GP25 blink. Toolchain + RP2350 boot block + flash work.
**🟡 Milestone 2 (display) — draws correctly on Pico 2, but TEARS:** ST7796 320×480 over SPI0 via
`rp235x-hal`, drawing the shared `pm-ui` through an `embedded-graphics` framebuffer. Driver is a
**custom `St7796` struct** ported from `pico/main.py` — mipidsi was tried first and **dropped**: its
split-transaction CS and orientation/offset math mangled the geometry; the port uses correct
per-command CS framing (CS low → cmd → params → CS high) and full-width row bands (see
[[rust-st7796-cs-gotcha]]). **Not done:** the image tears badly — no TE/vsync sync is possible because
this module exposes no TE pin, so writes race scan-out (no crate fixes this; see the tearing note in
the display driver matrix above). Open item before the display can be called finished. Diagnosed off-bench with host tools in `rust/uisim`: `uisim` renders
pm-ui to PNG; `--bin panelsim` decodes mipidsi's real command/pixel stream into a PNG (proved the
protocol correct → bug was physical); `--bin initdump` dumps the init + CASET/RASET sequence.
**🟡 Milestone 3 (live metronome) — built, pending on-device check:** the firmware is now an actual
metronome. `embedded-alloc` heap → parses tracks with `track-format` on-device; 4 built-in grooves;
Timer-driven clock; **audio clicks** on the master lane's hits (GP13 PWM, short edge-triggered
pulses, accent louder); **controls** — A = play/stop, B = grid/notation view, joystick (rotated 90°
CCW) up/down = tempo, left/right = groove. Renders `pm-ui::draw_metronome` / `draw_notation`, with a
cheap `draw_progress` strip animating the bar position every frame (full redraw only on change → no
flicker). All loop input reads use `unwrap_or` (no panics) — addresses the self-test crash.
Compile + simulator verified; **needs a flash to confirm** audio timing, joystick directions, no crash.
**pm-ui views (sim-verified, PNGs):** metronome grid (accents/ghosts/polymeter), and **drum notation**
(5-line staff, time sig, hands stem-up / feet stem-down, shared stems, beamed eighths, ledger lines).
**Still to do:** GT911 touch (GP8/9), WS2812 RGB (GP12), USB-MIDI, set-lists from programs.json,
per-cell live playhead, the rest of the practice features. Then split `pm-core` out as its own crate
and add `pm-explorer`/`pm-grid` binaries. HAL stays `rp235x-hal` (embassy later if async earns it).
On `embassy` / `rp-hal`:
- ST7789 240×320 display → `mipidsi` + `embedded-graphics` (mature; the parts are well-supported).
- I²S to the PCM5102A → RP2350 PIO.
- WS2812 → `ws2812-pio`. USB-MIDI → `usbd-midi` / `embassy-usb`.
- GT911 touch (Kit) over I²C.
#### `pm-grid` — Scroll Pack firmware 🟢 BUILT (LED + USB-MIDI audio), pending on-device check
The Rust sibling of `pico-scroll/app.py` (`rust/pm-grid/`), and **the firmware the PM_G-1 ships now**:
the CircuitPython build is dropped from the product (info-grid.html / build.sh / deploy.sh no longer
bundle or serve it; source stays as the reference port). Target is a **plain RP2040** (Cortex-M0+,
`thumbv6m-none-eabi`) — NOT the Pico 2 — so it has its own HAL (`rp2040-hal` 0.10 + `rp2040-boot2`),
`.cargo/config.toml`, `memory.x` (BOOT2 + flash + 264 KB RAM) and `build.sh`+`uf2.py` (RP2040 family
id `0xe48bff56`). `thumbv6m-none-eabi` added to `rust/Containerfile`. Compiles clean → **`pm-grid.uf2`**
(BOOTSEL drag-flash) + **`pm-grid.elf`** (probe-rs + defmt). Both served by deploy.sh; the info page
links the `.uf2`. Kept out of the host workspace like `pm-kit`. Debug build uses `defmt`/`defmt-rtt` +
`panic-probe` + `flip-link`, runner `probe-rs run --chip RP2040` (the user's Pi Debug Probe).
What's implemented (faithful port of `pico-scroll`): the **IS31FL3731 driver** (vendored bulk
144-byte framebuffer, one I²C block write per frame — the right architecture, mirrors the
CircuitPython `Matrix`; per-pixel I²C is too slow to animate), the **polymeter scheduler** driven by
`track-format::schedule::lane_durs` (the cross-impl contract) with per-lane step clocks + ramp +
gap-trainer, **4-button input** (A tap=play/stop, hold=cycle view; B tap=next track, hold=next set
list; X/Y=tempo ∓ with auto-repeat), the **built-in set lists**, and three LED views:
- **Ticker** (default): a **beat strip** on the top row (cols 0–10), with faint ticks at each beat + a
bright playhead at the master lane's current step; the track name infinite-scrolls below it (cols
0–10, rows 2–6); BPM is pinned right, **rotated 90° CCW**: a vertical hundreds **dot-bar** in col
11 (one dot per 100) + the last two digits rotated into cols 12–16 (tens bottom, units top). So
`130` → 1 dot + rotated "30". This is the user-designed landscape readout. Layout/rotation verified
off-bench with an ASCII replica of `draw_ticker`. Whole matrix strobes white on the downbeat.
- **Grid** (lanes×steps + playhead) and **Pendulum** (bouncing arm + beat ticks) — ports of
`_render_grid` / `_render_pendulum`.
Boot splash scrolls "PM-G1 GRID" (liveness + pixel-map check).
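
The Ticker's BPM readout can be illustrated with two small helpers. Both names (`bpm_readout`, `rotate_ccw`) are hypothetical, and the glyph representation (a row-major `Vec<Vec<bool>>`) is an assumption; this is not the `draw_ticker` code.

```rust
/// Split a BPM value into (hundreds dots, tens digit, units digit).
pub fn bpm_readout(bpm: u16) -> (u8, u8, u8) {
    let hundreds = (bpm / 100) as u8;
    let tens = ((bpm / 10) % 10) as u8;
    let units = (bpm % 10) as u8;
    (hundreds, tens, units)
}

/// Rotate a row-major bitmap 90 degrees counter-clockwise.
/// An h x w input becomes a w x h output.
pub fn rotate_ccw(glyph: &[Vec<bool>]) -> Vec<Vec<bool>> {
    let h = glyph.len();
    let w = if h == 0 { 0 } else { glyph[0].len() };
    (0..w)
        .map(|r| (0..h).map(|c| glyph[c][w - 1 - r]).collect())
        .collect()
}
```

For `130` the first helper yields one hundreds dot plus the digits 3 and 0, matching the "1 dot + rotated 30" description; a digit glyph that is 3 wide and 5 tall becomes 5 wide and 3 tall after rotation, which fits two digits into five columns.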
**Audio — USB-MIDI ✅ DONE** (the Scroll Pack has NO speaker, so this is the real audio path):
`usb-device` 0.3 + `usbd-midi` 0.5 (the `rp2040-hal` `UsbBus`). Enumerates as a class-compliant MIDI
device ("PM_G-1 Grid"); `tick` emits a **GM note-on per lane hit on channel 10** (note from the ported
`SOUND_GM` map, velocity by level 120/90/45) via `UsbMidiClass::send_bytes([0x09,0x99,note,vel])`
raw 4-byte packets, sidestepping the named-`Note` enum so arbitrary GM drum notes work. USB is polled
every loop iteration **and during the boot splash** (1.5 ms cadence) so the host can enumerate. Play
through the editor's **Device audio**.
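
The raw packet layout follows the USB-MIDI event packet format: byte 0 carries the cable number and Code Index Number, bytes 1 to 3 carry the MIDI message. A minimal sketch of building one, where `note_on_packet`, `velocity`, and the `Level` variant names are hypothetical (the source only gives the three velocity values):

```rust
/// Hit levels; the variant names are assumed, the velocities are from the text.
#[derive(Clone, Copy)]
pub enum Level { Accent, Normal, Ghost }

/// Velocity per hit level (120 / 90 / 45).
pub fn velocity(level: Level) -> u8 {
    match level {
        Level::Accent => 120,
        Level::Normal => 90,
        Level::Ghost => 45,
    }
}

/// Build a USB-MIDI event packet for a note-on.
/// Byte 0: cable number (high nibble) | Code Index Number 0x9 (note-on).
/// Byte 1: MIDI status 0x90 | channel index (channel 10 is index 9).
pub fn note_on_packet(cable: u8, channel: u8, note: u8, vel: u8) -> [u8; 4] {
    [
        ((cable & 0x0F) << 4) | 0x09,
        0x90 | (channel & 0x0F),
        note & 0x7F,
        vel & 0x7F,
    ]
}
```

With cable 0 and channel index 9 this produces the `[0x09, 0x99, note, vel]` shape quoted above; GM note 36 (bass drum) at accent level becomes `[0x09, 0x99, 36, 120]`.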
**Live-sync — ✅ DONE** (`docs/livesync-protocol.md`, ported from `pico-scroll`): reads the USB-MIDI
RX endpoint, reassembles SysEx from the 4-byte event packets (by Code Index Number), and dispatches
manufacturer `0x7D` frames. **Version query** `0x02`→`0x03 "G;0.1.0"` (so the editor identifies it).
**HELLO** `0x40`→reply FULL; **FULL** `0x41`→parse the patch (`track-format::parse`) + running state and
adopt it; **DELTA** `0x42`→apply `play`/`stop`/`bpm`/`sel`/`beat`; **BYE** `0x43`→disarm. **Broadcasts**
a DELTA from each on-device input (A/B/X/Y → play/stop, sel, bpm) and a **FULL heartbeat every ~5 s**
(`track-format::serialize`). Echo-guarded by a boot-derived origin; a `sync_applying` flag stops
re-broadcast while applying. All TX (notes + SysEx) shares the one-per-poll `tx_q` drain. `info!` logs
every received op. Structural `lane=` edits aren't applied incrementally (they arrive as a fresh FULL).
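
SysEx reassembly by Code Index Number can be sketched as below. The CIN values come from the USB-MIDI class definition (0x4 = SysEx starts or continues with 3 bytes; 0x5, 0x6, 0x7 = SysEx ends with 1, 2, or 3 bytes). The type and function names (`SysexAssembler`, `is_manufacturer`) are hypothetical, and the sketch makes no assumption about the frame layout beyond the manufacturer ID following `0xF0`.

```rust
/// Reassembles SysEx messages from 4-byte USB-MIDI event packets.
#[derive(Default)]
pub struct SysexAssembler {
    buf: Vec<u8>,
}

impl SysexAssembler {
    /// Feed one packet; returns a complete message (F0 ... F7) when one ends.
    pub fn feed(&mut self, pkt: [u8; 4]) -> Option<Vec<u8>> {
        let cin = pkt[0] & 0x0F;
        let n = match cin {
            0x4 => 3,         // SysEx starts or continues: 3 bytes
            0x5 => 1,         // SysEx ends with 1 byte
            0x6 => 2,         // SysEx ends with 2 bytes
            0x7 => 3,         // SysEx ends with 3 bytes
            _ => return None, // not a SysEx packet
        };
        if pkt[1] == 0xF0 {
            self.buf.clear(); // a new message starts; drop any partial one
        }
        self.buf.extend_from_slice(&pkt[1..1 + n]);
        if cin == 0x4 {
            return None; // more packets to come
        }
        let msg = std::mem::take(&mut self.buf);
        if msg.first() == Some(&0xF0) && msg.last() == Some(&0xF7) {
            Some(msg)
        } else {
            None // e.g. a single-byte system common message, not SysEx
        }
    }
}

/// True if `msg` is a SysEx frame for the given one-byte manufacturer ID.
pub fn is_manufacturer(msg: &[u8], id: u8) -> bool {
    msg.len() >= 3 && msg[0] == 0xF0 && msg[1] == id
}
```

A dispatcher would call `is_manufacturer(&msg, 0x7D)` on each completed message before looking at the opcode.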
**Playback-flow auto-advance — ✅ DONE** (`rep`/`end`): at each master-bar boundary, after `bars*rep`
cycles the end-action fires — `end=stop` stops, `end=next`/`end=+N` advances. The next track is
**preloaded one bar early** (parsed + durs) into `pending`, then swapped at the exact seam
(`seam_ns` = the master lane's bar boundary; all lanes restart there) for a gapless handoff. A
`continue_on` flag (default off, no UI yet) would make a `bars` track with no `end=` auto-`next`.
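
The boundary decision can be sketched as a pure function. Everything here is an assumption made for illustration: the `EndAction` and `Seam` enums, the function names, counting progress in completed master bars, and wrapping around at the end of the set list.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum EndAction { Stop, Next, Skip(usize) } // end=stop, end=next, end=+N

#[derive(Debug, PartialEq)]
pub enum Seam { Continue, Stop, Goto(usize) }

/// Decide what happens at a master-bar boundary.
/// `bars_done` counts completed master bars of the current track.
pub fn at_bar_boundary(
    bars_done: u32,
    bars: u32,
    rep: u32,
    end: EndAction,
    current: usize,
    track_count: usize,
) -> Seam {
    if bars_done < bars * rep {
        return Seam::Continue;
    }
    if track_count == 0 {
        return Seam::Stop;
    }
    match end {
        EndAction::Stop => Seam::Stop,
        // Wrapping to the first track is an assumption of this sketch.
        EndAction::Next => Seam::Goto((current + 1) % track_count),
        EndAction::Skip(n) => Seam::Goto((current + n) % track_count),
    }
}

/// True one bar before the end-action fires, when the next track
/// should be parsed and its durations prepared.
pub fn should_preload(bars_done: u32, bars: u32, rep: u32) -> bool {
    bars_done + 1 == bars * rep
}
```

With `bars = 4` and `rep = 2`, the preload would happen after bar 7 and the swap at the boundary after bar 8.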
**MIDI clock out — ✅ DONE** (default on): 24-PPQN `0xF8` against the wall clock + `0xFA`/`0xFC`
Start/Stop on play/stop (button or live-sync), so a DAW can slave its tempo to the Grid. Queued on
`tx_q` like everything else (CIN `0xF` single-byte packets).
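
The clock-out timing reduces to one division: at 24 PPQN the tick interval is 60 s / (BPM × 24). A minimal sketch, with hypothetical function names; computing each tick from an absolute start time, as `tick_time_ns` does, is one way to keep integer truncation from accumulating as drift.

```rust
pub const PPQN: u64 = 24;

/// Interval between MIDI clock (0xF8) ticks, in nanoseconds (truncated).
pub fn clock_tick_ns(bpm: u32) -> u64 {
    60_000_000_000u64 / (bpm as u64 * PPQN)
}

/// Absolute time of tick `k` after `start_ns`, computed from the start
/// so rounding errors do not add up over many ticks.
pub fn tick_time_ns(start_ns: u64, k: u64, bpm: u32) -> u64 {
    let offset = (k as u128 * 60_000_000_000u128) / (bpm as u128 * PPQN as u128);
    start_ns + offset as u64
}

/// Single-byte real-time message as a USB-MIDI event packet (CIN 0xF).
pub fn realtime_packet(status: u8) -> [u8; 4] {
    [0x0F, status, 0, 0]
}
```

At 120 BPM the interval is about 20.8 ms, and tick 48 (two beats) lands exactly 1 s after the start.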
**MIDI clock in — ✅ DONE** (default on): `feed_midi` handles `0xF8` (EMA of the inter-tick interval →
derived BPM, 5–300 clamp + jitter reject), `0xFA`/`0xFB` start, `0xFC` stop. While slaved, the ramp
and our own clock-out are suppressed (no feedback); the lock drops after a >1 s gap. Only engages
when a host actually sends clock (the editor doesn't), so it's inert in normal editor use.
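
A sketch of the tempo derivation, not the actual `feed_midi` code. The struct name `ClockIn`, the smoothing factor, the jitter threshold, and the clamp bounds are all assumptions chosen for illustration; only the 24 ticks per quarter note and the 1 s lock timeout come from the text.

```rust
const TICKS_PER_QUARTER: f64 = 24.0;
const ALPHA: f64 = 0.2; // assumed EMA smoothing factor
const JITTER: f64 = 0.5; // assumed: reject intervals > 50% off the average
const MIN_BPM: f64 = 5.0; // assumed clamp bounds
const MAX_BPM: f64 = 300.0;
const LOCK_TIMEOUT_NS: u64 = 1_000_000_000; // lock drops after a > 1 s gap

pub struct ClockIn {
    last_ns: Option<u64>,
    ema_ns: Option<f64>,
}

impl ClockIn {
    pub fn new() -> Self {
        ClockIn { last_ns: None, ema_ns: None }
    }

    /// Feed the timestamp of a received 0xF8 tick.
    /// Returns the derived BPM once at least two ticks have been seen.
    pub fn on_tick(&mut self, now_ns: u64) -> Option<f64> {
        let prev = self.last_ns.replace(now_ns)?;
        let dt = now_ns.saturating_sub(prev);
        if dt == 0 || dt > LOCK_TIMEOUT_NS {
            self.ema_ns = None; // lock lost; restart from this tick
            return None;
        }
        let dt = dt as f64;
        let ema = match self.ema_ns {
            None => dt,
            // Jitter reject: keep the old average for outlier intervals.
            Some(e) if (dt - e).abs() > e * JITTER => e,
            Some(e) => e + ALPHA * (dt - e),
        };
        self.ema_ns = Some(ema);
        Some((60e9 / (ema * TICKS_PER_QUARTER)).clamp(MIN_BPM, MAX_BPM))
    }
}
```

One limitation of this simple reject rule: a genuine tempo jump larger than the threshold is ignored until the lock times out, so a production version would likely need a relock path.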
**USB Mass Storage — ✅ DONE (drive enumerates; pending on-device test)**: composite **MIDI + MSC**
(`usbd-storage` 2.0, SCSI over Bulk-Only), adapted from the crate's RP2040 example. The host sees a
**1 MB removable drive** backed by the upper flash (a `.filesystem` region, `NOLOAD` so it's not in
the UF2 and survives reflashes). `scsi_command` serves the SCSI set (Inquiry/ReadCapacity/Read/Write/
ModeSense/RequestSense); reads come from flash via raw pointer, writes erase+program a 4 KB sector
with `rp2040-flash` (wrapped in `interrupt::free`). The host owns the FAT format (formats on first
use). **Required `rp2040-hal` 0.11** (0.10 + `rp2040-flash` 0.6 = duplicate `__aeabi_*`/`__addsf3`
ROM intrinsics) and **`lto = false`** (fat-LTO tripped the same intrinsic). This **unblocks
persistence** — practice log / `settings.json` / user set-lists can now live on the drive (next: the
device parses the FAT to read/write them).
**Drive named "PM_G-1" + reads set lists — ✅ DONE**: on boot (before USB setup, so the flash write
can't disrupt enumeration) the device mounts the FAT (`fatfs` 0.4 git — 0.3.6 needs `core_io` for
no_std; reads via a `FlashIo` over the `.filesystem` region, validated off-bench against a real
`mkfs.fat` image). If the root-dir volume label isn't "PM_G-1" (e.g. a leftover CircuitPython
volume), it writes an embedded blank **PM_G-1 FAT12 template** (`src/fat_template.bin`, the first 7
sectors of `mkfs.fat -F12 -S4096 -n PM_G-1`, sets *both* BPB + root-dir VOLUME_ID label) → the drive
shows as **PM_G-1**. Then it reads `programs.json` (LFN) and a tolerant scanner turns it into **user
set lists appended to the built-ins** — drop your `programs.json` on the drive, reboot, your grooves
appear (B-hold cycles set lists). Set lists are now a runtime `Vec<SetList>` (built-ins → owned +
drive).
**Live re-read — ✅ DONE**: the SCSI Write handler sets a `dirty` flag; when the drive has been idle
~1.5 s (host finished) **and** playback is stopped, the loop re-reads `programs.json` and rebuilds the
set lists (`reload_user`) — drop a file, it appears **without a reboot**. Read-only → no FAT
corruption. (NB the boot black-screen regression was the 24 KB heap being too small for `fatfs` +
owned set lists → alloc panic; heap is now 96 KB and the drive read runs *after* the splash.)
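
The reload condition is a small debounce: dirty, idle for about 1.5 s, and not playing. A host-testable sketch, assuming a millisecond clock and hypothetical names (`Reload`, `on_host_write`, `should_reload`); the actual loop and `reload_user` are not reproduced here.

```rust
const IDLE_MS: u64 = 1_500; // host is considered finished after ~1.5 s idle

pub struct Reload {
    dirty: bool,
    last_write_ms: u64,
}

impl Reload {
    pub fn new() -> Self {
        Reload { dirty: false, last_write_ms: 0 }
    }

    /// Called from the SCSI Write handler on every host write.
    pub fn on_host_write(&mut self, now_ms: u64) {
        self.dirty = true;
        self.last_write_ms = now_ms;
    }

    /// Called from the main loop; `true` means re-read the file now.
    /// Clears the dirty flag so the reload happens once per burst of writes.
    pub fn should_reload(&mut self, now_ms: u64, playing: bool) -> bool {
        let idle = now_ms.saturating_sub(self.last_write_ms) >= IDLE_MS;
        if self.dirty && idle && !playing {
            self.dirty = false;
            true
        } else {
            false
        }
    }
}
```

Because every write restarts the idle timer, a multi-sector file copy triggers a single reload after the host goes quiet, and a reload requested during playback is deferred until playback stops.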
**Dual-access constraint (why the device doesn't *write* the drive):** while the host has the FAT
mounted, the device writing it corrupts the host's cached view (same reason CircuitPython makes the
drive one-direction-at-a-time). So device→drive writes (practice log, `settings.json`) are **not**
done; the practice log should instead go to the editor via **LOGSYNC** (`0x45`, its designed
channel), or behind a CircuitPython-style boot-mode toggle. `settings.json` *read* (config from a
file: brightness, MIDI channel, clock on/off) is safe (device read-only) and is a clean follow-up.
**Still deferred**: practice log via **LOGSYNC** (`0x45`) + **SLSYNC** (`0x44`), `settings.json` read,
show the set-list title, the **on-device 808/909 synth → USB Audio input** (the standalone-audio
alternative, big), firmware push (intended: UF2 now), optional piezo. A/B bootloader **dropped**.
Also pending: a **hardening pass** (stress the composite USB + flash-write timing; split `main.rs`).
### Stage 4 — native A/B + secure boot
Replace the `.mpy`-level A/B hack (`code.py` loads `app.mpy`, rolls back to `app.bak`) with the
**RP2350 bootrom's native** partition-table A/B + signed boot, configured via `picotool` (the
chip already provides this — see the earlier hardware discussion). The Rust app is the image in
the slot; rollback and version selection move into silicon.
## What you keep / lose
- **Gain:** memory safety, native A/B + secure boot, performance headroom, one typed model instead
of three hand-written parsers.
- **Lose:** the live one-click `.mpy` push (Rust is compile→flash→reboot). The editor's *data*
live-sync (tempo/pattern/setlist mirroring) still works — it's a data protocol. Only live
*logic* edits go away, and an embedded `wasm3`/script module could buy those back if wanted.
## Acceptance gate
Every codec/engine change must pass `tests/fixtures/track-format.json`. The Rust crate joins
`js`/`py` as a runner adapter, so "same groove on web, device, and the Rust build" is enforced,
not hoped for.
## Recommendation
Do **Stage 1 in a container next** — it's small, testable today (given a toolchain), reuses the
suite, and produces a real artifact to judge the Rust path on before committing to drivers or a
firmware rewrite. Defer Stages 3–4 until Stages 1–2 are green and you've decided the live-push
tradeoff is acceptable.