Audit-driven cleanup of the standalone pandoc/ CLI tools (no changes to the server's own zddc/internal/convert engine).

convert:
- DOCX→MD now reads lowercase client/project from zddc.conf (was $CLIENT/$PROJECT, always empty)
- ZDDC filename parsing via a shared parse_zddc_filename helper that extracts each field with its own backref, so a '|' in the title no longer truncates it (was cut -d'|')
- drop duplicate --section-divs and no-op --id-prefix=

convert-diff:
- replace hardcoded "(AR 28088)" in the diff header with the configured $project_number (omitted when unset)
- only pass --template when one was found (empty --template= errors out)
- drop the false "Loading ZDDC configuration" log and the sed quote-escape that leaked backslashes into custom_header
- remove dead REV_A/REV_B and rev*_date extraction; fix usage typo; pin LC_TIME=C on date calls

index.sh:
- relative_path passes paths to python via argv (no -c interpolation) and uses realpath --relative-to as the fallback instead of an absolute path
- escape '|' in title/status before emitting the markdown table row

README:
- rewrite the stale server-side section to match the real binary+bubblewrap design and flags/defaults (was a non-existent podman/docker/image design)
- fix the invalid zddc.conf example (sourced shell, four real vars) and the understated input-format list

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
211 lines
8.5 KiB
Markdown
# ZDDC Pandoc Tools

A collection of tools for converting Markdown documents to HTML with a professional viewer interface, optimized for technical documentation and engineering documents.

## Server-side conversion (`zddc-server`)

> The shell scripts in this folder are standalone CLI/batch tools. `zddc-server`
> implements its **own** on-demand conversion (Go package `zddc/internal/convert`)
> and does **not** call these scripts. It does, however, reuse the same
> `viewer-template.html` and `custom.css` (embedded at build time). See
> AGENTS.md → "Server-side document conversion" for the authoritative reference.

zddc-server can render any served `.md` on demand: requesting the sibling URL
`<path>/foo.docx` (or `.html` / `.pdf`) returns the converted bytes, with no query
string. A real on-disk file of that name always wins; the virtual conversion
only fires when the requested file doesn't exist but `foo.md` does. The browse
app's markdown editor surfaces these as DOCX/HTML/PDF download links (auto-saving
a dirty buffer first so the output matches what's on screen).

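The lookup rule described above (a real file wins, otherwise fall back to the sibling `.md`) can be sketched in shell. This is only an illustration: the server implements the rule in Go, and `resolve_request` is a hypothetical name that does not exist in this repository.

```bash
# Hypothetical sketch of the lookup rule; not the server's actual Go code.
# Prints "file <path>" when a real file exists, "convert <source.md>" when
# only the sibling .md exists, and returns 1 otherwise.
resolve_request() {
    local requested="$1" md
    if [ -f "$requested" ]; then
        printf 'file %s\n' "$requested"     # a real on-disk file always wins
        return 0
    fi
    case "$requested" in
        *.docx|*.html|*.pdf)
            md="${requested%.*}.md"         # foo.pdf -> foo.md
            if [ -f "$md" ]; then
                printf 'convert %s\n' "$md"
                return 0
            fi
            ;;
    esac
    return 1
}
```

With only `foo.md` on disk, `resolve_request docs/foo.pdf` reports a conversion from `docs/foo.md`; once a real `docs/foo.pdf` exists, the same call reports the file itself.
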
**Architecture.** The Go code does the minimum: it `exec`s `pandoc` and
`chromium-browser` directly. The sandbox and resource caps live in the runtime
**image**, where `/usr/local/bin/{pandoc,chromium-browser}` are wrapper scripts
that run the real binary inside a per-conversion bubblewrap sandbox
(`--unshare-all`, read-only binds, `--tmpfs /tmp`, `--clearenv`) under cgroup v2
memory/PID caps. I/O is via stdin/stdout plus a per-call scratch dir. There is no
container runtime and no image pulling at request time.

The PDF flow is two-stage: pandoc renders the markdown through
`viewer-template.html` to standalone HTML, then headless Chromium prints that HTML
to PDF, preserving the viewer template's print-media CSS rather than going
through pandoc's LaTeX template.

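A minimal shell sketch of the two-stage flow, for readers who want to reproduce it by hand. The exact flags the server passes are not documented here, so the ones below are assumptions; `run`, `md_to_pdf`, `DRY_RUN`, and `CHROMIUM` are hypothetical names introduced for this example.

```bash
# Hypothetical sketch of the two-stage flow; the server's real flags may differ.
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# md_to_pdf <source.md> <output.pdf> [template]
md_to_pdf() {
    local src="$1" out="$2" template="${3:-viewer-template.html}"
    local html="${out%.pdf}.html"    # intermediate standalone HTML
    # Stage 1: markdown -> standalone HTML through the viewer template.
    run pandoc --standalone --template="$template" --output="$html" "$src" || return 1
    # Stage 2: headless Chromium prints that HTML to PDF, so print-media CSS applies.
    run "${CHROMIUM:-chromium-browser}" --headless --print-to-pdf="$out" "$html"
}
```

Running `DRY_RUN=1 md_to_pdf foo.md foo.pdf` prints the two commands without needing either binary installed. On Debian, set `CHROMIUM=chromium`.
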
Converted bytes are cached at `<dir>/.zddc.d/converted/<base>.<ext>` with mtime
synced to the source, so a fresh cache hit is a stat-and-serve with no `exec`.
A PUT/DELETE/MOVE on the source `.md` purges the sidecars. Per-project header
metadata (client/project/contractor/project_number) comes from the `.zddc`
`convert:` cascade; title/tracking_number/revision/status are derived from the
filename via `zddc.ParseFilename`.

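The cache layout and mtime sync can be illustrated with a few shell helpers. This is a sketch under assumptions, not the server's Go code: the helper names are hypothetical, and treating "fresh" as "cached file has exactly the source's mtime" is an interpretation of the description above.

```bash
# Hypothetical helpers mirroring the cache layout described above.
cache_path() {            # cache_path <source.md> <ext>
    local src="$1" ext="$2" dir base
    dir="$(dirname "$src")"
    base="$(basename "$src" .md)"
    printf '%s/.zddc.d/converted/%s.%s\n' "$dir" "$base" "$ext"
}

cache_store() {           # cache_store <source.md> <ext> <converted-file>
    local cached
    cached="$(cache_path "$1" "$2")"
    mkdir -p "$(dirname "$cached")" &&
        cp "$3" "$cached" &&
        touch -r "$1" "$cached"     # sync the cached file's mtime to the source
}

cache_is_fresh() {        # cache_is_fresh <source.md> <ext>
    local cached
    cached="$(cache_path "$1" "$2")"
    # Fresh (by assumption) when neither file is newer than the other.
    [ -f "$cached" ] && [ ! "$1" -nt "$cached" ] && [ ! "$cached" -nt "$1" ]
}
```

For example, `cache_path docs/foo.md pdf` prints `docs/.zddc.d/converted/foo.pdf`, and any later write to `docs/foo.md` makes `cache_is_fresh` fail until the output is stored again.
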
Relevant flags (defaults in parens):

- `--convert-pandoc-binary` (`pandoc`) / `--convert-chromium-binary`
  (`chromium-browser`; `chromium` on Debian): PATH-resolved name or absolute path
- `--convert-scratch-dir` (`$TMPDIR`): host scratch root for template + intermediates
- `--convert-mem-mib` (`1024`): per-conversion memory cap (cgroup `memory.max`)
- `--convert-pids` (`256`): per-conversion PID cap (cgroup `pids.max`)
- `--convert-timeout` (`60s`): per-conversion wall clock (Go `context.WithTimeout`)

If `pandoc`/`chromium` aren't on PATH (e.g. running zddc-server outside the runtime
image) the endpoint serves 503 with a `Retry-After`; the rest of the server keeps
working. Running against raw pandoc/chromium with no wrapper gives a working but
**unsandboxed** endpoint, which is fine for dev iteration.

## Features

### Document Conversion (`convert`)
- **Batch processing**: Convert multiple Markdown files at once
- **Force overwrite**: `-f` flag to overwrite existing output files
- **Custom output directory**: `-o` flag to specify output location
- **Configuration-driven**: Uses `zddc.conf` for project-specific settings
- **Template integration**: Automatically applies the viewer template
- **Progress tracking**: Real-time conversion status and summary

### Professional Viewer Template (`viewer-template.html`)
- **Modern responsive design**: Works on desktop, tablet, and mobile
- **Table of Contents (TOC)**: Auto-generated sidebar navigation with smooth scrolling
- **Print optimization**: Professional formatting for PDF generation
  - Page break controls for tables
  - Repeating table headers
  - Proper page numbering
  - Clean print layout
- **URL hash navigation**: Shareable links to specific document sections
- **Mobile-friendly**: Collapsible sidebar and touch-optimized interface
- **Professional styling**: Clean typography optimized for technical documents

## Usage

### Basic Conversion
```bash
# Convert all Markdown files in current directory
./convert *.md

# Convert with force overwrite
./convert -f *.md

# Convert to specific output directory
./convert -o rendered/ *.md

# Combine flags
./convert -f -o rendered/ *.md
```

### Configuration (`zddc.conf`)
Create a `zddc.conf` file in your project directory. It is **sourced as shell**,
so use `var="value"` syntax (no spaces around `=`). Only these four variables are
read; all are optional and feed the document header via pandoc `--variable`:
```sh
contractor="Contractor Name"   # contracting organization (header)
client="Client Name"           # client org (header, paired with project)
project="Project Name"         # full project name
project_number="AR 28088"      # shown in parentheses after the project name
```
The template path is discovered automatically (input dir → script dir →
symlink target) or set per-run with `-T`; the output directory is set with `-o`.
They are **not** `zddc.conf` keys.

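As a rough illustration of how a sourced `zddc.conf` can feed pandoc, the sketch below reads the four variables in a subshell and prints one `--variable=name=value` argument per line. The function name `zddc_pandoc_vars` is hypothetical, and the real `convert` script may build its arguments differently.

```bash
# Hypothetical sketch: source zddc.conf in a subshell and print one
# "--variable=name=value" argument per line for each variable that is set.
zddc_pandoc_vars() {
    local conf="$1"
    [ -f "$conf" ] || return 0      # all four variables are optional
    (
        unset contractor client project project_number
        # zddc.conf is executed as shell, so only source files you trust.
        . "$conf" || exit 1
        for name in contractor client project project_number; do
            eval "value=\${$name:-}"
            if [ -n "$value" ]; then
                printf '%s\n' "--variable=$name=$value"
            fi
        done
    )
}
```

Pass the file with a path component (for example `./zddc.conf`) so that `.` does not search `PATH` for it. In bash, the lines can be collected with `mapfile -t vars < <(zddc_pandoc_vars ./zddc.conf)` and handed to pandoc as `"${vars[@]}"`.
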
### Directory Structure
```
your-project/
├── zddc.conf              # Configuration file
├── document1.md           # Source Markdown files
├── document2.md
└── rendered/              # Generated HTML files
    ├── document1.html
    └── document2.html
```

## Template Features

### Navigation
- **TOC Generation**: Automatically creates navigation from document headings
- **Smooth Scrolling**: Click TOC items for smooth navigation to sections
- **Hash URLs**: Address bar updates with section anchors for sharing
- **Mobile Menu**: Collapsible sidebar for mobile devices

### Print Styling
- **Page Breaks**: Tables won't split across pages
- **Header Repetition**: Table headers repeat on each page
- **Professional Layout**: Optimized margins and typography
- **Page Numbers**: Sequential page numbering in footer

### Responsive Design
- **Desktop**: Full sidebar with TOC always visible
- **Tablet**: Collapsible sidebar with overlay
- **Mobile**: Hamburger menu with full-screen TOC overlay

## Advanced Usage

### Custom Templates
You can customize the viewer template by:
1. Copying `viewer-template.html` to your project
2. Modifying the CSS and HTML structure
3. Passing its path per-run with `-T`, or leaving it where auto-discovery finds it (input dir → script dir → symlink target); the template path is not a `zddc.conf` key

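The template search order (input dir → script dir → symlink target, with `-T` taking precedence) can be sketched as a small shell function. The name `find_template` is hypothetical, `readlink -f` is assumed to be available (GNU coreutils or a recent macOS), and the real `convert` script may resolve paths differently.

```bash
# Hypothetical sketch of the search order; the real convert script may differ.
# find_template <input-dir> <script-path> [explicit-template]
find_template() {
    local input_dir="$1" script="$2" explicit="${3:-}"
    local name="viewer-template.html" dir
    if [ -n "$explicit" ]; then          # an explicit -T value wins when given
        [ -f "$explicit" ] && printf '%s\n' "$explicit"
        return
    fi
    for dir in \
        "$input_dir" \
        "$(dirname "$script")" \
        "$(dirname "$(readlink -f "$script")")"
    do
        if [ -f "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
}
```

A copy of `viewer-template.html` placed next to the input files therefore shadows the one shipped beside the script, which is what makes per-project customization work without any configuration key.
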
### Batch Processing
For large document sets:
```bash
# Process all markdown files recursively
find . -name "*.md" -exec ./convert -f -o rendered/ {} +

# Process specific document types
./convert -f -o rendered/ *-SOW-*.md *-DBD-*.md
```

### Integration with Build Systems
The convert tool returns proper exit codes and can be integrated into CI/CD pipelines:
```bash
# In a build script
if ./convert -f -o dist/ *.md; then
    echo "Documentation built successfully"
else
    echo "Documentation build failed"
    exit 1
fi
```

## File Types Supported

- **Input**: Markdown (`.md`), DOCX (`.docx`), and HTML (`.html`/`.htm`) files
  (auto-detected: DOCX→MD, MD→HTML, HTML→MD; override with `-t md|html|docx`).
  Direct DOCX→HTML is not supported; convert to MD first.
- **Output**: HTML files with embedded CSS and JavaScript (plus MD and DOCX targets)
- **Images**: Supports embedded images and diagrams
- **Tables**: Full table support with print optimization
- **Code**: Syntax highlighting for code blocks

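The auto-detection rules in the **Input** entry above amount to a small lookup by file extension. The sketch below is an assumption about how that table could be expressed, not the script's actual code; `default_target` is a hypothetical name, and only the DOCX→HTML restriction is taken from this README.

```bash
# Hypothetical sketch of the auto-detection table described above.
# default_target <input-file> [override]  ->  prints md, html, or docx
default_target() {
    local input="$1" override="${2:-}"
    if [ -n "$override" ]; then
        case "$override" in
            md|html|docx) ;;
            *) echo "unknown target: $override" >&2; return 2 ;;
        esac
    fi
    case "$input" in
        *.docx)
            # Direct DOCX→HTML is not supported: convert to MD first.
            if [ "$override" = "html" ]; then
                echo "DOCX→HTML is not supported; convert to MD first" >&2
                return 1
            fi
            printf '%s\n' "${override:-md}" ;;
        *.md)
            printf '%s\n' "${override:-html}" ;;
        *.html|*.htm)
            printf '%s\n' "${override:-md}" ;;
        *)
            echo "unsupported input: $input" >&2; return 1 ;;
    esac
}
```

Here `default_target report.docx` prints `md`, `default_target report.md` prints `html`, and `default_target report.md docx` honours the override and prints `docx`.
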
## Dependencies

- **pandoc**: Document conversion engine
- **Modern browser**: For viewing generated HTML files
- **Optional**: Web server for serving files (prevents CORS issues)

## Troubleshooting

### Common Issues
1. **Template not found**: Place `viewer-template.html` in the input directory or next to the `convert` script, or pass its path with `-T` (the template path is not a `zddc.conf` key)
2. **Permission errors**: Make sure the `convert` script is executable (`chmod +x convert`)
3. **Missing output**: Check that the output directory exists or use `-o` to create it
4. **Print issues**: Use "Print to PDF" in the browser for best results

### Performance
- Large documents (>1000 pages) may take longer to render
- Consider splitting very large documents into sections
- Use batch processing for multiple files

## Examples

### Engineering Documentation
Perfect for:
- Design basis documents
- Specifications and standards
- Project requirements
- Technical procedures
- Quality documentation

### Features Optimized For
- **Professional appearance**: Clean, corporate styling
- **Technical content**: Tables, diagrams, code blocks
- **Print output**: PDF generation with proper formatting
- **Navigation**: Easy browsing of long documents
- **Sharing**: URL fragments for referencing specific sections