# ZDDC Pandoc Tools

A collection of tools for converting Markdown documents to HTML with a professional viewer interface, optimized for technical and engineering documentation.

## Server-side conversion (`zddc-server`)

> The shell scripts in this folder are standalone CLI/batch tools. `zddc-server`
> implements its **own** on-demand conversion (Go package `zddc/internal/convert`)
> and does **not** call these scripts. It does, however, reuse the same
> `viewer-template.html` and `custom.css` (embedded at build time). See
> AGENTS.md → "Server-side document conversion" for the authoritative reference.

zddc-server can render any served `.md` on demand: requesting the sibling URL
`<path>/foo.docx` (or `.html` / `.pdf`) returns the converted bytes — no query
string. A real on-disk file of that name always wins; the virtual conversion
only fires when the requested file doesn't exist but `foo.md` does. The browse
app's markdown editor surfaces these as DOCX/HTML/PDF download links (auto-saving
a dirty buffer first so the output matches what's on screen).
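
The lookup order described above can be sketched as a small shell function. This is only an illustration of the stated precedence, not the server's Go code; `resolve_request` and its three output labels are hypothetical names.

```bash
# Hypothetical illustration of the lookup order described above.
# Prints "static", "convert", or "missing" for a requested filesystem path.
resolve_request() {
    local requested="$1"
    local ext="${requested##*.}"
    local source_md="${requested%.*}.md"

    # A real on-disk file of that name always wins.
    if [ -f "$requested" ]; then
        echo "static"
        return 0
    fi

    # Virtual conversion applies only to the three sibling extensions,
    # and only when the matching .md source exists.
    case "$ext" in
        docx|html|pdf)
            if [ -f "$source_md" ]; then
                echo "convert"
                return 0
            fi
            ;;
    esac

    echo "missing"
    return 1
}
```

With only `foo.md` on disk, `resolve_request foo.pdf` prints `convert`; once a real `foo.pdf` exists next to it, the same call prints `static`.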

**Architecture.** The Go code does the minimum — it `exec`s `pandoc` and
`chromium-browser` directly. The sandbox and resource caps live in the runtime
**image**, where `/usr/local/bin/{pandoc,chromium-browser}` are wrapper scripts
that run the real binary inside a per-conversion bubblewrap sandbox
(`--unshare-all`, read-only binds, `--tmpfs /tmp`, `--clearenv`) under cgroup v2
memory/PID caps. I/O is via stdin/stdout plus a per-call scratch dir. There is no
container runtime and no image pulling at request time.
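
As a rough picture of what such a wrapper might do, the sketch below assembles a bubblewrap command line from the flags named above. The bind list, the `/work` mount point, the writable scratch bind, and the `run_sandboxed` and `BWRAP` names are all assumptions for illustration; the actual wrapper scripts in the image may differ, and the cgroup v2 caps are omitted here entirely.

```bash
# Hypothetical wrapper core: run a real binary inside a bubblewrap sandbox.
# BWRAP can be overridden; everything below is a sketch, not the image's script.
BWRAP="${BWRAP:-bwrap}"

# run_sandboxed REAL_BIN SCRATCH_DIR [ARGS...]
# Flags, in order: new namespaces, empty environment, kill the sandbox if the
# parent dies, read-only system bind, private /tmp, writable per-call scratch.
run_sandboxed() {
    local real_bin="$1" scratch="$2"
    shift 2
    "$BWRAP" \
        --unshare-all \
        --clearenv \
        --die-with-parent \
        --ro-bind /usr /usr \
        --tmpfs /tmp \
        --bind "$scratch" /work \
        --chdir /work \
        --setenv PATH /usr/bin \
        "$real_bin" "$@"
}
```

A real image would likely need further read-only binds (for example library or font directories) that this sketch leaves out. Input and output would still flow over stdin/stdout, e.g. `run_sandboxed /usr/bin/pandoc "$scratch" --standalone < in.md > out.html`.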

The PDF flow is two-stage: pandoc renders the markdown through
`viewer-template.html` to standalone HTML, then headless Chromium prints that HTML
to PDF — preserving the viewer template's print-media CSS rather than going
through pandoc's LaTeX template.
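
The two stages can be approximated by hand with the stock pandoc and Chromium command lines. The sketch below uses standard options (`--standalone`, `--template`, `--headless`, `--print-to-pdf`), but whether zddc-server passes exactly these is an assumption; `md_to_pdf`, `PANDOC`, and `CHROMIUM` are illustrative names.

```bash
# Sketch of the two-stage flow; binary names can be overridden via environment.
PANDOC="${PANDOC:-pandoc}"
CHROMIUM="${CHROMIUM:-chromium-browser}"

# md_to_pdf SOURCE.md OUTPUT.pdf [TEMPLATE]
md_to_pdf() {
    local src="$1" out="$2" template="${3:-viewer-template.html}"
    local scratch html
    scratch="$(mktemp -d)" || return 1
    html="$scratch/page.html"

    # Stage 1: markdown -> standalone HTML through the viewer template.
    # Stage 2: headless Chromium prints that HTML, applying its print-media CSS.
    "$PANDOC" --standalone --template="$template" --output="$html" "$src" &&
        "$CHROMIUM" --headless --print-to-pdf="$out" "file://$html"
    local status=$?

    # Remove the intermediate HTML whether or not the conversion succeeded.
    rm -rf "$scratch"
    return "$status"
}
```

For example, `md_to_pdf report.md report.pdf` leaves only the PDF behind; on Debian, set `CHROMIUM=chromium` first.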

Converted bytes are cached at `<dir>/.zddc.d/converted/<base>.<ext>` with mtime
synced to the source, so a fresh cache hit is a stat-and-serve with no `exec`.
A PUT/DELETE/MOVE on the source `.md` purges the sidecars. Per-project header
metadata (client/project/contractor/project_number) comes from the `.zddc`
`convert:` cascade; title/tracking_number/revision/status are derived from the
filename via `zddc.ParseFilename`.
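
The cache layout and freshness rule can be mimicked in shell as below. This is a sketch under stated assumptions: `<base>` is taken to be the source filename without `.md`, `touch -r` stands in for the server's mtime sync, and the three function names are hypothetical.

```bash
# cache_path SOURCE.md EXT -> <dir>/.zddc.d/converted/<base>.<ext>
cache_path() {
    local src="$1" ext="$2"
    local dir base
    dir="$(dirname "$src")"
    base="$(basename "$src" .md)"
    echo "$dir/.zddc.d/converted/$base.$ext"
}

# store_converted SOURCE.md CACHED PRODUCED
# Copies the converted bytes into the cache and syncs mtime to the source.
store_converted() {
    local src="$1" cached="$2" produced="$3"
    mkdir -p "$(dirname "$cached")" &&
        cp "$produced" "$cached" &&
        touch -r "$src" "$cached"
}

# cache_is_fresh SOURCE.md CACHED
# Fresh means both files exist and neither is newer than the other.
cache_is_fresh() {
    local src="$1" cached="$2"
    [ -f "$src" ] && [ -f "$cached" ] || return 1
    [ ! "$src" -nt "$cached" ] && [ ! "$cached" -nt "$src" ]
}
```

Any later write to the source changes its mtime, so `cache_is_fresh` fails until the output is regenerated and stored again.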

Relevant flags (defaults in parens):

- `--convert-pandoc-binary` (`pandoc`) / `--convert-chromium-binary`
  (`chromium-browser`; `chromium` on Debian) — PATH-resolved name or absolute path
- `--convert-scratch-dir` (`$TMPDIR`) — host scratch root for template + intermediates
- `--convert-mem-mib` (`1024`) — per-conversion memory cap (cgroup `memory.max`)
- `--convert-pids` (`256`) — per-conversion PID cap (cgroup `pids.max`)
- `--convert-timeout` (`60s`) — per-conversion wall clock (Go `context.WithTimeout`)

If `pandoc`/`chromium` aren't on PATH (e.g. running zddc-server outside the runtime
image) the endpoint serves 503 with a `Retry-After`; the rest of the server keeps
working. Running against raw pandoc/chromium with no wrapper gives a working but
**unsandboxed** endpoint — fine for dev iteration.
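
When running outside the runtime image, a quick preflight check can show in advance whether the conversion endpoint is likely to work. `check_convert_tools` is a hypothetical helper, not part of zddc-server; it only tests whether the two names resolve the way a PATH lookup would.

```bash
# check_convert_tools [PANDOC_NAME] [CHROMIUM_NAME]
# Returns 0 if both names resolve (PATH lookup or absolute path), 1 otherwise.
check_convert_tools() {
    local missing=0 tool
    for tool in "${1:-pandoc}" "${2:-chromium-browser}"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not found: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example (Debian names its binary "chromium"; other server flags omitted):
#   check_convert_tools pandoc chromium &&
#       zddc-server --convert-chromium-binary chromium --convert-timeout 60s
```

A passing check says nothing about sandboxing: raw binaries found this way still give the unsandboxed behavior described above.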

## Features

### Document Conversion (`convert`)
- **Batch processing**: Convert multiple Markdown files at once
- **Force overwrite**: `-f` flag to overwrite existing output files
- **Custom output directory**: `-o` flag to specify output location
- **Configuration-driven**: Uses `zddc.conf` for project-specific settings
- **Template integration**: Automatically applies the viewer template
- **Progress tracking**: Real-time conversion status and summary

### Professional Viewer Template (`viewer-template.html`)
- **Modern responsive design**: Works on desktop, tablet, and mobile
- **Table of Contents (TOC)**: Auto-generated sidebar navigation with smooth scrolling
- **Print optimization**: Professional formatting for PDF generation
  - Page break controls for tables
  - Repeating table headers
  - Proper page numbering
  - Clean print layout
- **URL hash navigation**: Shareable links to specific document sections
- **Mobile-friendly**: Collapsible sidebar and touch-optimized interface
- **Professional styling**: Clean typography optimized for technical documents

## Usage

### Basic Conversion
```bash
# Convert all Markdown files in current directory
./convert *.md

# Convert with force overwrite
./convert -f *.md

# Convert to specific output directory
./convert -o rendered/ *.md

# Combine flags
./convert -f -o rendered/ *.md
```

### Configuration (`zddc.conf`)
Create a `zddc.conf` file in your project directory. It is **sourced as shell**,
so use `var="value"` syntax (no spaces around `=`). Only these four variables are
read; all are optional and feed the document header via pandoc `--variable`:
```sh
contractor="Contractor Name"   # contracting organization (header)
client="Client Name"           # client org (header, paired with project)
project="Project Name"         # full project name
project_number="AR 28088"      # shown in parentheses after the project name
```
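
One way a script can turn a sourced `zddc.conf` into pandoc arguments is sketched below. It is not taken from the `convert` script itself: `pandoc_header_args` is a hypothetical helper, and only the four variable names and pandoc's `--variable=KEY:VALUE` option come from documented behavior.

```bash
# pandoc_header_args CONF_FILE
# Prints one --variable=KEY:VALUE argument per line for each non-empty setting.
pandoc_header_args() {
    local conf="$1"
    # Subshell: sourced variables do not leak into the caller's environment.
    (
        contractor="" client="" project="" project_number=""
        if [ -f "$conf" ]; then
            . "$conf"
        fi
        for key in contractor client project project_number; do
            # The key list is fixed above, so this eval only reads known names.
            eval "value=\$$key"
            if [ -n "$value" ]; then
                printf '%s=%s:%s\n' --variable "$key" "$value"
            fi
        done
    )
}
```

Because the file is sourced, anything in it runs as shell code, so a `zddc.conf` should be treated as trusted input.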
The template path is discovered automatically (input dir → script dir →
symlink target) or set per-run with `-T`; the output directory is set with `-o`.
They are **not** `zddc.conf` keys.
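
The discovery order can be pictured with the sketch below. It assumes "symlink target" means the directory of the file that the `convert` symlink resolves to, and that the template keeps the name `viewer-template.html`; `find_template` is a hypothetical name, and `readlink -f` may be unavailable on older non-GNU systems.

```bash
# find_template INPUT_FILE SCRIPT_PATH
# Prints the first viewer-template.html found in: the input file's directory,
# the script's directory, then the directory of the script's symlink target.
find_template() {
    local input_file="$1" script_path="$2" name="viewer-template.html"
    local dir
    for dir in \
        "$(dirname "$input_file")" \
        "$(dirname "$script_path")" \
        "$(dirname "$(readlink -f "$script_path")")"
    do
        if [ -f "$dir/$name" ]; then
            echo "$dir/$name"
            return 0
        fi
    done
    return 1
}
```

A template dropped next to the input documents therefore takes precedence over the one shipped with the script, which is what makes per-project customization work without `-T`.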

### Directory Structure
```
your-project/
├── zddc.conf              # Configuration file
├── document1.md           # Source Markdown files
├── document2.md
└── rendered/              # Generated HTML files
    ├── document1.html
    └── document2.html
```

## Template Features

### Navigation
- **TOC Generation**: Automatically creates navigation from document headings
- **Smooth Scrolling**: Click TOC items for smooth navigation to sections
- **Hash URLs**: Address bar updates with section anchors for sharing
- **Mobile Menu**: Collapsible sidebar for mobile devices

### Print Styling
- **Page Breaks**: Tables are kept on a single page where they fit
- **Header Repetition**: Table headers repeat on each page
- **Professional Layout**: Optimized margins and typography
- **Page Numbers**: Sequential page numbering in footer

### Responsive Design
- **Desktop**: Full sidebar with TOC always visible
- **Tablet**: Collapsible sidebar with overlay
- **Mobile**: Hamburger menu with full-screen TOC overlay

## Advanced Usage

### Custom Templates
You can customize the viewer template by:
1. Copying `viewer-template.html` to your project
2. Modifying the CSS and HTML structure
3. Passing the custom template to `convert` with `-T`, or placing it where auto-discovery finds it (the template path is not a `zddc.conf` key)

### Batch Processing
For large document sets:
```bash
# Process all markdown files recursively
find . -name "*.md" -exec ./convert -f -o rendered/ {} +

# Process specific document types
./convert -f -o rendered/ *-SOW-*.md *-DBD-*.md
```

### Integration with Build Systems
The convert tool returns proper exit codes and can be integrated into CI/CD pipelines:
```bash
# In a build script
if ./convert -f -o dist/ *.md; then
    echo "Documentation built successfully"
else
    echo "Documentation build failed" >&2
    exit 1
fi
```

## File Types Supported

- **Input**: Markdown (`.md`), DOCX (`.docx`), and HTML (`.html`/`.htm`) files
  (auto-detected: DOCX→MD, MD→HTML, HTML→MD; override with `-t md|html|docx`).
  Direct DOCX→HTML is not supported — convert to MD first.
- **Output**: HTML files with embedded CSS and JavaScript (plus MD and DOCX targets)
- **Images**: Supports embedded images and diagrams
- **Tables**: Full table support with print optimization
- **Code**: Syntax highlighting for code blocks

## Dependencies

- **pandoc**: Document conversion engine
- **Modern browser**: For viewing generated HTML files
- **Optional**: Web server for serving files (prevents CORS issues)

## Troubleshooting

### Common Issues
1. **Template not found**: Make sure `viewer-template.html` is in the input directory or next to the `convert` script, or pass its path with `-T` (it is not a `zddc.conf` key)
2. **Permission errors**: Make sure `convert` script is executable (`chmod +x convert`)
3. **Missing output**: Check that output directory exists or use `-o` to create it
4. **Print issues**: Use "Print to PDF" in browser for best results

### Performance
- Large documents (>1000 pages) may take longer to render
- Consider splitting very large documents into sections
- Use batch processing for multiple files

## Examples

### Engineering Documentation
Well suited to:
- Design basis documents
- Specifications and standards
- Project requirements
- Technical procedures
- Quality documentation

### Features Optimized For
- **Professional appearance**: Clean, corporate styling
- **Technical content**: Tables, diagrams, code blocks
- **Print output**: PDF generation with proper formatting
- **Navigation**: Easy browsing of long documents
- **Sharing**: URL fragments for referencing specific sections