fix(pandoc): correctness, robustness & doc cleanup of convert tools

Audit-driven cleanup of the standalone pandoc/ CLI tools (no changes to
the server's own zddc/internal/convert engine).

convert:
- DOCX→MD now reads lowercase client/project from zddc.conf (was $CLIENT/
  $PROJECT, always empty)
- ZDDC filename parsing now goes through a shared parse_zddc_filename
  helper that extracts each field with its own backref, so a '|' in the
  title no longer truncates it (was cut -d'|')
- drop duplicate --section-divs and no-op --id-prefix=
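
A minimal sketch of the truncation described above, assuming an invented filename stem (`DOC-001_A (IFR) - Pumps | Valves`). The sed pattern is the one used in the convert script; the variable names `old_title` and `new_title` are hypothetical and exist only for this illustration.

```shell
# Illustrative only: contrasts join-then-cut with per-field extraction.
stem='DOC-001_A (IFR) - Pumps | Valves'
pattern='s/^\([^_]*\)_\([^ ]*\) *(\([^)]*\)) *- *\(.*\)$'

# Old approach: join the four fields with '|' and split them again with cut.
# A '|' inside the title becomes an extra delimiter, so field 4 is cut short.
joined=$(printf '%s\n' "$stem" | sed -n "${pattern}/\\1|\\2|\\3|\\4/p")
old_title=$(printf '%s\n' "$joined" | cut -d'|' -f4)

# New approach: one sed call per field, so there is no join character at all.
new_title=$(printf '%s\n' "$stem" | sed -n "${pattern}/\\4/p")

printf 'old: [%s]\n' "$old_title"
printf 'new: [%s]\n' "$new_title"
```

The per-field form costs one extra sed process per field, which seems a reasonable trade for not having to pick a delimiter that can never appear in a title.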

convert-diff:
- replace hardcoded "(AR 28088)" in the diff header with the configured
  $project_number (omitted when unset)
- only pass --template when one was found (empty --template= errors out)
- drop the false "Loading ZDDC configuration" log and the sed quote-escape
  that leaked backslashes into custom_header
- remove dead REV_A/REV_B and rev*_date extraction; fix usage typo;
  pin LC_TIME=C on date calls
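
A short sketch of two of the convert-diff fixes, under stated assumptions: `project_line` and `template_flag` are hypothetical helper names, and the sample values are invented. Only the `${var:+...}` expansion, the skipped `--template` flag, and the `LC_TIME=C` prefix mirror the change itself.

```shell
# Illustrative only: helper names and sample values are made up.

# $1 = project name, $2 = project number (may be empty).
# ${2:+ ($2)} expands to " (number)" only when $2 is set and non-empty.
project_line() {
    printf '%s%s\n' "$1" "${2:+ ($2)}"
}

# Print a --template flag only when a template path was actually found.
template_flag() {
    if [ -n "$1" ]; then
        printf '%s\n' "--template=$1"
    fi
}

with_number=$(project_line "Project Name" "AR 28088")
without_number=$(project_line "Project Name" "")
flag_found=$(template_flag "/path/to/viewer-template.html")
flag_missing=$(template_flag "")

# Pin the time locale so month names do not depend on the caller's locale.
stamp=$(LC_TIME=C date '+%B %d, %Y')

printf '%s\n%s\n' "$with_number" "$without_number"
printf 'found=[%s] missing=[%s]\n' "$flag_found" "$flag_missing"
printf '%s\n' "$stamp"
```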

index.sh:
- relative_path passes paths to python via argv (no -c interpolation) and
  uses realpath --relative-to as the fallback instead of an absolute path
- escape '|' in title/status before emitting the markdown table row
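
The two index.sh fixes can be sketched as follows. `escape_md_cell` and `rel_path` are hypothetical names and the paths are invented; the sed expression and the argv-based `python3 -c` call are the ones used in the script.

```shell
# Illustrative only: helper names and sample paths are made up.

# Escape '|' so a title or status cannot add columns to a markdown table row.
escape_md_cell() {
    printf '%s' "$1" | sed 's/|/\\|/g'
}

# $1 = target path, $2 = base dir. Both reach Python as sys.argv entries,
# so a quote in either path is plain data rather than part of the source.
rel_path() {
    python3 -c 'import os, sys; print(os.path.relpath(sys.argv[1], sys.argv[2]))' "$1" "$2"
}

cell=$(escape_md_cell 'Pumps | Valves')
printf '| 1 | %s |\n' "$cell"

# A single quote in the path would have broken the old interpolated snippet.
if command -v python3 >/dev/null 2>&1; then
    rel_path "/srv/docs/it's here/spec.md" "/srv/index"
fi
```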

README:
- rewrite the stale server-side section to match the real binary+bubblewrap
  design and flags/defaults (was a non-existent podman/docker/image design)
- fix the invalid zddc.conf example (sourced shell, four real vars) and the
  understated input-format list

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ZDDC 2026-06-04 10:53:26 -05:00
parent 613092b30e
commit d10cd23076
4 changed files with 132 additions and 98 deletions


@@ -4,41 +4,52 @@ A collection of tools for converting Markdown documents to HTML with a professio
 ## Server-side conversion (`zddc-server`)

-zddc-server can offer the same conversions on demand: a `.md` file in any
-served directory becomes downloadable as `.docx`, `.html`, and `.pdf` via the
-`?convert=` query parameter, surfaced as Download buttons in the browse app's
-markdown editor.
+> The shell scripts in this folder are standalone CLI/batch tools. `zddc-server`
+> implements its **own** on-demand conversion (Go package `zddc/internal/convert`)
+> and does **not** call these scripts. It does, however, reuse the same
+> `viewer-template.html` and `custom.css` (embedded at build time). See
+> AGENTS.md → "Server-side document conversion" for the authoritative reference.

-The server shells out to two upstream container images, pulling each on
-first use via `--pull=missing`. No custom image build is required —
-operators just install `podman` (preferred) or `docker`, and the first
-conversion request pulls the image:
+zddc-server can render any served `.md` on demand: requesting the sibling URL
+`<path>/foo.docx` (or `.html` / `.pdf`) returns the converted bytes — no query
+string. A real on-disk file of that name always wins; the virtual conversion
+only fires when the requested file doesn't exist but `foo.md` does. The browse
+app's markdown editor surfaces these as DOCX/HTML/PDF download links (auto-saving
+a dirty buffer first so the output matches what's on screen).

-- `docker.io/pandoc/latex:latest` — MD → DOCX and MD → HTML
-(override: `--convert-pandoc-image=` or `ZDDC_CONVERT_PANDOC_IMAGE`;
-switch to `docker.io/pandoc/core:latest` for a ~90% size reduction
-if you don't need pandoc's native LaTeX-PDF path)
-- `docker.io/zenika/alpine-chrome:latest` — HTML → PDF
-(override: `--convert-chromium-image=` or `ZDDC_CONVERT_CHROMIUM_IMAGE`)
+**Architecture.** The Go code does the minimum — it `exec`s `pandoc` and
+`chromium-browser` directly. The sandbox and resource caps live in the runtime
+**image**, where `/usr/local/bin/{pandoc,chromium-browser}` are wrapper scripts
+that run the real binary inside a per-conversion bubblewrap sandbox
+(`--unshare-all`, read-only binds, `--tmpfs /tmp`, `--clearenv`) under cgroup v2
+memory/PID caps. I/O is via stdin/stdout plus a per-call scratch dir. There is no
+container runtime and no image pulling at request time.

 The PDF flow is two-stage: pandoc renders the markdown through
-`viewer-template.html` to standalone HTML, then headless Chromium
-prints that HTML to PDF. This preserves the existing print-media CSS
-authored for the viewer template rather than going through pandoc's
-LaTeX template.
+`viewer-template.html` to standalone HTML, then headless Chromium prints that HTML
+to PDF — preserving the viewer template's print-media CSS rather than going
+through pandoc's LaTeX template.

-If neither podman nor docker is on PATH the endpoint serves 503 with
-a clear "no container runtime" message. Engine choice is overridable
-via `--convert-engine=` or `ZDDC_CONVERT_ENGINE`.
+Converted bytes are cached at `<dir>/.zddc.d/converted/<base>.<ext>` with mtime
+synced to the source, so a fresh cache hit is a stat-and-serve with no `exec`.
+A PUT/DELETE/MOVE on the source `.md` purges the sidecars. Per-project header
+metadata (client/project/contractor/project_number) comes from the `.zddc`
+`convert:` cascade; title/tracking_number/revision/status are derived from the
+filename via `zddc.ParseFilename`.

-Resource limits are per-container and configurable: `--convert-mem-mib`
-(default 512), `--convert-cpus` (default "2"), `--convert-pids`
-(default 100), `--convert-timeout` (default 30s).
+Relevant flags (defaults in parens):

-Each conversion runs in a throw-away container with
-`--rm --network=none --read-only --tmpfs=/tmp --cap-drop=ALL
---security-opt=no-new-privileges` plus a bind-mounted scratch dir
-for I/O (read-only for the template; read-write for the PDF output).
+- `--convert-pandoc-binary` (`pandoc`) / `--convert-chromium-binary`
+(`chromium-browser`; `chromium` on Debian) — PATH-resolved name or absolute path
+- `--convert-scratch-dir` (`$TMPDIR`) — host scratch root for template + intermediates
+- `--convert-mem-mib` (`1024`) — per-conversion memory cap (cgroup `memory.max`)
+- `--convert-pids` (`256`) — per-conversion PID cap (cgroup `pids.max`)
+- `--convert-timeout` (`60s`) — per-conversion wall clock (Go `context.WithTimeout`)

+If `pandoc`/`chromium` aren't on PATH (e.g. running zddc-server outside the runtime
+image) the endpoint serves 503 with a `Retry-After`; the rest of the server keeps
+working. Running against raw pandoc/chromium with no wrapper gives a working but
+**unsandboxed** endpoint — fine for dev iteration.
+
 ## Features
@@ -80,20 +91,18 @@ for I/O (read-only for the template; read-write for the PDF output).
 ```

 ### Configuration (`zddc.conf`)
-Create a `zddc.conf` file in your project directory:
-```ini
-# Project metadata
-title = "Project Documentation"
-author = "Your Organization"
-date = "2024"
-
-# Template settings
-template = "/path/to/viewer-template.html"
-css = "custom-styles.css"
-
-# Output settings
-output_dir = "rendered"
+Create a `zddc.conf` file in your project directory. It is **sourced as shell**,
+so use `var="value"` syntax (no spaces around `=`). Only these four variables are
+read; all are optional and feed the document header via pandoc `--variable`:
+```sh
+contractor="Contractor Name" # contracting organization (header)
+client="Client Name" # client org (header, paired with project)
+project="Project Name" # full project name
+project_number="AR 28088" # shown in parentheses after the project name
 ```

+The template path is discovered automatically (input dir → script dir →
+symlink target) or set per-run with `-T`; the output directory is set with `-o`.
+They are **not** `zddc.conf` keys.
+
 ### Directory Structure
 ```
@@ -157,8 +166,10 @@ fi
 ## File Types Supported

-- **Input**: Markdown (`.md`) files with pandoc extensions
-- **Output**: HTML files with embedded CSS and JavaScript
+- **Input**: Markdown (`.md`), DOCX (`.docx`), and HTML (`.html`/`.htm`) files
+(auto-detected: DOCX→MD, MD→HTML, HTML→MD; override with `-t md|html|docx`).
+Direct DOCX→HTML is not supported — convert to MD first.
+- **Output**: HTML files with embedded CSS and JavaScript (plus MD and DOCX targets)
 - **Images**: Supports embedded images and diagrams
 - **Tables**: Full table support with print optimization
 - **Code**: Syntax highlighting for code blocks


@@ -124,6 +124,23 @@ SUCCESSFUL=0
 FAILED=0
 SKIPPED=0

+# Parse a ZDDC filename stem (no extension) into ZDDC_TRACKING / ZDDC_REVISION /
+# ZDDC_STATUS / ZDDC_TITLE. Returns 0 on a full match, 1 otherwise.
+# Each field is extracted with its own sed backref rather than a delimiter-joined
+# string + cut, so a title containing the join character (e.g. '|') can't corrupt
+# the split.
+parse_zddc_filename() {
+local stem="$1"
+local sub='s/^\([^_]*\)_\([^ ]*\) *(\([^)]*\)) *- *\(.*\)$'
+# Gate on a full match before extracting (empty fields are otherwise ambiguous).
+printf '%s\n' "$stem" | grep -Eq '^[^_]+_[^ ]+ *\([^)]*\) *- *.+$' || return 1
+ZDDC_TRACKING=$(printf '%s\n' "$stem" | sed -n "${sub}/\\1/p")
+ZDDC_REVISION=$(printf '%s\n' "$stem" | sed -n "${sub}/\\2/p")
+ZDDC_STATUS=$(printf '%s\n' "$stem" | sed -n "${sub}/\\3/p")
+ZDDC_TITLE=$(printf '%s\n' "$stem" | sed -n "${sub}/\\4/p")
+return 0
+}
+
 # Function to convert DOCX to Markdown
 convert_docx_to_md() {
 local INPUT="$1"
@@ -137,14 +154,12 @@ convert_docx_to_md() {
 if pandoc -f docx -t gfm --markdown-headings=atx --extract-media="$MEDIA_DIR" --wrap=none --standalone "$INPUT" -o "$TEMP_FILE"; then
 # Parse ZDDC filename pattern: trackingNumber_revision (status) - title.extension
-# Use sed to extract ZDDC components
-ZDDC_MATCH=$(echo "$FILENAME_NO_EXT" | sed -n 's/^\([^_]*\)_\([^ ]*\) *(\([^)]*\)) *- *\(.*\)$/\1|\2|\3|\4/p')
-if [ -n "$ZDDC_MATCH" ]; then
-TRACKING_NUMBER=$(echo "$ZDDC_MATCH" | cut -d'|' -f1)
-REVISION=$(echo "$ZDDC_MATCH" | cut -d'|' -f2)
-STATUS=$(echo "$ZDDC_MATCH" | cut -d'|' -f3)
-TITLE=$(echo "$ZDDC_MATCH" | cut -d'|' -f4)
+if parse_zddc_filename "$FILENAME_NO_EXT"; then
+TRACKING_NUMBER="$ZDDC_TRACKING"
+REVISION="$ZDDC_REVISION"
+STATUS="$ZDDC_STATUS"
+TITLE="$ZDDC_TITLE"

 echo " → ZDDC metadata detected:"
 echo " • Tracking: $TRACKING_NUMBER"
 echo " • Revision: $REVISION"
@@ -154,8 +169,8 @@ convert_docx_to_md() {
 # Create YAML front matter and combine with content
 {
 echo "---"
-echo "client: \"${CLIENT:-}\""
-echo "project: \"${PROJECT:-}\""
+echo "client: \"${client:-}\""
+echo "project: \"${project:-}\""
 echo "tracking_number: \"$TRACKING_NUMBER\""
 echo "revision: \"$REVISION\""
 echo "status: \"$STATUS\""
@@ -293,8 +308,8 @@ convert_md_to_html() {
 ORIGINAL_DIR=$(pwd)
 cd "$INPUT_DIR"

-# Build pandoc command using positional arguments (safe approach, no eval)
-# Space-separated argument array, avoids shell injection
+# Build pandoc command as an argument array (safe form, no eval — each value
+# is a separate array element so it can't be re-split or injected by the shell).
 PANDOC_ARGS=()
 PANDOC_ARGS+=("--from" "markdown+yaml_metadata_block")
 PANDOC_ARGS+=("--standalone")
@@ -315,13 +330,12 @@ convert_md_to_html() {
 # Extract ZDDC metadata from filename for template variables
 FILENAME_NO_EXT=$(basename "$INPUT" .md)
-ZDDC_MATCH=$(echo "$FILENAME_NO_EXT" | sed -n 's/^\([^_]*\)_\([^ ]*\) *(\([^)]*\)) *- *\(.*\)$/\1|\2|\3|\4/p')
-if [ -n "$ZDDC_MATCH" ]; then
-TRACKING_NUMBER=$(echo "$ZDDC_MATCH" | cut -d'|' -f1)
-REVISION=$(echo "$ZDDC_MATCH" | cut -d'|' -f2)
-STATUS=$(echo "$ZDDC_MATCH" | cut -d'|' -f3)
-TITLE=$(echo "$ZDDC_MATCH" | cut -d'|' -f4)
+if parse_zddc_filename "$FILENAME_NO_EXT"; then
+TRACKING_NUMBER="$ZDDC_TRACKING"
+REVISION="$ZDDC_REVISION"
+STATUS="$ZDDC_STATUS"
+TITLE="$ZDDC_TITLE"

 # Pass ZDDC variables to template (each as separate args to avoid injection)
 PANDOC_ARGS+=("--variable" "tracking_number=$TRACKING_NUMBER")
 PANDOC_ARGS+=("--variable" "revision=$REVISION")
@@ -357,11 +371,10 @@ convert_md_to_html() {
 PANDOC_ARGS+=("--variable" "no-toc=true")
 fi

-PANDOC_ARGS+=("--section-divs")
-PANDOC_ARGS+=("--id-prefix=")
+# (--section-divs already added above)
 PANDOC_ARGS+=("--html-q-tags")

-# Run pandoc with positional arguments (safe approach)
+# Run pandoc with positional arguments (safe form, no eval)
 # All variables passed as separate arguments to avoid shell injection
 if pandoc "$(basename "$INPUT_ABS")" -o "$OUTPUT_ABS" "${PANDOC_ARGS[@]}"; then


@@ -11,7 +11,7 @@ NO_TOC=false
 show_help() {
 echo "Batch Markdown Diff Converter"
 echo "Compares pairs of markdown files and outputs HTML diffs using the same template as convert script"
-echo "Usage: $0 [-f] [-o outputdir] [-T template] [--no-toc] file1_rev_a.md file1_rev_b.md [file2_rev_a.md file1_rev_b.md ...]"
+echo "Usage: $0 [-f] [-o outputdir] [-T template] [--no-toc] file1_rev_a.md file1_rev_b.md [file2_rev_a.md file2_rev_b.md ...]"
 echo " -f: Force overwrite existing output files"
 echo " -o: Output directory (default: same as first input file)"
 echo " -T: Template file path (default: viewer-template.html)"
@@ -350,11 +350,10 @@ while [ $# -gt 0 ]; do
 fi

 # Load ZDDC configuration from first file's directory
+# (load_zddc_config logs the path itself, but only when a config is found)
 FILE1_DIR=$(dirname "$FILE1")
 load_zddc_config "$FILE1_DIR"
-echo " → Loading ZDDC configuration from: $FILE1_DIR/zddc.conf"

 # Determine template to use
 TEMPLATE_ABS=""
 if [ -n "$CUSTOM_TEMPLATE" ]; then
@@ -423,11 +422,7 @@ while [ $# -gt 0 ]; do
 echo " ✓ Diff generated successfully"
 echo "Stage 2: Adding TOC and styling with pandoc..."

-# Extract revision info from filenames for metadata
-REV_A=$(basename "$FILE1" .md | sed 's/.*_\([^_]*\)$/\1/')
-REV_B=$(basename "$FILE2" .md | sed 's/.*_\([^_]*\)$/\1/')
-
 # Extract metadata from both files (safe - no eval, uses heredoc)
 {
 # Extract YAML frontmatter and parse fields safely
@@ -437,7 +432,6 @@ while [ $# -gt 0 ]; do
 rev1_revision=$(grep '^revision:' "$TEMP_METADATA_REV1" | sed 's/^revision: *"\(.*\)"$/\1/' | head -1)
 rev1_status=$(grep '^status:' "$TEMP_METADATA_REV1" | sed 's/^status: *"\(.*\)"$/\1/' | head -1)
 rev1_project=$(grep '^project:' "$TEMP_METADATA_REV1" | sed 's/^project: *"\(.*\)"$/\1/' | head -1)
-rev1_date=$(grep '^date:' "$TEMP_METADATA_REV1" | sed 's/^date: *"\(.*\)"$/\1/' | head -1)
 }
 {
 awk '/^---$/{if(NR==1){p=1}else{p=0}} p && !/^---$/{print}' "$FILE2" > "$TEMP_METADATA_REV2"
@@ -446,7 +440,6 @@ while [ $# -gt 0 ]; do
 rev2_revision=$(grep '^revision:' "$TEMP_METADATA_REV2" | sed 's/^revision: *"\(.*\)"$/\1/' | head -1)
 rev2_status=$(grep '^status:' "$TEMP_METADATA_REV2" | sed 's/^status: *"\(.*\)"$/\1/' | head -1)
 rev2_project=$(grep '^project:' "$TEMP_METADATA_REV2" | sed 's/^project: *"\(.*\)"$/\1/' | head -1)
-rev2_date=$(grep '^date:' "$TEMP_METADATA_REV2" | sed 's/^date: *"\(.*\)"$/\1/' | head -1)
 }

 # Clean up metadata temp files
@@ -456,8 +449,9 @@ while [ $# -gt 0 ]; do
 generate_diff_header() {
 local header_html=""

-# Project title (should be same for both)
-header_html="<div class=\"header-line client-project\">$rev2_project (AR 28088)</div>"
+# Project title (should be same for both). Append the project number from
+# zddc.conf when set, e.g. "Project Name (AR 28088)"; omit the parens otherwise.
+header_html="<div class=\"header-line client-project\">${rev2_project}${project_number:+ ($project_number)}</div>"

 # Document title with diff
 if [ "$rev1_title" != "$rev2_title" ]; then
@@ -490,7 +484,7 @@ while [ $# -gt 0 ]; do
 # Add draft marker if revision contains ~
 if echo "$rev2_revision" | grep -q "~"; then
-header_html="$header_html<div class=\"header-line metadata-line draft-line\"><span class=\"draft-status\">[DRAFT Generated at $(date '+%B %d, %Y at %I:%M:%S %p %Z')]</span></div>"
+header_html="$header_html<div class=\"header-line metadata-line draft-line\"><span class=\"draft-status\">[DRAFT Generated at $(LC_TIME=C date '+%B %d, %Y at %I:%M:%S %p %Z')]</span></div>"
 fi

 echo "$header_html"
@@ -498,23 +492,29 @@ while [ $# -gt 0 ]; do
 DIFF_HEADER_HTML=$(generate_diff_header)

-# Generate timestamp for conversion
-GENERATION_TIME=$(date '+%B %d, %Y at %I:%M:%S %p %Z')
+# Generate timestamp for conversion (force English locale, matching convert)
+GENERATION_TIME=$(LC_TIME=C date '+%B %d, %Y at %I:%M:%S %p %Z')

 # Set resource path to second file directory for resource resolution
 FILE2_DIR=$(dirname "$FILE2")

-# Escape HTML for safe shell usage
-ESCAPED_HEADER_HTML=$(printf '%s' "$DIFF_HEADER_HTML" | sed 's/"/\\"/g')
-
-# Build pandoc command as array (not string with eval)
+# Build pandoc command as array (not string with eval). Header HTML is passed
+# as a single array element below, so no shell escaping is needed — escaping the
+# quotes here would leak backslashes into the rendered output.
 PANDOC_ARGS=(
 "pandoc" "$TEMP_DIFF" "-o" "$OUTPUT_FILE"
 "--from" "html"
 "--standalone"
-"--template=$TEMPLATE_ABS"
 )

+# Only pass --template when one was actually found; pandoc errors on an empty
+# --template= value, so fall back to its default template otherwise.
+if [ -n "$TEMPLATE_ABS" ]; then
+PANDOC_ARGS+=("--template=$TEMPLATE_ABS")
+else
+echo " ⚠ Warning: viewer-template.html not found, using pandoc default template"
+fi
+
 # Add TOC args if not disabled
 if [ "$NO_TOC" != "true" ]; then
 PANDOC_ARGS+=("--toc" "--toc-depth=3")
@@ -526,7 +526,7 @@ while [ $# -gt 0 ]; do
 "--metadata" "title=$rev2_title"
 "--metadata" "generation_time=$GENERATION_TIME"
 "--metadata" "diff_mode=true"
-"--metadata" "custom_header=$ESCAPED_HEADER_HTML"
+"--metadata" "custom_header=$DIFF_HEADER_HTML"
 )

 # Add ZDDC configuration variables from zddc.conf (only once)
@@ -548,7 +548,7 @@ while [ $# -gt 0 ]; do
 PANDOC_ARGS+=("--variable" "no-toc=true")
 fi

-PANDOC_ARGS+=("--section-divs" "--id-prefix=" "--html-q-tags")
+PANDOC_ARGS+=("--section-divs" "--html-q-tags")

 # Execute pandoc via array (no eval)
 if "${PANDOC_ARGS[@]}"; then


@@ -59,15 +59,21 @@ done
 mkdir -p "$OUTPUT_DIR"

 # Function to get relative path from $1 (base dir) to $2 (target path)
-# Uses Python for portability (works on both GNU and BSD systems)
+# Prefers python3 for portability (works on both GNU and BSD systems). Paths are
+# passed as argv, not interpolated into the -c source, so quotes/specials in a
+# path can't break or inject into the Python snippet.
 relative_path() {
 local base_dir="$1"
 local target_path="$2"

 if command -v python3 >/dev/null 2>&1; then
-python3 -c "import os; print(os.path.relpath('$target_path', '$base_dir'))"
+python3 -c 'import os, sys; print(os.path.relpath(sys.argv[1], sys.argv[2]))' \
+"$target_path" "$base_dir"
+elif realpath --relative-to=/ / >/dev/null 2>&1; then
+# GNU realpath supports --relative-to; keep symlink targets relative.
+realpath --relative-to="$base_dir" "$target_path"
 else
-# Fallback: use absolute paths if python3 not available
+# Last resort: absolute path (still a valid symlink target, just not relative).
 realpath "$target_path"
 fi
 }
@@ -265,9 +271,13 @@ EOF
 # Create truncated SHA256 for display
 sha256_short="${sha256:0:6}...${sha256: -6}"

+# Escape pipe chars so a title/status containing '|' can't break the table row
+md_title=$(printf '%s' "$doc_title" | sed 's/|/\\|/g')
+md_status=$(printf '%s' "$status" | sed 's/|/\\|/g')
+
 # Add to markdown table
-echo "| $row_counter | $tracking_link | $doc_title | $revision_link | $status | <span class=\"sha256\" title=\"$sha256\">$sha256_short</span> |" >> "$index_md_file"
+echo "| $row_counter | $tracking_link | $md_title | $revision_link | $md_status | <span class=\"sha256\" title=\"$sha256\">$sha256_short</span> |" >> "$index_md_file"

 echo " $filename -> symlinks created"
 done < <(find "$folder" -maxdepth 1 \( -type f -o -type l \) -print0)