ZDDC — Zero Day Document Control. A file-naming convention plus five single-file HTML tools (archive, transmittal, classifier, mdedit, landing) and an optional Go HTTP server (zddc-server) with ACL and a virtual archive index. Self-contained, offline-capable, dependency-free. See README.md for an overview, AGENTS.md and ARCHITECTURE.md for the build/release/architecture detail, bootstrap/README.md for the two-level deployment install pattern, and zddc/README.md for the HTTP server.
# ZDDC Training Data Pipeline

Adaptive LoRA fine-tuning for **Qwen3-Coder-Next** (`vllm/RedHatAI/Qwen3-Coder-Next-NVFP4`).

Training data is collected reactively: when Qwen struggles on a task and a stronger model (Sonnet/Opus) is consulted, that interaction is captured as a training example. Over time, domain-specific LoRA adapters are trained to patch Qwen's weak spots.

---

## Directory Layout

```
training-data/
├── raw/
│   └── interactions.jsonl          # All captured weak-spot interactions
├── processed/
│   ├── all.jsonl                   # Deduplicated, combined dataset
│   ├── multi-domain.jsonl          # All domains merged
│   └── <domain>.jsonl              # Per-domain splits (auto-generated)
├── validation/
│   ├── train.jsonl                 # 80% split
│   ├── val.jsonl                   # 10% split
│   └── test.jsonl                  # 10% split (never used during training)
├── adapters/
│   ├── <domain>-lora-v1/           # Trained LoRA adapter
│   └── <domain>-lora-v1-merged/    # Merged standalone model
├── snapshots/
│   └── v<date>/                    # Versioned dataset snapshots
├── collect-interaction.js          # Capture a weak-spot interaction
├── process.sh                      # Cluster raw data into domain splits
├── validate.sh                     # Check data quality before training
├── train.sh                        # Train a LoRA adapter
└── deploy.sh                       # Merge adapter into standalone model
```

---

## Workflow

### Step 1 — Collect a weak-spot interaction

When Qwen gets stuck, or when you ask it to consult Sonnet/Opus, capture the exchange:

```bash
node collect-interaction.js \
  --query "How do I parse ZDDC filenames?" \
  --qwen "[Qwen's suboptimal answer]" \
  --expert "[Sonnet's correct answer]"
```

To set the domain explicitly instead of relying on auto-detection, add `--domain`:

```bash
node collect-interaction.js \
  --query "..." \
  --qwen "..." \
  --expert "..." \
  --domain zddc-naming
```

Each raw interaction is appended to `raw/interactions.jsonl`.

### Step 2 — Process (after ~50 new interactions)

```bash
bash process.sh
```

`process.sh` deduplicates the raw interactions, clusters them by domain, and creates the 80/10/10 train/val/test splits under `validation/`.

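
The deduplicate-then-split step can be sketched in a few lines of Python. This is an illustration only, not the contents of `process.sh`: the exact-match hashing rule, the fixed shuffle seed, and the function names `dedupe` and `split_80_10_10` are all assumptions.

```python
import hashlib
import json
import random


def dedupe(records):
    """Drop records whose message list is an exact repeat (first one wins)."""
    seen, unique = set(), []
    for rec in records:
        # Assumption: two records are duplicates when their messages match exactly.
        key = hashlib.sha256(
            json.dumps(rec["messages"], sort_keys=True).encode("utf-8")
        ).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def split_80_10_10(records, seed=0):
    """Shuffle deterministically, then cut into train/val/test lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 8 // 10  # integer arithmetic avoids float rounding
    n_val = len(shuffled) // 10
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],  # test set takes the remainder
    )
```

A fixed seed keeps the test split stable between runs, which matters if `test.jsonl` is never to leak into training.
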
### Step 3 — Validate

```bash
bash validate.sh
```

`validate.sh` checks JSONL validity, domain balance, and split sizes.

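
As a rough picture of what such checks involve, the sketch below parses JSONL lines, reports bad ones by line number, and counts examples per domain. It is not the implementation of `validate.sh`; the helper names and the `max_ratio` balance rule are hypothetical.

```python
import json
from collections import Counter


def parse_jsonl(lines):
    """Return (records, errors); errors are (line_number, message) pairs."""
    records, errors = [], []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # tolerate blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict) or not isinstance(record.get("messages"), list):
            errors.append((number, "missing 'messages' list"))
            continue
        records.append(record)
    return records, errors


def domain_counts(records):
    """Count examples per metadata.domain ('unknown' when absent)."""
    return Counter(r.get("metadata", {}).get("domain", "unknown") for r in records)


def is_balanced(counts, max_ratio=5.0):
    """Hypothetical rule: the largest domain is at most max_ratio x the smallest."""
    return not counts or max(counts.values()) <= max_ratio * min(counts.values())
```
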
### Step 4 — Train

```bash
bash train.sh               # train multi-domain adapter
bash train.sh zddc-naming   # train domain-specific adapter
```

The LoRA adapter is written to `adapters/<domain>-lora-v1/`.

### Step 5 — Deploy (optional)

```bash
bash deploy.sh               # merge multi-domain adapter
bash deploy.sh zddc-naming   # merge specific adapter
```

`deploy.sh` merges the LoRA weights into the base model and saves the result as a standalone model.

---

## Training Data Format

Each line in a `.jsonl` file is one training example:

```json
{
  "messages": [
    {"role": "user", "content": "Query that exposed weakness"},
    {"role": "assistant", "content": "Qwen's original response"},
    {"role": "user", "content": "consult Sonnet"},
    {"role": "assistant", "content": "Expert's correct response"}
  ],
  "metadata": {
    "domain": "zddc-naming",
    "adapter": "lora-v1-zddc_naming",
    "timestamp": "2025-10-31T14:30:00.000Z",
    "interaction_id": "int_1735648200000_abc123",
    "source": "manual-expert-consultation"
  }
}
```

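
For illustration, a record of this shape could be assembled as in the Python sketch below. The function names are hypothetical, and two details are inferred from the single example above rather than documented: the `adapter` value (hyphens in the domain replaced by underscores) and the `interaction_id` layout (`int_`, a millisecond timestamp, a short random suffix). Serialising with `json.dumps` keeps each record on one line, which the JSONL format requires.

```python
import json
import secrets
import time
from datetime import datetime, timezone


def build_record(query, qwen_answer, expert_answer, domain, expert_name="Sonnet"):
    """Assemble one training example in the four-message format."""
    now_ms = int(time.time() * 1000)
    timestamp = (
        datetime.fromtimestamp(now_ms / 1000, tz=timezone.utc)
        .isoformat(timespec="milliseconds")
        .replace("+00:00", "Z")
    )
    return {
        "messages": [
            {"role": "user", "content": query},
            {"role": "assistant", "content": qwen_answer},
            {"role": "user", "content": f"consult {expert_name}"},
            {"role": "assistant", "content": expert_answer},
        ],
        "metadata": {
            "domain": domain,
            # Assumed naming rule, inferred from the example record.
            "adapter": "lora-v1-" + domain.replace("-", "_"),
            "timestamp": timestamp,
            # Assumed ID layout: int_<epoch ms>_<random hex>.
            "interaction_id": f"int_{now_ms}_{secrets.token_hex(3)}",
            "source": "manual-expert-consultation",
        },
    }


def to_jsonl_line(record):
    """Serialise a record as a single JSONL line (no embedded newlines)."""
    return json.dumps(record, ensure_ascii=False)
```
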
---

## Auto-Detected Domains

| Domain | Trigger keywords |
|--------|------------------|
| `zddc-naming` | zddc, trackingnumber, revision, status code |
| `html-architecture` | html, spa, single-file, es module, vanilla js |
| `build-system` | build.sh, dist/, template.html |
| `coding-debugging` | debug, error, fix, console |
| `reasoning-architecture` | reason, analyze, architecture, design |
| `general-coding` | (default) |

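
Keyword-based detection of this kind can be sketched as below. The matching rules are assumptions, not the behaviour of `collect-interaction.js`: matching is a case-insensitive substring test, and the first row of the table that matches wins. Under those rules a query mentioning `template.html` lands in `html-architecture` (because it contains `html`) before `build-system` is checked, and `spa` also matches inside words such as "space", so a real detector would likely need word boundaries or scoring.

```python
# Keyword table copied from the README, in table order.
DOMAIN_KEYWORDS = [
    ("zddc-naming", ["zddc", "trackingnumber", "revision", "status code"]),
    ("html-architecture", ["html", "spa", "single-file", "es module", "vanilla js"]),
    ("build-system", ["build.sh", "dist/", "template.html"]),
    ("coding-debugging", ["debug", "error", "fix", "console"]),
    ("reasoning-architecture", ["reason", "analyze", "architecture", "design"]),
]

DEFAULT_DOMAIN = "general-coding"


def detect_domain(text):
    """Return the first domain whose keyword appears in the text."""
    lowered = text.lower()
    for domain, keywords in DOMAIN_KEYWORDS:
        if any(keyword in lowered for keyword in keywords):
            return domain
    return DEFAULT_DOMAIN
```
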
---

## LoRA Configuration

| Parameter | Value | Notes |
|-----------|-------|-------|
| Base model | `Qwen/Qwen2.5-7B-Instruct` | Replace with Qwen3 when available on HF |
| Rank | 64 | Increase to 128 if underfitting |
| Alpha | 64 | 1:1 with rank |
| Target modules | q_proj, v_proj, k_proj, o_proj | All attention projections |
| Dropout | 0.05 | Light regularisation |
| Learning rate | 1e-4 | Cosine decay with 10% warmup |
| Epochs | 3 | Monitor val loss to catch overfitting |
| Batch size | 8 effective | 4 per-device × 2 gradient accumulation |
| Precision | bfloat16 | Requires Ampere GPU or newer |

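
The batch-size and learning-rate rows imply some arithmetic that is easy to check by hand. The Python sketch below works it through under stated assumptions: a single GPU, a linear ramp during the 10% warmup (the table says only "cosine decay with 10% warmup"), and decay to zero at the final step. The constant and function names are illustrative and are not taken from `train.sh`.

```python
import math

PER_DEVICE_BATCH = 4
GRAD_ACCUM_STEPS = 2
PEAK_LR = 1e-4
WARMUP_FRACTION = 0.10
EPOCHS = 3


def effective_batch_size(num_devices=1):
    """Examples consumed per optimizer step: 4 x 2 = 8 on one GPU."""
    return PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * num_devices


def total_steps(num_examples, num_devices=1):
    """Optimizer steps over all epochs."""
    steps_per_epoch = math.ceil(num_examples / effective_batch_size(num_devices))
    return steps_per_epoch * EPOCHS


def lr_at(step, total):
    """Learning rate at a given step: linear warmup, then cosine decay to 0."""
    warmup = max(1, int(total * WARMUP_FRACTION))
    if step < warmup:
        return PEAK_LR * step / warmup
    progress = (step - warmup) / max(1, total - warmup)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```
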
---

## Hardware Requirements

| Setup | Min VRAM | Method | Notes |
|-------|----------|--------|-------|
| Qwen-7B LoRA | 24 GB | LoRA bf16 | Recommended |
| Qwen-7B QLoRA | 16 GB | QLoRA 4-bit | Add `--load_in_4bit` flag |
| Qwen-14B LoRA | 48 GB | LoRA bf16 | Better quality |

Your system has 96 GB of VRAM, so bf16 LoRA on Qwen-14B is feasible.

---

## When to Retrain

- After every **50–100 new interactions**
- When a new domain accumulates **200+ examples**
- After a major project phase where Qwen struggled repeatedly

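
The first two triggers are countable, so they could be checked automatically. The Python sketch below is one hypothetical way to do that: it assumes you keep the set of interaction IDs and domains that the last training run used, and it takes the lower bound of the 50–100 range as the threshold. The third trigger is a judgment call and is not modelled.

```python
from collections import Counter

NEW_INTERACTIONS_THRESHOLD = 50  # lower bound of the 50-100 range
NEW_DOMAIN_THRESHOLD = 200


def retrain_reasons(records, trained_ids, trained_domains):
    """Return human-readable reasons to retrain (an empty list means wait)."""
    reasons = []
    new = [r for r in records if r["metadata"]["interaction_id"] not in trained_ids]
    if len(new) >= NEW_INTERACTIONS_THRESHOLD:
        reasons.append(f"{len(new)} new interactions")
    per_domain = Counter(r["metadata"]["domain"] for r in records)
    for domain, count in sorted(per_domain.items()):
        if domain not in trained_domains and count >= NEW_DOMAIN_THRESHOLD:
            reasons.append(f"new domain {domain} has {count} examples")
    return reasons
```
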