✅
1,087
Records Accepted
↑ +42 today
🔍
42
Review Queue
Pending review
🤖
3
Trained Models
Gen + Cls + VLM
⚡
2.1 GB
VRAM Used
5.9 GB free
📈 Last Run — PDI Generation
Completed
| Phase | Step | Result |
|---|---|---|
| Ph1 | Extract | 847 chunks |
| Ph2A | Enrich | 5-level ctx |
| Ph2B | Multimodal | Skipped |
| Ph3A | Generate | 1200 cands |
| Ph3C | Validate | 1087 OK |
Acceptance Rate: 90.6%
🎯 Validator Chain — Last Batch
| # | Validator | Passed |
|---|---|---|
| 01 | Schema | 1200 |
| 02 | Distrib | 1198 |
| 03 | Dedupe | 1151 |
| 04 | Ground | 1129 |
| 05 | Novelty | 1087 |
Rejected by stage: Schema −0 · Distribution −2 · Dedupe −47 · Grounding −22 · Novelty −42
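The per-stage rejections follow from the counts left after each validator. A minimal sketch of that arithmetic, assuming the chain runs in the order shown and starts from the 1200 generated candidates (the function name `stage_rejections` is hypothetical):

```python
from typing import List, Tuple


def stage_rejections(start: int, remaining: List[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Turn the counts left after each stage into per-stage rejection counts."""
    rejected = []
    previous = start
    for name, left in remaining:
        rejected.append((name, previous - left))
        previous = left
    return rejected


# Counts remaining after each validator, as shown in the panel.
chain = [
    ("Schema", 1200),
    ("Distrib", 1198),
    ("Dedupe", 1151),
    ("Ground", 1129),
    ("Novelty", 1087),
]

rejections = stage_rejections(1200, chain)
# Share of generated candidates that survive the whole chain, in percent.
acceptance_rate = round(100 * chain[-1][1] / 1200, 1)
```

Under these assumptions the two rejections between 1200 and 1198 belong to the distribution (quota) stage, and the schema stage rejects nothing.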
💾 Model Registry
| Model | Role | Status | VRAM |
|---|---|---|---|
| mistral-7b | Generator | Trained | 4.1 GB |
| qwen2-1.5b | Classifier | Trained | 0.9 GB |
| llama-3.2-11b | VLM | Available | 6.5 GB |
| florence-2 | Captioner | Available | 0.3 GB |
📋 Recent Activity
14:22:01 [OK] PDI run complete: 1087 accepted, 42 review queue
14:21:58 [INFO] Phase 3C: Novelty validator pass (1087/1129)
14:18:33 [INFO] Phase 3A: GPRO RL complete — 1200 records generated
14:05:12 [INFO] Phase 2B: CLIP unloaded (freed 2.1 GB VRAM)
13:58:44 [INFO] Phase 1: Extracted 847 chunks from 12 PDFs
13:55:00 [OK] DDI training complete: Generator+Classifier trained
⚙️ DDI Stage Control
1 · Generator SFT
Mistral-7B + LoRA r=16 α=32 — Supervised fine-tuning on MCQ data
2 · Reward Models
CLIP scorer + Safety scorer + Text quality scorer → ensemble
3 · Generator RL (GPRO)
GPRO-Hybrid K=4 · α=0.7 · γ=0.3 · ε=0.2
4 · Classifier SFT
Qwen2-1.5B + LoRA r=8 α=16 — Difficulty/layer classifier
5 · VLM SFT (optional)
Llama-3.2-11B-Vision + LoRA r=16 α=32
6 · VLM RL (optional)
GPRO-Hybrid on VLM · G=8 candidates
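The LoRA settings listed for stages 1, 4, and 5 can be compared with a small calculation. In the standard LoRA formulation the low-rank update is scaled by α/r, and adapting a d×k weight matrix adds r·(d + k) trainable parameters. The sketch below uses a 4096×4096 matrix purely as an assumed illustration; the actual layer shapes and target modules are not stated in the stage descriptions.

```python
def lora_scaling(r: int, alpha: int) -> float:
    """Scaling factor applied to the low-rank update in standard LoRA."""
    return alpha / r


def lora_trainable_params(r: int, d: int, k: int) -> int:
    """Parameters added by one adapter: B is d x r, A is r x k."""
    return r * (d + k)


# Settings taken from the stage list; the matrix shape is an assumption.
generator = {"r": 16, "alpha": 32}
classifier = {"r": 8, "alpha": 16}

generator_scale = lora_scaling(**generator)
classifier_scale = lora_scaling(**classifier)
generator_params = lora_trainable_params(generator["r"], 4096, 4096)
classifier_params = lora_trainable_params(classifier["r"], 4096, 4096)
```

Both configurations keep α/r at 2.0, so halving the rank for the classifier halves the adapter size without changing the update scale.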
📊 Training Progress
Overall DDI: Stage 1/4
Generator SFT · Epoch 1/3 · 72%
Loss: 0.342
LR Warmup: 100%
💾 VRAM Monitor
4.1 / 8.0 GB · 51% used
🟣 Mistral-7B 4.1 GB
🟠 LoRA 0.6 GB
⬜ Free 3.3 GB
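The breakdown above suggests a simple budget check before loading another model. This is a hedged sketch, not the application's scheduler: `free_vram` and `can_load` are hypothetical helpers, the 8.0 GB budget comes from the monitor, and the candidate sizes come from the Model Registry table.

```python
from typing import Dict


def free_vram(budget_gb: float, loaded: Dict[str, float]) -> float:
    """VRAM left after subtracting everything currently loaded."""
    return round(budget_gb - sum(loaded.values()), 1)


def can_load(budget_gb: float, loaded: Dict[str, float], size_gb: float) -> bool:
    """True if a model of size_gb fits in the remaining budget."""
    return size_gb <= free_vram(budget_gb, loaded)


loaded = {"Mistral-7B": 4.1, "LoRA": 0.6}
remaining = free_vram(8.0, loaded)
classifier_fits = can_load(8.0, loaded, 0.9)  # qwen2-1.5b
vlm_fits = can_load(8.0, loaded, 6.5)         # llama-3.2-11b
```

With the generator resident, the classifier fits in the free 3.3 GB while the VLM does not, so the VLM would only load after the generator is released.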
🔧 GPRO-Hybrid Config
α + γ must = 1.0 · current: α=0.7 + γ=0.3 = 1.0 ✓
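The sum constraint can be enforced when the configuration is built. The sketch below is an assumption about how such a check might look: the class name `GproHybridConfig` is hypothetical, the defaults mirror the values on the panel, and only the α + γ = 1.0 rule is modelled because the panel does not say what each coefficient weights.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class GproHybridConfig:
    """Hypothetical container for the GPRO-Hybrid settings shown in the UI."""
    k: int = 4
    alpha: float = 0.7
    gamma: float = 0.3
    epsilon: float = 0.2

    def __post_init__(self) -> None:
        # Compare with a tolerance: 0.7 + 0.3 is not exactly 1.0 in binary floats.
        if not math.isclose(self.alpha + self.gamma, 1.0, abs_tol=1e-9):
            raise ValueError(
                f"alpha + gamma must equal 1.0, got {self.alpha + self.gamma}"
            )


config = GproHybridConfig()

try:
    GproHybridConfig(alpha=0.7, gamma=0.4)
    invalid_rejected = False
except ValueError:
    invalid_rejected = True
```

A tolerance-based comparison avoids rejecting valid settings because of floating-point rounding.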
🪵 Training Log
14:23:11 [INFO] Epoch 1 step 144/200 — loss=0.342 lr=2.1e-4
14:23:08 [INFO] Gradient norm: 0.84
14:22:55 [OK] LoRA adapters loaded (r=16 α=32)
14:22:50 [INFO] Mistral-7B loaded (4-bit quant, 4.1 GB)
14:22:45 [INFO] Stage 1: Generator SFT started
📥 Source Documents
📄
Drop files here or click to browse
PDF · DOCX · HTML · LaTeX · Code · Images
advanced_ml_textbook.pdf
research_papers/ (12 files)
⚙️ Pipeline Config
🔄 Pipeline Progress
Running
| Phase | Step | Status |
|---|---|---|
| Phase 1 | Extract | ✓ 847 chunks |
| Phase 2A | Enrich | ✓ 5-level ctx |
| Phase 2B | Multimodal | ✓ Unloaded |
| Phase 3A | Generate | ⟳ 721/1200 |
| Phase 3C | Validate | Waiting... |
Phase 3A: GPRO-Hybrid Generation · 60.1%
Generating MCQ candidate 3/4 for chunk 180/847 · VRAM: 5.2 GB / 8.0 GB
📊 Live Counts
721
Generated
0
Review Queue
0
Rejected
479
Remaining
💾 VRAM Live
5.2 / 8.0 GB
🟣 Mistral 4.1 GB · 🟢 LoRA+CLIP 1.1 GB · Free 2.8 GB
👁️ Live Record Preview
What principle underlies token-level advantage computation in GPRO?
Which components receive individual reward signals in GPRO-Hybrid?
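| Stage | Validator | Passed | Rejected |
|---|---|---|---|
| 01 Hard Exit | Schema | 1200 | −0 failed |
| 02 Quota | Distribution | 1198 | −2 over-quota |
| 03 Dedupe | Dedup | 1151 | −47 near-dup |
| 04 Factual | Grounding | 1129 | −22 halluc. |
| 05 Diversity | Novelty | 1087 | −42 too sim. |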
🚦 Audit Gate Summary
1087
Auto Approve
≥ 90%
42
Review Queue
70–89%
71
Auto Reject
< 70%
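The three gates can be read as a threshold router over a per-record score. A minimal sketch, assuming each record carries a score in [0, 1] and that the cut-offs match the recipe's `auto_approve_threshold` and `require_review_threshold` values; the function name `route` and the returned labels are hypothetical.

```python
def route(score: float, auto_approve: float = 0.90, require_review: float = 0.70) -> str:
    """Assign a record to one of the three audit gates by its score."""
    if score >= auto_approve:
        return "auto_approve"
    if score >= require_review:
        return "review_queue"
    return "auto_reject"


decisions = [route(s) for s in (0.93, 0.90, 0.75, 0.70, 0.69)]
```

With continuous scores, anything from 0.70 up to but not including 0.90 lands in the review queue, which corresponds to the 70–89% band once scores are shown as whole percentages.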
🔏 Provenance & Manifest
"__provenance__": {
  "recipe_id": "section_1A_jogg",
  "batch_id": "uuid-4a3f...",
  "generator_version": "0.1.0",
  "spec_hash": "sha256:b1f8ae..."
}
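A provenance block of this shape could be assembled as follows. This is a sketch under stated assumptions: that `spec_hash` is the SHA-256 of the recipe text and that `batch_id` is a random UUID. Neither is confirmed by the panel, and `build_provenance` is a hypothetical helper.

```python
import hashlib
import uuid


def build_provenance(recipe_id: str, recipe_text: str, generator_version: str = "0.1.0") -> dict:
    """Build a provenance block; the hash input and ID format are assumptions."""
    digest = hashlib.sha256(recipe_text.encode("utf-8")).hexdigest()
    return {
        "__provenance__": {
            "recipe_id": recipe_id,
            "batch_id": str(uuid.uuid4()),
            "generator_version": generator_version,
            "spec_hash": f"sha256:{digest}",
        }
    }


recipe_text = "recipe_id: section_1A_jogg\nformat: jogg\n"
first = build_provenance("section_1A_jogg", recipe_text)["__provenance__"]
second = build_provenance("section_1A_jogg", recipe_text)["__provenance__"]
```

Hashing the recipe text makes the spec hash reproducible across runs, while each batch still receives its own identifier.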
1,087
Total Records
8
Difficulty Layers
93.4%
Avg Quality Score
What principle underlies token-level advantage computation in GPRO?
Which components receive individual reward signals in GPRO-Hybrid?
How does CLIP image encoding produce 512-dimensional vectors?
Showing 3 of 1,087 records
🤖 Model Registry — 4 Registered
| Model | Type | Role | Status | VRAM | Checkpoint | Actions |
|---|---|---|---|---|---|---|
| mistral-7b-instruct-v0.2 | LLM | Generator | Trained | 4.1 GB | checkpoints/gen_sft_v1/ | |
| qwen2-1.5b | LLM | Classifier | Trained | 0.9 GB | checkpoints/cls_sft_v1/ | |
| llama-3.2-11b-vision | VLM | VLM / VQA | Available | 6.5 GB | models/llama-3.2/ | |
| florence-2-base | Captioner | Image Captioning | Available | 0.3 GB | models/florence-2/ | |
📄 Built-in Recipes
section_1A_jogg.yaml
section_2B_jogg.yaml
section_3A_jogg_mini.yaml
multimodal_vqa.yaml
reward_training.yaml
rl_contexts.yaml
classification.yaml
⚙️ section_1A_jogg.yaml
# Recipe: section_1A_jogg.yaml
recipe_id: section_1A_jogg
format: jogg
layer_quota:
  "0": 50
  "1": 80
  "2": 100
dedup_threshold: 0.85
grounding_threshold: 0.70
novelty_threshold: 0.60
auto_approve_threshold: 0.90
require_review_threshold: 0.70
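A recipe like this can be sanity-checked after parsing. The sketch below works on an already-parsed dictionary so it needs no YAML library; the rules it enforces (thresholds in [0, 1], review threshold not above the auto-approve threshold, non-negative integer quotas) are assumptions about what a valid recipe looks like, and `validate_recipe` is a hypothetical helper.

```python
from typing import List

THRESHOLD_KEYS = (
    "dedup_threshold",
    "grounding_threshold",
    "novelty_threshold",
    "auto_approve_threshold",
    "require_review_threshold",
)


def validate_recipe(recipe: dict) -> List[str]:
    """Return a list of problems found in a parsed recipe (empty if none)."""
    errors = []
    for key in THRESHOLD_KEYS:
        value = recipe.get(key)
        if not isinstance(value, (int, float)) or not 0.0 <= value <= 1.0:
            errors.append(f"{key} must be a number in [0, 1]")
    if not errors and recipe["require_review_threshold"] > recipe["auto_approve_threshold"]:
        errors.append("require_review_threshold must not exceed auto_approve_threshold")
    for layer, count in recipe.get("layer_quota", {}).items():
        if not isinstance(count, int) or count < 0:
            errors.append(f"layer_quota[{layer!r}] must be a non-negative integer")
    return errors


recipe = {
    "recipe_id": "section_1A_jogg",
    "format": "jogg",
    "layer_quota": {"0": 50, "1": 80, "2": 100},
    "dedup_threshold": 0.85,
    "grounding_threshold": 0.70,
    "novelty_threshold": 0.60,
    "auto_approve_threshold": 0.90,
    "require_review_threshold": 0.70,
}
problems = validate_recipe(recipe)
swapped = dict(recipe, auto_approve_threshold=0.70, require_review_threshold=0.90)
swapped_problems = validate_recipe(swapped)
```

Returning a list of messages rather than raising on the first problem lets an editor pane show every issue at once.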
📤 Export to MokingbirdRAG
→ RAG
🔬 Export to MokingbirdFT
→ FT
🎨 Theme
🌑
Dark Pro
🟢
Matrix
☀️
Light
🖥️ Hardware
GB — prevents OOM, throttles phases
📁 Config Paths
🔗 MokingbirdNode IPC
RAG: Connected
FT: Idle
📋 Activity Log
14:22:01 [OK] PDI run complete: 1087 accepted, 42 review
14:21:58 [INFO] Phase 3C: All 5 validators passed
14:18:33 [INFO] Phase 3A: GPRO RL K=4 complete — 1200 records
14:05:12 [INFO] Phase 2B: CLIP+Florence unloaded (freed 2.1 GB)
13:58:44 [INFO] Phase 1: 12 PDFs extracted — 847 chunks, 34 images
13:55:00 [OK] DDI complete: Generator SFT + Classifier SFT trained
13:42:15 [WARN] VRAM 87% — CLIP deferred to after Phase 1
13:40:00 [INFO] App started — config loaded, models available
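Log lines in this pane share one layout: a time, a bracketed level, and a message. A minimal parsing sketch, assuming every line follows that `HH:MM:SS [LEVEL] message` pattern; `LogEntry` and `parse_line` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional

# Time, bracketed upper-case level, then the rest of the line as the message.
LOG_LINE = re.compile(r"^(\d{2}:\d{2}:\d{2}) \[([A-Z]+)\] (.*)$")


class LogEntry(NamedTuple):
    time: str
    level: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Parse one activity-log line, or return None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return LogEntry(*match.groups()) if match else None


entry = parse_line("14:05:12 [INFO] Phase 2B: CLIP+Florence unloaded (freed 2.1 GB)")
unparsed = parse_line("not a log line")
```

Filtering the parsed entries by `level` would, for example, surface the single `WARN` entry about VRAM pressure.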