
ADR-129: RuvLTRA Training Pipeline — All Phases Executing #314

Open

ruvnet wants to merge 4 commits into main from feat/adr-129-training-pipeline

Conversation


ruvnet commented on Mar 28, 2026

Summary

Updates ADR-129 status — all 4 phases of the RuvLTRA training pipeline are now executing on Google Cloud.

What's Been Done This Session

Phase 1: Calibration (Complete)

  • All 4 models calibrated on L4 GPU (24GB VRAM)
  • TurboQuant sidecar configs (default.turboquant.json) uploaded to all HuggingFace repos
  • Benchmark results uploaded: 75.4 tok/s (small), 62.6 tok/s (medium), 67.1 tok/s (claude-code)
  • HuggingFace model READMEs updated with benchmark tables
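
The tok/s figures above come down to a wall-clock measurement of decode throughput. A minimal sketch, not the actual benchmark harness — `generate` is a stand-in for the model's decode call:

```python
import time

def measure_tok_per_s(generate, prompt, n_runs=3):
    """Average decode throughput: generated tokens / wall-clock seconds."""
    rates = []
    for _ in range(n_runs):
        start = time.perf_counter()
        n_tokens = generate(prompt)  # assumed to return the generated-token count
        elapsed = time.perf_counter() - start
        rates.append(n_tokens / elapsed)
    return sum(rates) / len(rates)

# Stub generator standing in for a real model call.
def stub_generate(prompt):
    time.sleep(0.01)
    return 64
```

On real hardware this loop would also warm up the model first and exclude tokenization time from the measured window.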

Phase 2: SFT Training (Executing)

  • LoRA SFT running on L4 GPU (rank-16, 2 epochs, lr=2e-5)
  • Training corpus: 230 records, 530K tokens (98 brain + 131 ADR + 1 routing)
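
The hyperparameters above map onto a standard PEFT-style LoRA configuration. A sketch under stated assumptions — `lora_alpha` and `lora_dropout` are not given in this PR and are illustrative defaults, not the actual values:

```python
# Hypothetical config mirroring the run described above (rank-16, 2 epochs, lr=2e-5).
lora_sft_config = {
    "lora": {
        "r": 16,               # LoRA rank stated above
        "lora_alpha": 32,      # assumption: common 2*r default, not from this PR
        "lora_dropout": 0.05,  # assumption
    },
    "training": {
        "num_train_epochs": 2,
        "learning_rate": 2e-5,
    },
}

def matches_stated_settings(cfg):
    """Sanity-check the config against the settings stated in this PR."""
    return (cfg["lora"]["r"] == 16
            and cfg["training"]["num_train_epochs"] == 2
            and cfg["training"]["learning_rate"] == 2e-5)
```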

Phase 3: Benchmarks (Executing)

  • L4 GPU benchmark job running
  • Release gate automation (7 gates) tested and working
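
The shape of the 7-gate automation is a predicate table evaluated against a benchmark report. Gate names and thresholds below are illustrative, not ADR-129's actual list:

```python
# Hypothetical release gates; each maps a name to a pass/fail check on a report dict.
GATES = {
    "min_throughput_tok_s": lambda r: r["tok_s"] >= 60.0,
    "max_regression_pct":   lambda r: r["regression_pct"] <= 2.0,
    "eval_loss_improved":   lambda r: r["eval_loss"] < r["baseline_eval_loss"],
    "sidecar_config_valid": lambda r: r["turboquant_ok"],
    "model_card_updated":   lambda r: r["readme_ok"],
    "artifacts_uploaded":   lambda r: r["hf_upload_ok"],
    "smoke_test_passed":    lambda r: r["smoke_ok"],
}

def run_release_gates(report):
    """Return (all_passed, list of failing gate names)."""
    failures = [name for name, check in GATES.items() if not check(report)]
    return (not failures, failures)
```

A release proceeds only when the failure list is empty; otherwise the failing gate names feed the job's error output.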

Phase 4: Publishing (Complete)

  • TurboQuant configs uploaded to all 4 HF models
  • Benchmark results uploaded to all 4 HF models
  • Model card READMEs updated with benchmark tables
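
Publishing amounts to a per-model file fan-out. This sketch only builds the upload plan; the actual call (e.g. `huggingface_hub.HfApi().upload_file`) is left as a comment, and the org name plus local paths are assumptions, not taken from this PR:

```python
# Three models are named in this PR; the fourth is not, so it is omitted here.
MODELS = ["ruvltra-small", "ruvltra-medium", "ruvltra-claude-code"]
ARTIFACTS = ["default.turboquant.json", "benchmarks.json", "README.md"]

def build_upload_plan(org="ruvnet"):
    """List (local_path, repo_id, path_in_repo) tuples for every model/artifact."""
    plan = []
    for model in MODELS:
        for artifact in ARTIFACTS:
            plan.append((f"artifacts/{model}/{artifact}", f"{org}/{model}", artifact))
    return plan

# Each tuple would then be passed to something like:
#   HfApi().upload_file(path_or_fileobj=local, repo_id=repo, path_in_repo=dest)
```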

Infrastructure Deployed

| Resource | Status |
| --- | --- |
| Docker image `gcr.io/ruv-dev/ruvltra-training:latest` | Built (torch 2.5.1+cu124) |
| `ruvltra-calibration` Cloud Run Job | Deployed, 4 executions complete |
| `ruvltra-nightly-train` Cloud Run Job | Deployed, SFT executing |
| `ruvltra-benchmark` Cloud Run Job | Deployed, executing |
| Nightly scheduler (03:00 UTC) | Enabled |
| Weekly benchmark scheduler (Mon 06:00 UTC) | Enabled |

Previous Commits on Main

All implementation was done on main during this session:

  • Training scripts (10 files, ~2,700 lines)
  • turboquant_profile.rs (Rust sidecar loading)
  • Docker image (4 build iterations, final: prebuilt wheels, libgomp)
  • Training corpus (230 records exported and committed)
  • ADR-129 (governance, release gates, ablation matrix, rollback plan, serving plan, nightly loop)

Ref: #310

🤖 Generated with claude-flow

ruvnet added 4 commits March 28, 2026 14:53
…complete

- Phase 1 Calibration: Complete (all 4 models, benchmarks uploaded to HF)
- Phase 2 SFT: Executing on L4 GPU (rank-16, 2 epochs)
- Phase 3 Benchmarks: Executing (release gates + L4 benchmark job)
- Phase 4 Publishing: Complete (TQ configs + benchmarks + README updates on HF)

Benchmark results (L4 GPU):
- ruvltra-small: 75.4 tok/s
- ruvltra-medium: 62.6 tok/s
- ruvltra-claude-code: 67.1 tok/s

Co-Authored-By: claude-flow <ruv@ruv.net>

Add Continuous Training & Optimization section (ADR-129) to the
capabilities table: nightly training, 7-gate release checks,
TurboQuant profiling, training corpus.

Co-Authored-By: claude-flow <ruv@ruv.net>

The SFT job failed because merged_corpus.jsonl was not in the Docker
image. Copy it to scripts/training/data/training/ so it's included
in the COPY . /app/ step.

Co-Authored-By: claude-flow <ruv@ruv.net>

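
The fix described above comes down to placing the corpus inside the build context before the blanket copy. A minimal Dockerfile sketch — the base image and surrounding instructions are assumptions, not the actual `gcr.io/ruv-dev/ruvltra-training:latest` build:

```dockerfile
# Assumed base; the real image uses torch 2.5.1+cu124 with prebuilt wheels.
FROM pytorch/pytorch:2.5.1-cuda12.4-cudnn9-runtime

WORKDIR /app

# merged_corpus.jsonl must live under the build context, e.g.
#   cp merged_corpus.jsonl scripts/training/data/training/
# so that the blanket copy below includes it (this was the failure mode).
COPY . /app/
```
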
The training corpus uses a flat 'text' field (brain memories, ADRs)
rather than chat messages or Alpaca instruction format. Add handler
that converts raw text to completion-style messages for SFT.

Co-Authored-By: claude-flow <ruv@ruv.net>
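
The handler described above can be sketched as a small record adapter. Field names beyond `text` are assumptions about the chat schema the SFT trainer expects, not the actual implementation:

```python
def to_completion_messages(record):
    """Convert a corpus record into completion-style chat messages for SFT.

    Flat {'text': ...} records (brain memories, ADRs) become a single
    assistant completion; chat and Alpaca-style records pass through.
    """
    if "messages" in record:          # already chat-formatted
        return record["messages"]
    if "instruction" in record:       # Alpaca-style instruction record
        return [
            {"role": "user", "content": record["instruction"]},
            {"role": "assistant", "content": record.get("output", "")},
        ]
    # Flat text: train as a pure completion with no user turn.
    return [{"role": "assistant", "content": record["text"]}]
```
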