Submission: 12L Int5-MLP BigramHash10K EMA (1.1476 BPB) #592

Open
Skytuhua wants to merge 1 commit into openai:main from Skytuhua:submission/12L-int5mlp-bigramhash10k-ema-1.1476

Conversation


@Skytuhua Skytuhua commented Mar 24, 2026

Summary

  • 12-layer GPT with mixed Int5/Int6 quantization (MLP=Int5, Attn=Int6)
  • BigramHash(10240) expanded embeddings
  • EMA(0.997) + SWA + GPTQ-lite clip search + late QAT
  • SmearGate, OrthoInit, XSA in the last 4 layers, partial RoPE (16 dims)
  • Sliding window eval stride=64
  • Built on PR#414 SOTA stack
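
The EMA(0.997) weight averaging listed above can be sketched as follows. This is a minimal illustration, not the PR's actual training code; the function and parameter names are assumptions, and only the 0.997 decay comes from the submission.

```python
# Minimal sketch of EMA weight averaging with decay 0.997 (the decay
# value is from the PR; everything else here is illustrative).
def ema_update(ema_params, model_params, decay=0.997):
    """Blend the current model weights into the EMA copy, in place."""
    for name, w in model_params.items():
        ema_params[name] = decay * ema_params[name] + (1.0 - decay) * w
    return ema_params
```

At evaluation time the EMA copy (rather than the raw trained weights) is what gets quantized and scored, which is the usual reason to maintain it alongside SWA.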

Results

  Metric                          Value
  val_bpb (sliding window s64)    1.14760365
  val_loss                        1.93767556
  Artifact size                   15,497,769 bytes
  Training steps                  4973 (600s wallclock cap)
  Hardware                        8x H100 SXM
  Seed                            1337
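
The "sliding window s64" qualifier on val_bpb refers to stride-64 sliding-window evaluation, where each token is scored once with near-maximal left context. A sketch of the window bookkeeping, under assumed window/stride parameters (only stride=64 is stated in the PR):

```python
# Sketch of stride-based sliding-window evaluation spans. Each yielded
# triple is (begin, end, trg_len): the model sees tokens [begin, end)
# but only the final trg_len tokens contribute to the loss, so every
# token is scored exactly once. Window size here is illustrative.
def sliding_windows(n_tokens, window=1024, stride=64):
    prev_end = 0
    for begin in range(0, n_tokens, stride):
        end = min(begin + window, n_tokens)
        trg_len = end - prev_end  # tokens newly scored by this window
        yield begin, end, trg_len
        prev_end = end
        if end == n_tokens:
            break
```

Summing the per-window losses over their `trg_len` targets and dividing by total bytes gives the reported BPB; a smaller stride costs more forward passes but gives each token more context.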

Test plan

  • Full training run on 8xH100 SXM (600s wallclock)
  • Int6+zstd quantization roundtrip verified
  • Sliding window evaluation (stride=64)
  • Artifact under 16MB limit (15.5MB)
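
The quantization roundtrip check above can be illustrated with a symmetric signed n-bit scheme. This is a generic sketch, not the PR's GPTQ-lite implementation: per-tensor absmax scaling stands in for the clip search, and the zstd compression step is omitted.

```python
# Sketch of symmetric per-tensor quantization to signed n-bit values
# (int6 -> range [-31, 31], int5 -> [-15, 15]). Illustrative only; the
# PR uses a GPTQ-lite clip search rather than plain absmax scaling.
def quantize_int_n(weights, bits=6):
    qmax = 2 ** (bits - 1) - 1
    absmax = max(abs(w) for w in weights)
    scale = absmax / qmax if absmax > 0 else 1.0
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]
```

A roundtrip test then asserts the reconstruction error stays within half a quantization step, which is the bound absmax scaling guarantees.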

12-layer GPT with mixed Int5/Int6 quantization, BigramHash(10240),
EMA(0.997), GPTQ-lite clip search, late QAT, SmearGate, XSA in last 4 layers,
partial RoPE, sliding window eval stride=64. Built on PR#414 SOTA stack.

val_bpb: 1.14760365 (sliding window s64)
artifact: 15,497,769 bytes (under 16MB)
trained on 8xH100 SXM, seed 1337
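
The BigramHash(10240) expanded embeddings named in the summary can be sketched as hashing each (previous, current) token-id pair into one of 10,240 buckets whose learned embedding is added to the ordinary token embedding. Only the bucket count is from the PR; the mixing constant and function name below are illustrative.

```python
# Sketch of hashed bigram embedding lookup: map a (prev, cur) token-id
# pair into one of 10240 buckets. The multiplier is an arbitrary odd
# mixing constant, not taken from the PR.
def bigram_bucket(prev_tok, cur_tok, n_buckets=10240):
    return ((prev_tok * 1000003) ^ cur_tok) % n_buckets
```

The appeal is parameter efficiency: 10,240 bucket embeddings capture frequent bigram context for far less than a full vocab-squared bigram table, at the cost of hash collisions between rare pairs.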
