Add experiments/mnist VPD family (memorization-vs-generalization study) by lee-goodfire · Pull Request #890 · goodfire-ai/param-decomp

lee-goodfire · 2026-06-24T08:20:20Z

Adds a reusable MNIST MLP experiment family for adVersarial Parameter Decomposition (VPD), mirroring experiments/resid_mlp, used for the memorization-vs-generalization decomposition study (Silico issue 6).

What's added

param_decomp_lab/experiments/mnist/
- models.py — MnistMLP (named fc_in/fc_h.*/fc_out Linears, GELU) + train config + MnistTargetRunInfo
- data.py — raw-tensor MNIST loader, deterministic label-corruption/subsample builder, full-batch memorized-set iterator (drops the partial final batch so VPD's per-datapoint persistent-PGD adversary stays batch-uniform)
- train_mnist.py — pd-mnist-pretrain CLI (label-noise sweep + size ladder)
- run.py — pd-mnist CLI, categorical KL reconstruction path (recon_loss_kl) + run_batch_first_element, SavedMnistRun
pd-mnist-pretrain / pd-mnist entry points in param_decomp_lab/pyproject.toml
Registered the existing generic UnmaskedReconLoss as a YAML eval metric (gives the kl_unmasked faithfulness check on categorical, non-LM targets; CEandKLLosses is LM-only)
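
To make the data.py bullet concrete, here is a minimal stdlib-only sketch of a deterministic label-corruption builder and a batch iterator that drops the partial final batch. It is not the code in this PR: `corrupt_labels`, `full_batches`, and their parameters are hypothetical names, and the choices to resample a corrupted label from the other classes and to cycle in a fixed order are assumptions made for illustration.

```python
import random
from typing import Iterator, List, Sequence, Tuple


def corrupt_labels(
    labels: Sequence[int],
    noise_frac: float,
    num_classes: int = 10,
    seed: int = 0,
) -> List[int]:
    """Return a copy of `labels` with a fixed fraction reassigned to a wrong class.

    Assumption: a corrupted label is always different from the original one.
    """
    rng = random.Random(seed)  # local RNG, so the same seed gives the same corruption
    out = list(labels)
    n_corrupt = int(round(noise_frac * len(out)))
    for i in rng.sample(range(len(out)), n_corrupt):
        wrong = [c for c in range(num_classes) if c != out[i]]
        out[i] = rng.choice(wrong)
    return out


def full_batches(n_items: int, batch_size: int) -> Iterator[Tuple[int, ...]]:
    """Yield index batches forever, skipping the partial final batch.

    Every yielded batch has exactly `batch_size` indices, so any per-batch
    state kept by a consumer always sees the same shape.
    """
    n_full = n_items // batch_size
    if n_full == 0:
        raise ValueError("batch_size exceeds dataset size")
    while True:
        for b in range(n_full):
            yield tuple(range(b * batch_size, (b + 1) * batch_size))
```

With 10 items and a batch size of 4, this iterator yields indices 0-3, then 4-7, then wraps back to 0-3; indices 8 and 9 are never visited, which is the cost of keeping every batch the same size.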

Result (Silico issue 6)

At matched decomposition faithfulness, a pure memorizer decomposes into ~130x more live components (and ~240x more per input) than a generalizer, but the components are distributed/redundant, not per-example (density, ablation, and specimen evidence).
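
"Matched decomposition faithfulness" refers to a KL-based check (kl_unmasked, listed under "What's added") between the target model's class distribution and the decomposed model's. The sketch below shows one way such a categorical KL can be computed from logits. It is an illustration only: `log_softmax`, `categorical_kl`, and `mean_kl` are hypothetical helpers rather than the PR's recon_loss_kl, and the direction KL(target || reconstruction) is an assumption.

```python
import math
from typing import List, Sequence


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax over one vector of class logits."""
    m = max(logits)  # subtract the max before exponentiating
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def categorical_kl(target_logits: Sequence[float], recon_logits: Sequence[float]) -> float:
    """KL(target || recon) between the two softmax distributions (assumed direction)."""
    log_p = log_softmax(target_logits)
    log_q = log_softmax(recon_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def mean_kl(
    target_batch: Sequence[Sequence[float]],
    recon_batch: Sequence[Sequence[float]],
) -> float:
    """Average the per-example KL over a batch of logit vectors."""
    kls = [categorical_kl(t, r) for t, r in zip(target_batch, recon_batch)]
    return sum(kls) / len(kls)
```

A value near zero means the reconstructed model reproduces the target's output distribution on that input. Comparing component counts between a memorizer and a generalizer is only meaningful once both decompositions reach a similar value on a check of this kind.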

🤖 Generated with Claude Code

…decomposition glue)

New reusable experiment family mirroring resid_mlp for the MNIST memorization-vs-generalization study:
- models.py: MnistMLP (fc_in/fc_h.*/fc_out, GELU) + train config + run info
- data.py: raw MNIST loader, deterministic label-corruption/subsample builder, infinite memorized-set batch iterator
- train_mnist.py: pd-mnist-pretrain CLI (label-noise sweep + size ladder)
- run.py: pd-mnist CLI, categorical KL recon path (recon_loss_kl) + run_batch_first_element
- register UnmaskedReconLoss as a YAML eval metric (generic kl_unmasked check; CEandKLLosses is LM-only)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

lee-goodfire and others added 2 commits June 24, 2026 02:03

mnist data: drop partial final batch so VPD persistent-PGD sources stay batch-uniform

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

4b821e0

lee-goodfire wants to merge 2 commits into feature/silico-integration from silico/mnist-vpd-memorization
