CLI isn't overriding settings in the baseline.toml file #10

@isaac-w-dev

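Training jobs still calculate means even though scale-mean and scale-norm are set to False on the CLI. The values in baseline.toml (scale_mean = true, scale_norm = true) appear to take precedence over the command-line flags.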

CLI Command:

uv run python -m saev train \
  --sweep configs/preprint/baseline.toml \
  --data.shard-root /fs/scratch/PAS2136/oh-scipe/activations/406094ac63e6de7a592d5ddbaa581e147d51a2596e39732dafe2171cfe15225b \
  --data.layer -1 \
  --data.patches patches \
  --data.scale-mean False \
  --data.scale-norm False \
  sae:relu \
  --sae.d-vit 768

baseline.toml file contents:

tag = "baseline-v4.8"

lr = [3e-4, 1e-3, 3e-3]

n_lr_warmup = 500
n_sparsity_warmup = 500

[sae]
normalize_w_dec = true
remove_parallel_grads = true
exp_factor = [16, 32]

[objective]
sparsity_coeff = [4e-4, 8e-4, 1.6e-3]

[data]
scale_mean = true
scale_norm = true

An example of this working correctly is the Python formatter Black, where options set in pyproject.toml can be overridden on the command line:
https://github.com/psf/black/blob/main/pyproject.toml
