This repository contains the code for "Subjective Depth & Timescale Transformers," a research project implementing and evaluating dynamic transformer architectures. It includes two novel models, the Subjective Depth Transformer (SDT) and the Subjective Timescale Transformer (STT), which leverage Bayesian surprise signals to dynamically route computation, learning where and when to compute.
The framework is built on PyTorch, Hugging Face Transformers, and Accelerate, with configuration managed by Hydra.
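Hydra composes a run from YAML config files. As a rough illustration of the kind of settings such a config might hold, here is a minimal sketch; every key and value below is an assumption for illustration, not the repository's actual schema:

```yaml
# Illustrative sketch only -- field names and values are assumptions,
# not the actual contents of the repository's config files.
model:
  type: stt            # stt | sdt | mod
  n_layers: 6
training:
  batch_size: 8
  max_steps: 500
output_dir: outputs/laptop
```

Any of these values can be overridden from the command line using Hydra's standard `key=value` syntax.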
Clone the repository; the training scripts handle the rest of the environment setup automatically.

```bash
git clone https://github.com/your-username/dynamic-transformers.git
cd dynamic-transformers
```

The easiest way to run training is with the provided scripts. They automatically set up a Python virtual environment with uv, install dependencies, and launch the training run. Evaluation is performed automatically at the end of training, with results saved to eval_results.json in the model's output directory.
For local development and debugging, use train_mac.sh. This script runs a small-scale training job using the laptop.yaml configuration.

```bash
chmod +x train_mac.sh
./train_mac.sh
```

For full-scale training on a SLURM cluster, use train_gpu.sh. This script submits a job to the cluster and uses accelerate for distributed training.
```bash
chmod +x train_gpu.sh
sbatch train_gpu.sh
```

You can customize the run by editing the script or by setting environment variables. For example, to log the run to Weights & Biases, set the WANDB_RUN variable:

```bash
WANDB_RUN=my-awesome-experiment sbatch train_gpu.sh
```

The model type (e.g., stt, sdt, mod) and other parameters can be modified directly within the train_gpu.sh script.
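The environment-variable pattern above relies on standard shell defaulting. A minimal sketch of how a script like train_gpu.sh might consume such variables (MODEL_TYPE and EXTRA_ARGS are hypothetical names chosen for illustration; only WANDB_RUN appears in the actual script invocation above):

```bash
# Sketch of env-var defaulting as a launch script might do it.
# MODEL_TYPE and EXTRA_ARGS are hypothetical illustrative names.
MODEL_TYPE="${MODEL_TYPE:-stt}"   # falls back to stt when unset
EXTRA_ARGS=""
if [ -n "${WANDB_RUN:-}" ]; then
  # forward the run name to the trainer when W&B logging is requested
  EXTRA_ARGS="--wandb-run $WANDB_RUN"
fi
echo "launching model=$MODEL_TYPE $EXTRA_ARGS"
```

With this pattern, `MODEL_TYPE=sdt sbatch train_gpu.sh` would switch models without editing the script, while an unset variable silently keeps the default.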