
Unlocking Non-Invasive Brain-to-Text

This repository contains the code for the paper "Unlocking Non-Invasive Brain-to-Text". The preprint is available on arXiv at https://arxiv.org/abs/2505.13446. If you find this code helpful in your work, please cite the paper:

@article{jayalath2025unlocking,
  title={{Unlocking Non-Invasive Brain-to-Text}},
  author={Jayalath, Dulhan and Landau, Gilad and Parker Jones, Oiwi},
  journal={arXiv preprint arXiv:2505.13446},
  year={2025}
}

Quick start

  1. Install requirements with pip install -r requirements.txt.
  2. Download all or some of the LibriBrain, Armeni, Gwilliams, and Broderick (Natural Speech) datasets.
  3. Modify the paths in data/dataset_configs.yaml: point root to each dataset's BIDS root directory and set cache to where you would like preprocessed data to be kept (see the sketch after this list).
  4. Make sure you have a Weights & Biases account and are logged in from your console (e.g. via wandb login).
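For reference, a dataset entry in data/dataset_configs.yaml might look like the sketch below. This is illustrative only; consult the file itself for the real schema. root and cache are the two fields the quick start asks you to edit.

```yaml
# Illustrative sketch -- check data/dataset_configs.yaml for the actual schema.
libribrain:
  root: /data/libribrain/bids       # BIDS root of the downloaded dataset
  cache: /data/cache/libribrain     # where preprocessed data will be written

armeni2022:
  root: /data/armeni2022/bids
  cache: /data/cache/armeni2022
```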

Note: although we use sensor position data for LibriBrain in our experiments, it is not currently publicly released. To train on LibriBrain without this data, you must pass --har_type gating; otherwise, training will silently fail. This option uses dataset-conditional linear convolutions rather than a spatial attention module, and performance should be similar.
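To give a rough picture of the gating idea, here is a minimal, hypothetical PyTorch sketch of a dataset-conditional linear convolution. It is not the repository's implementation; it only illustrates the general mechanism: each dataset gets its own 1x1 convolution over sensor channels, selected by a dataset index, so no sensor position information is needed.

```python
# Hypothetical sketch of a dataset-conditional linear convolution
# (the idea behind --har_type gating); NOT the repository's code.
import torch
import torch.nn as nn

class DatasetConditionalConv(nn.Module):
    """One 1x1 convolution over sensor channels per dataset."""

    def __init__(self, n_datasets: int, in_channels: int, out_channels: int):
        super().__init__()
        # kernel_size=1 makes each conv a learned linear remap of channels.
        self.convs = nn.ModuleList(
            nn.Conv1d(in_channels, out_channels, kernel_size=1)
            for _ in range(n_datasets)
        )

    def forward(self, x: torch.Tensor, dataset_idx: int) -> torch.Tensor:
        # x: (batch, sensors, time) -> (batch, out_channels, time)
        return self.convs[dataset_idx](x)

# Route each batch through the convolution belonging to its source dataset.
layer = DatasetConditionalConv(n_datasets=3, in_channels=306, out_channels=128)
meg = torch.randn(8, 306, 250)   # fake MEG batch: 8 samples, 306 sensors, 250 steps
out = layer(meg, dataset_idx=0)  # shape: (8, 128, 250)
```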

Training a model

All results during training and evaluation will be logged to a project called word-to-sent in your Weights & Biases account.

Train a word predictor

Single dataset: python train.py --vocab_size 250 --dset libribrain

Joint training: python train.py --vocab_size 250 --dset gwilliams2022 --aux_dsets armeni2022 libribrain

Train a word predictor and evaluate with rescoring

python train.py --vocab_size 250 --dset libribrain --post_proc --no_llm_api
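Rescoring re-ranks the word predictor's candidate sentences with a language model. The sketch below shows the general idea only, using a small Hugging Face model; the repository's actual rescoring method, models, and candidate generation may differ.

```python
# Minimal n-best rescoring sketch (illustrative, not the repo's exact method):
# keep the candidate sentence that a causal LM finds most probable.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def lm_log_prob(sentence: str) -> float:
    """Summed log-probability of a sentence under the LM."""
    ids = tok(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        # With labels=input_ids, HF returns the mean token NLL as `loss`.
        loss = lm(ids, labels=ids).loss
    return -loss.item() * (ids.shape[1] - 1)  # undo the mean over predicted tokens

candidates = [  # hypothetical n-best list from a word predictor
    "the cat sat on the mat",
    "the cat sat on the hat",
    "cat the sat mat on the",
]
print(max(candidates, key=lm_log_prob))
```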

Train a word predictor and evaluate with both rescoring and in-context methods

If you want to use in-context LLM API methods, please register on the Anthropic console, set the ANTHROPIC_API_KEY environment variable to your API key, and make sure you have sufficient credits. We estimate that evaluating a dataset in full costs around $15-$20 (less if you use --limit_eval_samples).

python train.py --vocab_size 250 --dset libribrain --post_proc
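As a rough sketch of what an in-context correction call can look like (the evaluation code's actual prompts, models, and post-processing may differ), here is a minimal example with the Anthropic Python SDK:

```python
# Illustrative in-context correction call via the Anthropic SDK;
# the repository's actual prompting strategy may differ.
# Requires ANTHROPIC_API_KEY to be set in your environment.
import anthropic

client = anthropic.Anthropic()  # picks up ANTHROPIC_API_KEY automatically

predicted = "the cat sat no the mat"  # hypothetical word-predictor output
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",  # assumed model choice
    max_tokens=64,
    messages=[{
        "role": "user",
        "content": "Correct this noisy transcript and output only the "
                   f"corrected sentence: {predicted}",
    }],
)
print(message.content[0].text)
```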

Evaluate with random noise inputs to a trained model

python train.py --vocab_size 250 --dset libribrain --test_ckpt /path/to/trained/ckpt --random_noise_inputs
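This is a sanity-check baseline: feeding noise in place of real MEG should collapse decoding to chance. Conceptually (a sketch, not what the flag actually executes):

```python
# Conceptual sketch of the --random_noise_inputs baseline
# (not the actual implementation behind the flag).
import torch

def noise_like(batch: torch.Tensor) -> torch.Tensor:
    """Replace a real MEG batch with Gaussian noise of the same shape."""
    return torch.randn_like(batch)

# If a trained model still decodes text from noise, the evaluation is leaking.
```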

Usage tips

  • Use --limit_eval_samples <int> to reduce the number of sentences you evaluate with to save API costs.
  • Use a trained word predictor for evaluation by supplying --test_ckpt /path/to/ckpt (checkpoints are saved to ./checkpoints/* automatically).
  • Use --name <run-name> to change the name of the run logged to Weights & Biases.
  • Use --predict_oov if you want to train and use an OOV position predictor during evaluation.
