This repository contains the code for the paper "Unlocking Non-Invasive Brain-to-Text". The preprint is available on arXiv. If you find this code helpful in your work, please cite the paper:
```bibtex
@article{jayalath2024unlocking,
  title={{Unlocking Non-Invasive Brain-to-Text}},
  author={Jayalath, Dulhan and Landau, Gilad and Parker Jones, Oiwi},
  journal={arXiv preprint arXiv:2505.13446},
  year={2025}
}
```
- Install requirements with `pip install -r requirements.txt`.
- Download all or some of the LibriBrain, Armeni, Gwilliams, and Broderick (Natural Speech) datasets.
- Modify the paths in `data/dataset_configs.yaml`: point `root` to your dataset's BIDS root directory, and change `cache` to where you would like preprocessed data to be kept.
- Make sure you have a Weights & Biases account and are signed in on your console.
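The config step above can be sketched as follows. This is a hypothetical layout, assuming one entry per dataset with `root` and `cache` keys; check the `data/dataset_configs.yaml` shipped with the repository for the actual schema:

```yaml
# Hypothetical layout of data/dataset_configs.yaml (illustrative paths).
libribrain:
  root: /data/libribrain/bids      # BIDS root directory of the dataset
  cache: /data/cache/libribrain    # where preprocessed data will be written
armeni2022:
  root: /data/armeni/bids
  cache: /data/cache/armeni
```

After editing the paths, run `wandb login` once in your console so that training can log to your Weights & Biases account.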
Note: although we use sensor position data for LibriBrain in our experiments, this is not currently publicly released. To train with LibriBrain without this data, you must provide `--har_type gating`, otherwise training will silently fail. This uses dataset-conditional linear convolutions rather than a spatial attention module. Performance should be similar.
All results during training and evaluation will be logged to a project called `word-to-sent` in your Weights & Biases account.
Single dataset:

```shell
python train.py --vocab_size 250 --dset libribrain
```

Joint training:

```shell
python train.py --vocab_size 250 --dset gwilliams2022 --aux_dsets armeni2022 libribrain
```
```shell
python train.py --vocab_size 250 --dset libribrain --post_proc --no_llm_api
```
If you want to use in-context LLM API methods, please register on the Anthropic console and set the environment variable `ANTHROPIC_API_KEY` with your API key, ensuring that you have sufficient credits. We estimate that you will need around $15-$20 to evaluate a dataset in full (less if you use `--limit_eval_samples`).
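A minimal sketch of setting the key for the current shell session (the key value is a placeholder; substitute your own):

```shell
# Export your Anthropic API key so train.py can read it from the environment.
export ANTHROPIC_API_KEY="sk-ant-your-key-here"
# Sanity check that the variable is visible to child processes.
python -c "import os; print('key set' if os.environ.get('ANTHROPIC_API_KEY') else 'key missing')"  # prints "key set"
```

Adding the `export` line to your shell profile avoids re-entering it each session.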
```shell
python train.py --vocab_size 250 --dset libribrain --post_proc
```
```shell
python train.py --vocab_size 250 --dset libribrain --test_ckpt /path/to/trained/ckpt --random_noise_inputs
```
- Use `--limit_eval_samples <int>` to reduce the number of sentences you evaluate with to save API costs.
- Use a trained word predictor for evaluation by supplying `--test_ckpt /path/to/ckpt` (checkpoints are saved to `./checkpoints/*` automatically).
- Use `--name <run-name>` to change the name of the run logged to Weights & Biases.
- Use `--predict_oov` if you want to train and use an OOV position predictor during evaluation.
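Several of these options can be combined in one evaluation run. A hypothetical example (the checkpoint path and run name are illustrative):

```shell
# Illustrative evaluation run combining the flags above:
# a trained checkpoint, a capped number of eval sentences, and a custom run name.
python train.py --vocab_size 250 --dset libribrain \
  --test_ckpt ./checkpoints/run1/best.ckpt \
  --limit_eval_samples 100 \
  --name cheap-eval --post_proc
```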