This repository accompanies Elementary, My Dear Watson: Non-Invasive Neural Keyword Spotting in the LibriBrain Dataset by Gereon Elvers, Gilad Landau, and Oiwi Parker Jones. It contains the tutorial, experiment notebooks, and supporting assets used to demonstrate non-invasive neural keyword spotting on the LibriBrain MEG corpus.
- `tutorial/`: Colab-ready walkthrough that trains a keyword spotter on a 10% LibriBrain subset within one hour on a T4 GPU.
- `experiments/`: Reproducibility notebooks covering scaling, buffer length, keyword vocabulary, and result cleanup (see `experiments/README.md`).
- Python: Python 3.10 or later is recommended. Use a dedicated virtual environment (e.g., `python -m venv .venv && source .venv/bin/activate`).
- PNPL library: Keyword spotting support now ships with the main PNPL package, so no fork is required. Install the latest release directly from the existing repository:

  `pip install "git+https://github.com/neural-processing-lab/pnpl.git"`

- LibriBrain dataset: Downloaded automatically by the PNPL package. Alternatively, it can be downloaded manually.
- Experiment logging (optional): Some experiment notebooks log to Neptune. Export `NEPTUNE_API_TOKEN` and `NEPTUNE_PROJECT` before running if you wish to capture metrics (see `experiments/README.md`).
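
The prerequisites above can be checked before launching a notebook. The following is a minimal sketch and not part of the repository: the helper names `python_ok` and `neptune_configured` are hypothetical, and the code relies only on the recommended Python version and the two environment variable names listed above.

```python
import os
import sys

# Variable names taken from the setup notes; everything else here is illustrative.
REQUIRED_NEPTUNE_VARS = ("NEPTUNE_API_TOKEN", "NEPTUNE_PROJECT")


def python_ok(version_info=sys.version_info, minimum=(3, 10)):
    """Return True if the interpreter meets the recommended minimum version."""
    return tuple(version_info[:2]) >= minimum


def neptune_configured(env=None):
    """Return True only if both Neptune variables are set and non-empty."""
    env = os.environ if env is None else env
    return all(env.get(name) for name in REQUIRED_NEPTUNE_VARS)


if __name__ == "__main__":
    print("Python >= 3.10:", python_ok())
    print("Neptune logging configured:", neptune_configured())
```

Since logging is optional, a missing variable is reported rather than treated as an error.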
The PNPL LibriBrainWord dataset object offers full-signal, single-keyword, and multi-keyword configurations. After installing PNPL, the following examples cover common use cases:
# Full signal-to-word configuration with an explicit tmin/tmax window
from pnpl.datasets import LibriBrainWord
dataset = LibriBrainWord(
data_path="./data/",
partition="train",
tmin=0.0,
tmax=0.8,
)

# Single-keyword detection
from pnpl.datasets import LibriBrainWord
dataset = LibriBrainWord(
data_path="./data/",
partition="train",
keyword_detection="watson",
)

# Multi-keyword detection
from pnpl.datasets import LibriBrainWord
dataset = LibriBrainWord(
data_path="./data/",
partition="train",
keyword_detection=["sherlock", "holmes"],
)

- For keyword detection, the window length adapts to the longest keyword; extend it with `positive_buffer`/`negative_buffer` or override it via `tmin` and `tmax`.
- Full signal-to-word mode keeps sensible window defaults, applying `tmin` and `tmax` overrides only when they are explicitly provided.
- Keyword-aware splits validate that requested keywords exist and fall back to the sessions with the highest prevalence for validation/test partitions.
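
To illustrate the keyword-aware split behaviour described in the last note, here is a self-contained sketch. It is not PNPL's implementation: the function `pick_eval_session`, the session names, the data layout, and the definition of prevalence as the fraction of words in a session that match a requested keyword are all assumptions made for illustration.

```python
def pick_eval_session(session_words, keywords):
    """Pick the session in which the requested keywords are most prevalent.

    session_words: dict mapping a session name to its list of words
                   (hypothetical layout, not PNPL's internal format).
    keywords: a single keyword or a list of keywords.
    """
    if isinstance(keywords, str):
        keywords = [keywords]
    targets = {k.lower() for k in keywords}

    # Validate that every requested keyword occurs somewhere in the data.
    vocabulary = {w.lower() for words in session_words.values() for w in words}
    missing = sorted(targets - vocabulary)
    if missing:
        raise ValueError(f"keywords not found in any session: {missing}")

    def prevalence(words):
        # Assumed definition: share of words that match a requested keyword.
        if not words:
            return 0.0
        hits = sum(1 for w in words if w.lower() in targets)
        return hits / len(words)

    return max(session_words, key=lambda name: prevalence(session_words[name]))


# Toy data, invented for this example.
sessions = {
    "session_a": ["the", "game", "is", "afoot", "watson"],
    "session_b": ["watson", "come", "here", "watson"],
    "session_c": ["sherlock", "holmes", "smiled"],
}
print(pick_eval_session(sessions, "watson"))                # session_b
print(pick_eval_session(sessions, ["sherlock", "holmes"]))  # session_c
```

Requesting a keyword that never occurs, such as `"moriarty"` in this toy data, raises a `ValueError` instead of silently producing a partition without positives.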
If this work helps your research, please cite:
@inproceedings{elvers2025elementary,
title = {Elementary, My Dear Watson: Non-Invasive Neural Keyword Spotting in the LibriBrain Dataset},
author = {Elvers, Gereon and Landau, Gilad and Parker Jones, Oiwi},
booktitle = {Data on the Brain \& Mind Workshop at NeurIPS 2025},
year = {2025},
url = {https://data-brain-mind.github.io/},
}

For questions or collaboration opportunities, please open an issue or contact the authors through the Neural Processing Lab.